labgem / PPanGGOLiN

Build a partitioned pangenome graph from microbial genomes
https://ppanggolin.readthedocs.io

error while running ppanggolin workflow --fasta testingDataset/organisms.fasta.list command. #49

Closed dineshkumarsrk closed 3 years ago

dineshkumarsrk commented 3 years ago

I have installed ppanggolin on my Ubuntu 18.04.5 LTS 64-bit computer. I followed all the instructions and the installation succeeded. Checking the version of ppanggolin gives the following:

(base) dinesh@dinesh7k:~$ ppanggolin -v
ppanggolin 1.1.96

I ran the following command to check the test data provided in the 1.1.96 source code:

ppanggolin workflow --fasta testingDataset/organisms.fasta.list

The above command displays the following error:

(base) dinesh@dinesh7k:~/Documents/tools/PPanGGOLiN-1.1.96$ ppanggolin workflow --fasta testingDataset/organisms.fasta.list
Traceback (most recent call last):
  File "/home/dinesh/anaconda3/bin/ppanggolin", line 10, in <module>
    sys.exit(main())
  File "/home/dinesh/anaconda3/lib/python3.7/site-packages/ppanggolin/main.py", line 157, in main
    checkInputFiles(fasta = args.fasta)
  File "/home/dinesh/anaconda3/lib/python3.7/site-packages/ppanggolin/main.py", line 70, in checkInputFiles
    checkTsvSanity(fasta)
  File "/home/dinesh/anaconda3/lib/python3.7/site-packages/ppanggolin/main.py", line 50, in checkTsvSanity
    raise Exception(f"Some of the given files do not exist. The non-existing files are the following : '{' '.join(nonExistingFiles)}'")
Exception: Some of the given files do not exist. The non-existing files are the following : 'FASTA/GCF_001317785.1_7396_3_13_genomic.fna.gz FASTA/GCF_001729845.1_ASM172984v1_genomic.fna.gz FASTA/GCF_003788895.1_sc110_genomic.fna.gz FASTA/GCF_000318785.1_ASM31878v1_genomic.fna.gz FASTA/GCF_002777155.1_ASM277715v1_genomic.fna.gz FASTA/GCF_001183765.1_ASM118376v1_genomic.fna.gz FASTA/GCF_000026905.1_ASM2690v1_genomic.fna.gz FASTA/GCF_000220105.1_ASM22010v1_genomic.fna.gz FASTA/GCF_001398215.1_7501_6_50_genomic.fna.gz FASTA/GCF_000318825.1_ASM31882v1_genomic.fna.gz FASTA/GCF_002777095.1_ASM277709v1_genomic.fna.gz FASTA/GCF_000318545.1_ASM31854v1_genomic.fna.gz FASTA/GCF_000318805.1_ASM31880v1_genomic.fna.gz FASTA/GCF_000092685.1_ASM9268v1_genomic.fna.gz FASTA/GCF_001183825.1_ASM118382v1_genomic.fna.gz FASTA/GCF_002776955.1_ASM277695v1_genomic.fna.gz FASTA/GCF_001213045.1_5082_8_5_genomic.fna.gz FASTA/GCF_000304515.1_Cm_FSW4_genomic.fna.gz FASTA/GCF_000092665.1_ASM9266v1_genomic.fna.gz FASTA/GCF_001398135.1_7501_6_52_genomic.fna.gz FASTA/GCF_000472205.1_E_CS88_f__genomic.fna.gz FASTA/GCF_000318945.1_ASM31894v1_genomic.fna.gz FASTA/GCF_000318865.1_ASM31886v1_genomic.fna.gz FASTA/GCF_002776935.1_ASM277693v1_genomic.fna.gz FASTA/GCF_001293965.1_ASM129396v1_genomic.fna.gz FASTA/GCF_001398295.1_7396_3_21_genomic.fna.gz FASTA/GCF_006508235.1_ASM650823v1_genomic.fna.gz FASTA/GCF_000093005.1_ASM9300v1_genomic.fna.gz FASTA/GCF_000318925.1_ASM31892v1_genomic.fna.gz FASTA/GCF_000590575.1_ASM59057v1_genomic.fna.gz FASTA/GCF_001729905.1_ASM172990v1_genomic.fna.gz FASTA/GCF_000226605.1_ASM22660v1_genomic.fna.gz FASTA/GCF_000590695.1_ASM59069v1_genomic.fna.gz FASTA/GCF_001183845.1_ASM118384v1_genomic.fna.gz FASTA/GCF_001183805.1_ASM118380v1_genomic.fna.gz FASTA/GCF_002192615.1_ASM219261v1_genomic.fna.gz FASTA/GCF_002777115.1_ASM277711v1_genomic.fna.gz FASTA/GCF_002776885.1_ASM277688v1_genomic.fna.gz FASTA/GCF_003788785.1_ct114V1_genomic.fna.gz FASTA/GCF_000319105.1_ASM31910v1_genomic.fna.gz FASTA/GCF_000210495.1_ASM21049v1_genomic.fna.gz FASTA/GCF_000441775.1_ASM44177v1_genomic.fna.gz FASTA/GCF_000173495.1_ASM17349v1_genomic.fna.gz FASTA/GCF_006508185.1_ASM650818v1_genomic.fna.gz FASTA/GCF_002777015.1_ASM277701v1_genomic.fna.gz FASTA/GCF_002088315.1_ASM208831v1_genomic.fna.gz FASTA/GCF_002776845.1_ASM277684v1_genomic.fna.gz FASTA/GCF_000441655.1_ASM44165v1_genomic.fna.gz FASTA/GCF_000590635.1_ASM59063v1_genomic.fna.gz FASTA/GCF_006508265.1_ASM650826v1_genomic.fna.gz'

Please help me to fix this issue.

axbazin commented 3 years ago

Hi,

The tool looks for the files listed in 'organisms.fasta.list' relative to the working directory, so if there is no 'FASTA/' directory containing them where you execute PPanGGOLiN, it will not find them. Judging by where you are executing PPanGGOLiN, they are in testingDataset/FASTA/.

If you change into 'testingDataset/' and then run 'ppanggolin workflow --fasta organisms.fasta.list', it should work nicely.
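To see in advance which entries would fail from a given directory, the check can be approximated outside the tool. The following is a minimal sketch, not PPanGGOLiN's own code: the function name `missing_files` is hypothetical, and it assumes the list file is tab-separated with the file path in the second column.

```python
from pathlib import Path


def missing_files(list_path, base_dir):
    """Return the listed paths that do not exist when resolved against base_dir.

    Assumption: the list file is tab-separated and the second column holds
    the path to each genome file (this layout is not confirmed in the thread).
    """
    missing = []
    for line in Path(list_path).read_text().splitlines():
        fields = line.split("\t")
        if len(fields) < 2 or not fields[1].strip():
            continue  # skip blank or malformed lines
        rel = fields[1].strip()
        # An absolute path is left unchanged by the "/" join below.
        if not (Path(base_dir) / rel).exists():
            missing.append(rel)
    return missing
```

Calling `missing_files("testingDataset/organisms.fasta.list", ".")` from the source root would be expected to list every 'FASTA/...' entry, while passing "testingDataset" as `base_dir` should return an empty list if the test data is intact.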

For your own data, using absolute paths in the 'organisms.fasta.list' file is probably better; this is what I usually do with mine.
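Converting an existing list to absolute paths can be scripted. This is a rough sketch under two assumptions that the thread does not confirm: the list is tab-separated with the path in the second column, and relative paths are meant to be read relative to the directory holding the list file. The function name `absolutize_list` is hypothetical.

```python
from pathlib import Path


def absolutize_list(list_path, out_path):
    """Write a copy of a genome list file with absolute paths in column 2.

    Assumptions: tab-separated columns, path in the second column, and
    relative paths are relative to the directory containing the list file.
    """
    list_path = Path(list_path)
    base = list_path.resolve().parent
    out_lines = []
    for line in list_path.read_text().splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1].strip():
            # Already-absolute paths pass through unchanged.
            fields[1] = str((base / fields[1].strip()).resolve())
        out_lines.append("\t".join(fields))
    Path(out_path).write_text("\n".join(out_lines) + "\n")
```

The resulting file could then be passed to `--fasta` from any working directory. Any columns after the second are copied through untouched.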

Adelme

dineshkumarsrk commented 3 years ago

Thank you @axbazin for your prompt response; it works well now. I have another question: is it possible to do viral pangenome analysis with ppanggolin?

axbazin commented 3 years ago

Great!

Originally it was designed for Archaea and Bacteria only, but another group managed to reconstruct pangenomes of phages / plasmids with PPanGGOLiN (see https://www.biorxiv.org/content/10.1101/2020.11.09.375378v1 ), so it is possible.

Beware, however, that the annotation tool used by ppanggolin is Prodigal, which may not be suited to viruses, so you might want to supply your own annotations (use the --anno option rather than the --fasta option for the input files, with gff or gbff files instead of fasta files). The parameters were also chosen for bacterial species pangenomes, so you might want to adapt them as well; however, I lack the knowledge in viral genomics to say what would or would not work well there.