dr-joe-wirth / phantasm

PHANTASM: PHylogenomic ANalyses for the TAxonomy and Systematics of Microbes
MIT License

human_map.txt file and outgroup #15

Closed bharat1912 closed 1 year ago

bharat1912 commented 1 year ago

Hi Joe, I ran analyzeGenomes on 43 gbff files downloaded from NCBI, using the command given below. I checked the blast directory: the files 1-VS-_35.out and 1-VS-_37.out are present, but 1-VS-_36.out is missing, as reported in the last line of the error report. On checking further, all of the XX-VS-_36.out files are missing.

Of the genomes (numbered 1 to 43), I had nominated number 36 as the outgroup in the human_map.txt file. Since number 36 has not been compared with any of the other 42 gbff sequences, I assume that the outgroup cannot be duplicated and must be a unique entry in the human_map.txt file.

Thanks

Bharat

$sudo docker run -v $(pwd):/data jwirth/phantasm:latest phantasm analyzeGenomes -i myworkdir/ -m human_map.txt -N 4 -O ana_genomes -e XXXX@XXX.XXX

If you use this software in your research, please cite our paper: Automating microbial taxonomy workflows with PHANTASM: PHylogenomic ANalyses for the TAxonomy and Systematics of Microbes Joseph S. Wirth and Eliot C. Bush, 2023 https://doi.org/10.1093/nar/gkad196

Parsing genbank files ... Done.
Running all pairwise blastp comparisons ... Done.
Calculating core genes ...
Traceback (most recent call last):
  File "/phantasm/phantasm", line 162, in <module>
    analyzeSpecifiedGenomes(genomesL, paramO)
  File "/phantasm/PHANTASM/main.py", line 57, in analyzeSpecifiedGenomes
    coreGenesWrapper_1(paramO)
  File "/phantasm/PHANTASM/main.py", line 126, in coreGenesWrapper_1
    calculateCoreGenes(paramO_1)
  File "/phantasm/PHANTASM/coreGenes.py", line 357, in calculateCoreGenes
    xenoGI.scores.createAabrhL(strainNamesL,
  File "/xenoGI-3.1.1/xenoGI/scores.py", line 214, in createAabrhL
    rHitsL=getAllReciprocalHits(strainNamesL,blastFileJoinStr,blastDir,blastExt,evalueThresh,alignCoverThresh,percIdentThresh)
  File "/xenoGI-3.1.1/xenoGI/scores.py", line 255, in getAllReciprocalHits
    blastD,strainPairL = blast.createBlastD(strainNamesL,blastFileJoinStr,blastDir,blastExt,evalueThresh,alignCoverThresh,percIdentThresh)
  File "/xenoGI-3.1.1/xenoGI/blast.py", line 166, in createBlastD
    blastD[strainPair] = parseBlastFile(blastPath,evalueThresh,alignCoverThresh,percIdentThresh)
  File "/xenoGI-3.1.1/xenoGI/blast.py", line 194, in parseBlastFile
    with open(blastFN,'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/data/anagenomes/blast/1-VS-_36.out'
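The traceback stops at the first missing file, so it does not show how many pairwise outputs are absent. The following is a minimal sketch for listing every missing pair in the blast directory. It assumes the outputs are named query number, join string, subject number, then ".out", with genomes numbered 1 to N as described in this thread. The function names and parameters are hypothetical helpers, not part of PHANTASM or xenoGI. The join string must be copied from a file name that actually exists on disk, and whether self-comparisons (such as 1 against 1) are written is an assumption exposed as a flag.

```python
from collections import Counter
from itertools import product
from pathlib import Path


def missing_blast_files(blast_dir, n_genomes, join_str, ext=".out", include_self=True):
    """Return (query, subject) pairs whose expected output file is absent.

    Assumes files are named "<query><join_str><subject><ext>" and that
    genomes are numbered 1..n_genomes (an assumption based on this thread).
    """
    blast_dir = Path(blast_dir)
    missing = []
    for query, subject in product(range(1, n_genomes + 1), repeat=2):
        if query == subject and not include_self:
            continue
        if not (blast_dir / f"{query}{join_str}{subject}{ext}").is_file():
            missing.append((query, subject))
    return missing


def count_by_genome(missing):
    """Count how many missing pairs involve each genome number."""
    counts = Counter()
    for query, subject in missing:
        counts[query] += 1
        if subject != query:
            counts[subject] += 1
    return counts
```

Calling missing_blast_files on the blast directory with n_genomes=43, and then passing the result to count_by_genome, would show whether the missing files all involve a single genome (here, number 36) or are spread across several.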

dr-joe-wirth commented 1 year ago

Could you please zip the directory containing the genomes, the map file, and the results, and then share it with me? Please also include the phantasm.log file.

This should not be happening.

bharat1912 commented 1 year ago

The zip files are too large (95 MB + 476 MB), and tar.gz files are not accepted. Can I send the files to an email address using https://wetransfer.com/ instead? Thanks, Bharat

bharat1912 commented 1 year ago

Hi Joe, I deleted the docker file, reinstalled phantasm (native installation), and then reran the same data files (which I hope you have been able to download by now). The native installation produced the same error (below):

Kind regards Bharat

$python /home/bharat/opt/phantasm/phantasm.py analyzeGenomes -i myworkdir -m human_map.txt -e bharatgu19@gmail.con

If you use this software in your research, please cite our paper: Automating microbial taxonomy workflows with PHANTASM: PHylogenomic ANalyses for the TAxonomy and Systematics of Microbes Joseph S. Wirth and Eliot C. Bush, 2023 https://doi.org/10.1093/nar/gkad196

Parsing genbank files ... Done.
Running all pairwise blastp comparisons ... Done.
Calculating core genes ...
Traceback (most recent call last):
  File "/home/bharat/opt/phantasm/phantasm.py", line 161, in <module>
    analyzeSpecifiedGenomes(genomesL, paramO)
  File "/home/bharat/opt/phantasm/PHANTASM/main.py", line 57, in analyzeSpecifiedGenomes
    coreGenesWrapper_1(paramO)
  File "/home/bharat/opt/phantasm/PHANTASM/main.py", line 126, in coreGenesWrapper_1
    calculateCoreGenes(paramO_1)
  File "/home/bharat/opt/phantasm/PHANTASM/coreGenes.py", line 357, in calculateCoreGenes
    xenoGI.scores.createAabrhL(strainNamesL,
  File "/home/bharat/.local/lib/python3.9/site-packages/xenoGI/scores.py", line 214, in createAabrhL
    rHitsL=getAllReciprocalHits(strainNamesL,blastFileJoinStr,blastDir,blastExt,evalueThresh,alignCoverThresh,percIdentThresh)
  File "/home/bharat/.local/lib/python3.9/site-packages/xenoGI/scores.py", line 255, in getAllReciprocalHits
    blastD,strainPairL = blast.createBlastD(strainNamesL,blastFileJoinStr,blastDir,blastExt,evalueThresh,alignCoverThresh,percIdentThresh)
  File "/home/bharat/.local/lib/python3.9/site-packages/xenoGI/blast.py", line 166, in createBlastD
    blastD[strainPair] = parseBlastFile(blastPath,evalueThresh,alignCoverThresh,percIdentThresh)
  File "/home/bharat/.local/lib/python3.9/site-packages/xenoGI/blast.py", line 194, in parseBlastFile
    with open(blastFN,'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/bharat/Anoxybacillusphantasm/finalAnalysis/blast/1-VS-_36.out'

bharat1912 commented 1 year ago

Hi Joe, apologies, I forgot to add that I still run bionic (Ubuntu 18.04), because I have an older NVidia card that is incompatible with my displays if I upgrade beyond Ubuntu 18.04.

I have two versions of Python 3 installed on Ubuntu 18.04: python3 -V reports Python 3.6.9, and python -V reports Python 3.9.11.

I used python (Python 3.9.11) to run the native installation of phantasm.
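With two interpreters installed, it can help to confirm which one is actually running and whether it can see the xenoGI package that appears in the traceback. A minimal sketch follows; describe_environment is a hypothetical helper, not part of PHANTASM, and it only reports whether each module can be located, not whether its version is suitable.

```python
import importlib.util
import sys


def describe_environment(modules=("xenoGI",)):
    """Report the running interpreter and whether each module can be found."""
    info = {
        "executable": sys.executable,          # path of the interpreter in use
        "version": sys.version.split()[0],     # e.g. "3.9.11"
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        info[name] = importlib.util.find_spec(name) is not None
    return info
```

Running this once with python and once with python3 would show whether both interpreters resolve the same packages.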

dr-joe-wirth commented 1 year ago

Hello,

This definitely wasn't an issue with the docker image, so I'm not surprised you got the same error running the native installation. As I mentioned before, I can't easily diagnose the problem from here without at least seeing the input data. Since you can't provide me with the results, can you instead share the 43 accession numbers so that I can download the genomes and attempt to recreate the error on my end?

bharat1912 commented 1 year ago

Hi Joe, the list of accession numbers (accession_list.txt) and the human_map.txt file are attached.

Regards Bharat

bharat1912 commented 1 year ago

One.gff in human_map.txt corresponds to GCF_000833605_1.gbff in accession_list.txt.
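A name in the map file that differs from the file on disk is the kind of mismatch a quick cross-check can surface, along with the duplicated entries raised earlier in the thread. The sketch below makes one loudly stated assumption: that the map file is tab-delimited with the genome file name in the first column. check_human_map is a hypothetical helper and not part of PHANTASM.

```python
from collections import Counter
from pathlib import Path


def check_human_map(map_path, genome_dir):
    """Compare file names listed in the map file with the files in genome_dir.

    Assumes a tab-delimited map file whose first column is the genome
    file name (an assumption, not confirmed by this thread).
    """
    listed = []
    with open(map_path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            listed.append(line.rstrip("\n").split("\t")[0].strip())

    on_disk = {p.name for p in Path(genome_dir).iterdir() if p.is_file()}
    counts = Counter(listed)
    return {
        "duplicated": sorted(name for name, n in counts.items() if n > 1),
        "not_on_disk": sorted(set(listed) - on_disk),
        "not_in_map": sorted(on_disk - set(listed)),
    }
```

If the map file is stored inside the genome directory itself, it will appear under "not_in_map" and can be ignored.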

bharat1912 commented 1 year ago

Hi Joe, I have resolved the problem and am closing the issue.

Thanks for your assistance. Very much appreciated.

Bharat

dr-joe-wirth commented 1 year ago

I'm glad you figured it out. Would you be willing to share what you did to solve the problem?