Closed luisruis closed 8 months ago
Could you please share the log file with me? My guess is that your genomes are very distantly related and the software was unable to calculate a core genome. To verify this, there should be a file called makeSpeciesTreeWorkDir/aabrhHardCore_concatenated.afa. If this file is empty, then this is your problem.
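A minimal sketch of that check in Python, assuming the script is pointed at wherever your run wrote makeSpeciesTreeWorkDir (the relative path below is only a placeholder, and `find_empty_outputs` is a hypothetical helper, not part of PHANTASM):

```python
from pathlib import Path


def find_empty_outputs(paths):
    """Return the paths that are missing or contain zero bytes."""
    problems = []
    for p in map(Path, paths):
        # A file that was never written is as uninformative as an empty one.
        if not p.is_file() or p.stat().st_size == 0:
            problems.append(str(p))
    return problems


if __name__ == "__main__":
    # Assumption: this path is relative to the directory PHANTASM wrote into.
    candidates = ["makeSpeciesTreeWorkDir/aabrhHardCore_concatenated.afa"]
    for path in find_empty_outputs(candidates):
        print(f"empty or missing: {path}")
```

If the alignment file is listed, no core genes reached the alignment step.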
The phantasm.log file has this content:
INFO:main:/phantasm/phantasm analyzeGenomes -i Refine_Genome_Caulobacter -m human_map.txt -e lxxxxxxxxx@cxxxxxxxx.mx
INFO:main:v1.1.0
INFO:main:num cpus: 1
INFO:main:reduce num core: False
INFO:main:bootstrap tree: False
INFO:main:num bootstraps: 0
INFO:main:start analyzeGenomes
INFO:PHANTASM.coreGenes.parseGenbank:Parsing genbank files ...
INFO:PHANTASM.coreGenes.parseGenbank:Done.
INFO:PHANTASM.coreGenes.allVsAllBlast:Running all pairwise blastp comparisons ...
INFO:PHANTASM.coreGenes.allVsAllBlast:Done.
INFO:PHANTASM.coreGenes.calculateCoreGenes:Calculating core genes ...
INFO:PHANTASM.coreGenes.calculateCoreGenes:Done.
INFO:PHANTASM.coreGenes.makeSpeciesTree:Aligning core genes ...
INFO:PHANTASM.coreGenes.makeSpeciesTree:Done.
The aabrhHardCore_concatenated.afa file is indeed empty. The genomes that I want to analyze are all from the same bacterial genus; only the outgroup is a genome from a strain of a different genus. In this case, would PHANTASM have problems analyzing genomes of the same taxonomic genus?
Are you able to share your genomes with me? Either your outgroup is too distantly related (this should not be the case if the genomes are in the same taxonomic family), or one or more of your genomes is not annotated properly. If it is the latter, this could be why no core genes were detected. PHANTASM relies on annotations in order to extract the coding sequences from your genomes.
Hello Dr. Joe. Does my problem have a solution? :(
Hello Luis,
I have not had a chance to investigate this problem yet. I will get back to you by the end of the week.
A quick glance at your files reveals that several of your genomes are improperly formatted:
* Caulobacter_sp_AfrMine_TT107_68_72.gbff
* Caulobacter_vibrioides_UBA2596.gbff
* Caulobacter_sp_JGI_0001010-J14.gbff
* Caulobacter_vibrioides_GCA_951805235.gbff
* Caulobacter_rhizosphaerae_KCTC_52515.gbff
The warnings you received were BioPython telling you that something is wrong with those files. This is likely why your run failed. Remove those genomes and try again.
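To screen the remaining files before rerunning, a rough text-level check can count annotated coding sequences in each GenBank file. This is only a sketch: it does not reproduce BioPython's parsing or PHANTASM's own requirements, it assumes the standard GenBank feature-table layout (feature key indented by five spaces), and `count_cds` and `flag_unannotated` are hypothetical helper names.

```python
import re
from pathlib import Path

# In the GenBank flat-file layout, feature keys such as CDS start after
# five spaces of indentation.
CDS_LINE = re.compile(r"^ {5}CDS\s")


def count_cds(gbff_path):
    """Return (number of CDS features, number of /translation qualifiers)."""
    n_cds = 0
    n_translation = 0
    with open(gbff_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if CDS_LINE.match(line):
                n_cds += 1
            elif line.lstrip().startswith("/translation="):
                n_translation += 1
    return n_cds, n_translation


def flag_unannotated(directory, pattern="*.gbff"):
    """List files that have no CDS features or no protein translations."""
    flagged = []
    for path in sorted(Path(directory).glob(pattern)):
        n_cds, n_translation = count_cds(path)
        if n_cds == 0 or n_translation == 0:
            flagged.append((path.name, n_cds, n_translation))
    return flagged
```

For example, `flag_unannotated("Refine_Genome_Caulobacter")` would list any file in the input directory that has no CDS features at all. A file that passes this screen can still be malformed in other ways, so the BioPython warnings remain the better signal.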
Hello Dr. Joe,
I already removed the genomes that BioPython flagged as badly annotated, but I still don't get the final files. PHANTASM generates all the blast and fasta files obtained by comparing all the genomes with each other, but the aabrhHardCore.out file is empty. The aabrhHardCore_concatenated.afa and aabrhHardCoreFamToGeneKey.txt files generated in the makeSpeciesTreeWorkDir folder are also empty. What changes should I make?
I have attached an image of what appears after the 10,404 blast files and the 816 fasta files of the genomes are generated. In total, I am analyzing 102 genome annotations.
As before, please upload the log file. If aabrhHardCore.out is empty, then xenoGI (the software package used to calculate core genes) is not finding any core genes.
It is difficult to pinpoint exactly which genome(s) is causing the problem without examining the blastp tables. I recommend checking those files to find out which genome(s) lacks good hits to any other genome(s). The cut-offs are described in the publication.
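One way to triage the blastp tables is to count, per file, how many query genes have at least one hit under an e-value threshold and then look at the files with the lowest counts. The sketch below assumes the tables are in standard BLAST tabular format (12 tab-separated columns, with the e-value in column 11), which may not match what PHANTASM writes. The threshold is a placeholder rather than the published cut-off, and `count_hits` and `rank_tables` are hypothetical helpers.

```python
import csv
from pathlib import Path


def count_hits(blast_table, max_evalue=1e-5):
    """Return (passing rows, distinct query genes with a passing hit)."""
    rows = 0
    queries = set()
    with open(blast_table, newline="", errors="replace") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            # Skip comment lines and anything that is not a 12-column row.
            if len(fields) < 12 or fields[0].startswith("#"):
                continue
            try:
                evalue = float(fields[10])
            except ValueError:
                continue
            if evalue <= max_evalue:
                rows += 1
                queries.add(fields[0])
    return rows, len(queries)


def rank_tables(directory, pattern="*", max_evalue=1e-5):
    """Return (query count, row count, file name) tuples, weakest first."""
    results = []
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            rows, n_queries = count_hits(path, max_evalue)
            results.append((n_queries, rows, path.name))
    return sorted(results)
```

A genome whose tables sit at the top of the ranking against most other genomes is a reasonable first candidate to remove and retest.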
In the meantime, I will reopen this issue so that I can handle this error in a way that makes it clear the set of specified genomes is incompatible with PHANTASM.
Hello Dr. Joe,
It's me again. I managed to run Option 3: known reference genomes. The program has been analyzing and comparing my genomes over the past few days, but today, when the analysis was almost finished, the following error appeared: