Kinggerm / GetOrganelle

Organelle Genome Assembly Toolkit (Chloroplast/Mitochondrial/ITS)
GNU General Public License v3.0

Slimming failed and no valid assembly graph found #301

Open poojancf opened 10 months ago

poojancf commented 10 months ago

I am trying to extract an animal mitochondrial genome from short-read data using GetOrganelle. The command is:

get_organelle_from_reads.py -1 /home/pragya/hornbill/great/NZH/analysis/fastp/NZH_illumina_trimmed_R1.fastq.gz -2 /home/pragya/hornbill/great/NZH/analysis/fastp/NZH_illumina_trimmed_R2.fastq.gz -R 10 -k 21,45,65,85,105 -F animal_mt -o /home/pragya/hornbill/great/NZH/analysis/mitogenome/mt_out

The error I run into is "ERROR: Slimming". On checking the corresponding text file, it says:

ERROR: BLAST Database error: Error pre-fetching sequence data

Following the slimming error, another error is reported: "ERROR: No valid assembly graph found!" I am guessing the second error is because of the first error, which hasn't been addressed.

It would be great to get your help in troubleshooting this.

Thanks and regards, Pooja

JianjunJin commented 9 months ago

"I am guessing the second error is because of the first error that hasn't been addressed" - I think so. However, as always, please attach the full log file, not just the error lines. That helps you, helps others, and helps me. A lot of information that is important for troubleshooting is missing here.

I would recommend clearing the previously downloaded database and reinstalling it. Make sure no errors occur in the following steps.

get_organelle_config.py --rm animal_mt
get_organelle_config.py --add animal_mt
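
To make it easier to see whether either of the two steps above fails, they can be wrapped in a small checker. This is only a sketch: `run_step` and `refresh_db` are hypothetical helper names, not part of GetOrganelle, and the only GetOrganelle calls used are the `--rm` and `--add` commands quoted above.

```shell
#!/bin/sh
# Sketch only: run_step and refresh_db are hypothetical helpers,
# not GetOrganelle commands.

# Run a command, echo it first, and report clearly if it fails.
run_step() {
    echo "+ $*"
    "$@" || { echo "FAILED (exit $?): $*" >&2; return 1; }
}

# Remove and re-add one GetOrganelle database; stop at the first failure.
refresh_db() {
    db=$1
    run_step get_organelle_config.py --rm "$db" &&
    run_step get_organelle_config.py --add "$db"
}

# Nothing is executed until refresh_db is called, for example:
#   refresh_db animal_mt
```

If `refresh_db animal_mt` returns a non-zero status, the "FAILED" line on stderr shows which of the two steps went wrong, and that output is worth attaching along with the log file.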

baileyjc commented 4 months ago

Hello, I ran into the same problem as the user above. get_org.log.txt file:

ERROR: Slimming /PATH/K127/assembly_graph.fastg failed. Please check /PATH/K127/slim.log.txt for details.
ERROR: No valid assembly graph found!

slim.log.txt file: ERROR: BLAST Database error: Error pre-fetching sequence data

Attached are my log, shell script, and slurm files. The separate shell file is somewhat unnecessary, but I use it to keep things organized. The assembly_graph.fastg that was created produces an image that looks much like the app image of Bandage, but far more chaotic and disorganized.

Attachments: get_org.log.txt, slim.log.txt, get_mtDNA.slurm.txt, get_mtDNA.sh.txt

I apologize if this issue has already been dealt with. I searched for the error and could not find an exact match, but I did see somewhat similar issues that recommended re-downloading nblast. Maybe that's the issue, but I thought I would add to the conversation in the meantime while I try a few other ways to solve this.

baileyjc commented 4 months ago

I removed and re-added "animal_mt" as suggested above, but it returns the same error.

JianjunJin commented 4 months ago

@baileyjc Thanks for providing such details. I'm not totally sure what is going on. Have you tried upgrading blast to 2.11+? It seems that this database error can also be caused by an incompatibility between different blast versions. Usually people would not have different versions of makeblastdb and the query binary, but it is worth checking just in case. Let me know if you find further information to share on this issue.
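
One way to check the version question raised above is to compare what `blastn -version` and `makeblastdb -version` report. The following is a minimal sketch, assuming that the first line of each program's `-version` output has the form `blastn: 2.11.0+`, that `sort -V` is available (GNU coreutils), and that `blastn` is the query binary in use. `parse_blast_version` and `version_ge` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical, not part of BLAST+ or GetOrganelle.

# Extract "2.11.0" from a first line such as "blastn: 2.11.0+".
parse_blast_version() {
    printf '%s\n' "$1" | head -n 1 |
        sed -E 's/^[^:]+:[[:space:]]*([0-9]+(\.[0-9]+)*).*$/\1/'
}

# Succeed if version $1 >= version $2 (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v blastn >/dev/null 2>&1 && command -v makeblastdb >/dev/null 2>&1; then
    blastn_v=$(parse_blast_version "$(blastn -version)")
    mkdb_v=$(parse_blast_version "$(makeblastdb -version)")
    echo "blastn: $blastn_v, makeblastdb: $mkdb_v"
    # A mismatch between the two binaries is one suspected cause of the database error.
    [ "$blastn_v" = "$mkdb_v" ] || echo "WARNING: blastn and makeblastdb versions differ"
    # 2.11.0 is the minimum suggested in this thread.
    version_ge "$blastn_v" "2.11.0" || echo "WARNING: blastn is older than 2.11"
else
    echo "blastn and/or makeblastdb not found on PATH"
fi
```

Run it in the same environment (for example, inside the same slurm job script) that launches GetOrganelle, since the binaries found on PATH there may differ from those in an interactive shell.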

# BTW, you can always use --continue to resume the previous run so further tests can be faster.

baileyjc commented 4 months ago

Thank you @JianjunJin for your reply! I will upgrade ncbi-blast to see if that resolves this issue! Thank you also for the recommendation of using "--continue", that will definitely save me some time!

baileyjc commented 4 months ago

Hi @JianjunJin, after using a newer version of ncbi-blast together with the --continue option, the run finished. The reason for the older program versions is that I could not install GetOrganelle with conda on my computer or on the cluster I work on, so I used curl to download GetOrganelleDep, which ships those older package versions. Unfortunately, a circular mt genome could not be created, so now I have my own problems to tackle, but that's not an error in your package! I am a little surprised, since I used the same files that the Vertebrate Genome Project used to create the reference genome for my organism (i.e. lots of depth and coverage; for some reason they did not pull out the mtDNA for the genome), but I have some additional ideas for how to tackle this. Have a great day and thank you again for the help!