Kinggerm / GetOrganelle

Organelle Genome Assembly Toolkit (Chloroplast/Mitochondrial/ITS)
GNU General Public License v3.0
How to deal with the "Disentangling failed" and "Assembly based on scaffolding may not be as accurate as the ones directly exported from the assembly graph" warnings. #101

Open SYSU-SinLee opened 2 years ago

SYSU-SinLee commented 2 years ago

Dear Dr. Jin,

Thank you for developing and maintaining the software. I'm new to plant organelle assembly and have run into some problems using GetOrganelle.

I tried to assemble one chloroplast genome and started with some simple settings:

nohup get_organelle_from_reads.py -1 read_1.clean.fq.gz -2 read_2.clean.fq.gz -t 10 -o organelle -F embplant_pt -R 20 -k 13,21,33,55,85 &

There were some warnings in the log file get_org.log1.txt, and the assembly was broken.

graph

Then I changed the settings as suggested in the FAQ. Through trial and error, I arrived at the following command:

nohup get_organelle_from_reads.py -1 read_1.clean.fq.gz -2 read_2.clean.fq.gz -t 5 -o organelle5 -F embplant_pt --max-reads 1E15 -R 50 -k 13,21,33,45,65,85 -s JB.fa --disentangle-time-limit 7200 -w 40 2>>try5.err &

The warnings mentioned in the title still appear in the log file get_org.log2.txt. What should I do about warnings such as "Disentangling failed: 'Incomplete/Complicated graph: please check around EDGE_11933377!'" in order to get a complete assembly?

graph

I'll appreciate it if you have any suggestions.

Best regards, Li Sen
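
The two warnings quoted in this comment can be pulled out of a log with a short shell helper. This is only a sketch: the function name `show_disentangle_warnings` and the default log path are hypothetical, and the match strings are copied from the messages quoted in this thread rather than from any documented log format.

```shell
# Hypothetical helper: print, with line numbers, every log line that mentions
# one of the two warnings discussed in this thread.
# The default path below is an assumption; pass the real log path as $1.
show_disentangle_warnings() {
    log_file="${1:-organelle5/get_org.log.txt}"
    # -n prefixes each hit with its line number; -E enables the "|" alternation.
    grep -n -E 'Disentangling failed|Assembly based on scaffolding' "$log_file"
}
```

For example, `show_disentangle_warnings organelle5/get_org.log.txt` would list the matching lines, which makes it easier to compare the warnings between runs. Note that `grep` exits with status 1 when nothing matches.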

Kinggerm commented 2 years ago

To use all reads, set '--reduce-reads-for-coverage inf --max-reads inf'.

SYSU-SinLee commented 2 years ago

Sorry for the late reply; the run took a long time to finish. I followed your suggestion and used the following command:

nohup get_organelle_from_reads.py -1 read_1.clean.fq.gz -2 read_2.clean.fq.gz -t 10 -o organelle6 -F embplant_pt --max-reads inf --reduce-reads-for-coverage inf -R 100 -k 13,21,33,45,65,85 -s ref.fa --disentangle-time-limit 10800 2>>getorgan.err &

This time I got many more warnings in the log file get_org.log.txt, and the assembly looked very strange.

Should I reduce the input reads? Or can I use the paired reads file generated in the last round to skip the preparation process before assembly?

graph2

Look forward to your reply and best regards.

Li Sen

Kinggerm commented 2 years ago

You should reduce the input reads. Check the descriptions of --reduce-reads-for-coverage and --max-reads.
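
As a rough sketch of what a rerun with finite read limits might look like, the snippet below only builds and prints a command line so it can be reviewed before running. The values of `MAX_READS` and `COVERAGE_CAP` and the output directory `organelle7` are placeholders chosen for illustration, not values recommended in this thread; the remaining options are copied from the commands posted above.

```shell
# Placeholder limits (assumptions, not recommendations from the thread).
MAX_READS=3E7        # passed to --max-reads
COVERAGE_CAP=500     # passed to --reduce-reads-for-coverage

# Print the full command on one line instead of executing it.
build_getorganelle_cmd() {
    # printf reuses the '%s ' format for every argument, so the options are
    # emitted space-separated on a single line.
    printf '%s ' get_organelle_from_reads.py \
        -1 read_1.clean.fq.gz -2 read_2.clean.fq.gz \
        -t 10 -o organelle7 -F embplant_pt \
        --max-reads "$MAX_READS" \
        --reduce-reads-for-coverage "$COVERAGE_CAP" \
        -R 50 -k 13,21,33,45,65,85
    printf '\n'
}

# Review the printed command, then run it manually once the limits look right.
build_getorganelle_cmd
```

Printing the command first keeps the experiment cheap: the two limits can be adjusted and inspected without launching a run that, as noted above, can take a long time to finish.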