hwc2021 / GSAT

Graph-based Sequence Assembly Toolkit
MIT License

The mrg.filtered.gfa file was not generated #5

Open IntronQ opened 1 year ago

IntronQ commented 1 year ago

Hello, I am using 5Gb of NGS data and 5GB of ONT data for plant MT assembly. The graphShort pipeline ran normally and generated the og.filtered.gfa file.

1. I ran the command: gsat graphLong -conf sample.conf

When the program finished, only 4 output files had been generated: [image]

As you can see, the mrg.filtered.gfa file is missing, and no error message was reported.

2. You mentioned: “Please note that the params for graphShort should be adjusted for different species, so the resulted OG should be checked and edited in Bandage before it can be used for the next pipeline.” I have no idea how to check or edit the OG in Bandage. What are the criteria? Coverage depth? Length? Or is there other information we should take into account?

How do I resolve this issue? Looking forward to your reply! Thank you!

awesomedeer commented 1 year ago

Any updates? @IntronQ

hwc2021 commented 1 year ago

Hi!

Sorry for taking so long to reply. This seems to be a common issue for many users. I will try to resolve it in the next version.

  1. This happens when the program consumes too much memory and is killed by the system. GSAT's memory usage increases rapidly when og.filtered.gfa is large, e.g., > 1Mb. It is therefore very important to check and edit og.filtered.gfa in Bandage to remove sequences of nuclear origin before the file is used in the next step.
  2. Several criteria can be used to improve og.filtered.gfa in Bandage, including coverage depth (you can also modify the minDep1 and minDep2 params in example.conf) and connections to the target sequence. Ideally, some target sequences can be identified from depth information alone. When the graph is more complex, the target sequence can also be determined by comparison with a reference organelle genome. After these steps, og.filtered.gfa should be much cleaner and much smaller.
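Before opening the graph in Bandage, it can help to list each segment's length and depth so that candidates for removal stand out. The following is a minimal sketch, not part of GSAT: it assumes og.filtered.gfa is a standard GFA1 file and that depth is stored in one of the optional tags DP, dp, KC or RC (which tag GSAT actually writes is an assumption; check your own file). The function names are hypothetical.

```python
# Sketch only: summarize segments of a GFA1 file by length and depth tag.
# Assumption: depth, if present, is in one of these optional tags.
DEPTH_TAGS = ("DP", "dp", "KC", "RC")


def parse_segments(gfa_text):
    """Yield (name, length, tags) for every segment (S) line of GFA1 text."""
    for line in gfa_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3 or fields[0] != "S":
            continue  # skip headers, links, paths and malformed lines
        name, seq = fields[1], fields[2]
        tags = {}
        for field in fields[3:]:
            parts = field.split(":", 2)  # TAG:TYPE:VALUE
            if len(parts) == 3:
                tags[parts[0]] = parts[2]
        # "*" means the sequence is not stored inline; fall back to LN
        length = len(seq) if seq != "*" else int(tags.get("LN", 0))
        yield name, length, tags


def segment_depth(tags):
    """Return the first depth-like tag as a float, or None if absent."""
    for tag in DEPTH_TAGS:
        if tag in tags:
            return float(tags[tag])
    return None


def total_length(gfa_text):
    """Total number of bases over all segments."""
    return sum(length for _, length, _ in parse_segments(gfa_text))


def print_report(path):
    """Print segments sorted by length, longest first, plus the total."""
    with open(path) as handle:
        text = handle.read()
    segments = sorted(parse_segments(text), key=lambda s: s[1], reverse=True)
    for name, length, tags in segments:
        print(f"{name}\t{length}\t{segment_depth(tags)}")
    print(f"total\t{total_length(text)}")
```

Calling `print_report("og.filtered.gfa")` prints one line per segment. Segments whose depth differs strongly from the rest are the ones worth inspecting in Bandage first.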

Please try the above suggestions, which should help solve your problem.
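To check whether the run was killed by the system, as described in point 1, one option is to launch the command from a small wrapper and inspect its exit status. This is a minimal sketch under some assumptions: it runs on a POSIX system, and the kill is visible on the top-level gsat process (if gsat launches sub-processes, the kill may only appear in the kernel log). The helper names are hypothetical.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a process return code into a short description."""
    if returncode == 0:
        return "exited normally"
    # subprocess reports death by signal N as -N; shells report 128 + N
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return f"exited with status {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        return f"exited with status {returncode}"
    return f"terminated by {name}"


def run_and_report(cmd):
    """Run cmd (a list of arguments) and describe how it ended."""
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)
```

For example, `run_and_report(["gsat", "graphLong", "-conf", "sample.conf"])` returning "terminated by SIGKILL" would be consistent with the process having been killed for using too much memory, although SIGKILL can have other causes as well.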