bcgsc / ChopStitch

Finding putative exons and constructing splicegraphs using Trans-ABySS contigs
http://www.bcgsc.ca/platform/bioinfo/software/chopstitch
GNU General Public License v3.0

GFA output #7

Open ekg opened 5 years ago

ekg commented 5 years ago

I'm interested in using splice graphs as reference systems in vg. ChopStitch's output could be used for this directly if it could write GFA. I see that there is already output in dot, so this may be easy to do. Is there anything more complex to consider than copying https://github.com/bcgsc/ChopStitch/blob/master/MakeSplicegraph.py#L339-L359 and modifying it to produce GFA instead of dot?

hamzakhanvit commented 5 years ago

Hi,

Yes, you can look at write_dot(). However, an easier way would be to look at FindSubcomponents.py, which takes the DOT file produced by MakeSplicegraph.py and, with -w, writes a splice subgraph file in which each subgraph is the splice graph for a single gene. By default, it generates a file mapping putative exons to genes, which could also be parsed to generate GFA output. I wrote this script to adapt the DOT output to the requirements of different users. Please feel free to let me know if you have any other questions.

Cheers, Hamza
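
For reference, a minimal sketch of the DOT-to-GFA conversion discussed in this thread. It is not ChopStitch code. The function name dot_to_gfa is hypothetical, and the sketch assumes the DOT file has one directed edge per line (for example `"exon_1" -> "exon_2";`), which may not match what MakeSplicegraph.py actually writes. The DOT text is assumed to carry no exon sequences, so segments are written with the `*` placeholder and every link uses forward orientation with a `0M` overlap. Nodes that appear in no edge are skipped.

```python
import re

# Assumed edge syntax:  "exon_1" -> "exon_2";   or   a -> b [label="x"];
EDGE_RE = re.compile(r'^\s*"?([^"\s\[;]+)"?\s*->\s*"?([^"\s\[;]+)"?')


def dot_to_gfa(dot_text):
    """Convert simple one-edge-per-line DOT text into GFA 1.0 text."""
    nodes = []    # segment names, in first-seen order
    seen = set()
    edges = []    # (source, target) pairs

    for line in dot_text.splitlines():
        match = EDGE_RE.match(line)
        if not match:
            continue  # skip "digraph ... {", "}", and attribute-only lines
        src, dst = match.groups()
        for name in (src, dst):
            if name not in seen:
                seen.add(name)
                nodes.append(name)
        edges.append((src, dst))

    out = ["H\tVN:Z:1.0"]
    # S lines: the sequence is unknown here, so GFA's "*" placeholder is used.
    out += ["S\t{}\t*".format(name) for name in nodes]
    # L lines: forward-to-forward links with a zero-length overlap.
    out += ["L\t{}\t+\t{}\t+\t0M".format(src, dst) for src, dst in edges]
    return "\n".join(out) + "\n"
```

Calling dot_to_gfa(open(path).read()) on a DOT file returns the GFA text as a string. A consumer that needs real sequence would have to replace the `*` placeholders with the putative exon sequences from another ChopStitch output.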