Open atabeerk opened 1 year ago
@atabeerk Hi, I have the same problem: Bandage warns me that the format is not correct. See the attached warning. How should I solve it?
Thanks, Jianshu
Hi @jianshu93, thanks for reaching out. We will look into this.
Ataberk
@atabeerk,
Thanks for the quick response. The code is well written; I can run it without any problems and it produces the expected output. The only issue is the output format (it feels like a small bug). Let me know if you want my data to reproduce the error.
best,
Jianshu
@jianshu93, if you can share the strain_contigs.gfa file that produces the error you mention, that would be very helpful. Feel free to attach the files to this issue or send an email to ataberk@umd.edu if that is what you prefer.
Ataberk
Hi @atabeerk, I shared with you the reads, the metaFlye assembly graph, and the Strainy graph output. I followed exactly the same commands as you suggested:
flye --pacbio-hifi m84137_240709_192956_s1.hifi_reads.bc2076--bc2076.bam.fastq.gz -o metaflye -t 30 --meta --no-alt-contigs --keep-haplotypes -I 0
./strainy.py --gfa_ref assembly_graph.gfa --fastq m84137_240709_192956_s1.hifi_reads.bc2076--bc2076.bam.fastq.gz --mode hifi -t 30 --output strainy_out
I've shared the input and output files with you via Google Drive; let me know if you cannot access them.
Best, Jianshu
Hi @jianshu93, I got the files. I will keep you updated.
Best, Ataberk
In the output GFA file (strainy_final.gfa), some unitigs referenced in L lines do not have corresponding S lines. This may happen if the unitigs were removed at some point (along with their S lines) but the L lines that use them were left in place.
The attached file is the output for the mock ONT dataset. Some unitigs with this issue:
edge_1291_139, edge_956_40, edge_874_33, edge_3054_s1_3041692, edge_3024_11380, edge_1553_1030193, edge_2864_1000769
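For anyone who wants to check their own output for the same problem, here is a minimal sketch of a checker. It is not part of Strainy; the function name `find_dangling_links` is hypothetical. It assumes a tab-separated GFA1 file in which S lines carry the segment name in the second column and L lines carry the two segment names in the second and fourth columns. It only inspects L lines, so other record types that reference segments are not checked.

```python
from typing import Iterable, List, Set, Tuple


def find_dangling_links(gfa_lines: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Return (missing segment names, offending L lines) for GFA1 text.

    Hypothetical helper for illustration only, not Strainy code.
    """
    segments: Set[str] = set()
    links: List[Tuple[str, str, str]] = []

    # Collect everything first, because L lines may precede their S lines.
    for raw in gfa_lines:
        line = raw.rstrip("\r\n")
        fields = line.split("\t")
        if fields[0] == "S" and len(fields) >= 2:
            segments.add(fields[1])
        elif fields[0] == "L" and len(fields) >= 5:
            # GFA1 link: L <from> <from_orient> <to> <to_orient> <overlap>
            links.append((fields[1], fields[3], line))

    missing: Set[str] = set()
    bad_links: List[str] = []
    for src, dst, line in links:
        absent = {name for name in (src, dst) if name not in segments}
        if absent:
            missing |= absent
            bad_links.append(line)
    return missing, bad_links
```

To run it on a real file, pass an open file handle, for example `with open("strainy_final.gfa") as fh: missing, bad = find_dangling_links(fh)`, then print `sorted(missing)`. If the result is non-empty, the listed unitigs appear in L lines without a matching S line, which is the situation reported above.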