RolandFaure / Hairsplitter

Software that separates very close sequences that have been collapsed during assembly. Uses only long reads.
GNU General Public License v3.0

ERROR in: Now looping and iteratively modify the GFA until all reads align end-to-end on the assembly graph #10

Open octpalacios opened 2 weeks ago

octpalacios commented 2 weeks ago

Hi, I’m using Hairsplitter on a large repetitive genome assembled with Flye, but I encountered the following error:

==== Now looping and iteratively modify the GFA until all reads align end-to-end on the assembly graph ====

Loop iteration 0...

Do you have any suggestions on how to fix this? Thanks in advance.

RolandFaure commented 2 weeks ago

Hi,

This is clearly a bug, which may stem from the very repetitive nature of your genome. I need to fix it. As a workaround, you can skip step 0 of the pipeline by running without the --correct-assembly flag, but this will lower the quality of the result.

To fix this bug, could you show me your command line? Which version of HairSplitter are you using? And would it be possible to share the files so that I can take a look, or are they confidential or too big?

octpalacios commented 2 weeks ago

Hi,

I'm currently using HairSplitter v1.9.10, installed via Conda. I'm about to try updating it. I'm already using the "--correct-assembly" flag. Here’s the command I'm running:

hairsplitter.py -f $READS -i $ASM -x pacbio --correct-assembly -t 16 -o $OUT --clean --resume

Which files do you need? The GFA assembly and the reads? I can possibly share them via email if you keep them confidential. Thanks in advance.

RolandFaure commented 2 weeks ago

Yes, the GFA and the reads would be perfect for me, at roland.faure@irisa.fr. I will delete all the files once the debugging is done. Not using --correct-assembly will skip over the buggy part but will decrease the quality of the results.

octpalacios commented 1 week ago

Hi Roland,

The files are quite large, and I’m working on a way to send them. In the meantime, I’ll try running Hairsplitter without the "--correct-assembly" flag.

Thanks.

RolandFaure commented 1 week ago

Hi,

If this helps, the base qualities in the FASTQ file are not used, so a FASTA file of the reads is enough.
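Since the qualities are not needed, dropping them shrinks the reads file to roughly half its size before sending. Below is a minimal sketch of such a conversion. It assumes a standard four-line-per-record FASTQ (no wrapped sequences); the function name `fastq_to_fasta` and the file names are only illustrative and are not part of HairSplitter.

```python
import gzip


def fastq_to_fasta(fastq_path, fasta_path):
    """Copy the header and sequence of each FASTQ record to a FASTA file.

    Assumes four lines per record (header, sequence, '+', qualities).
    Returns the number of records written.
    """
    # Transparently read gzip-compressed input if the name ends in .gz
    opener = gzip.open if fastq_path.endswith(".gz") else open
    n_records = 0
    with opener(fastq_path, "rt") as fin, open(fasta_path, "w") as fout:
        while True:
            header = fin.readline()
            if not header:          # end of file
                break
            if not header.strip():  # tolerate stray blank lines
                continue
            if not header.startswith("@"):
                raise ValueError(f"unexpected FASTQ header: {header!r}")
            sequence = fin.readline()
            fin.readline()          # '+' separator line, ignored
            fin.readline()          # quality line, ignored
            # FASTA header: same text as the FASTQ header, '@' replaced by '>'
            fout.write(">" + header[1:].rstrip("\n") + "\n")
            fout.write(sequence.rstrip("\n") + "\n")
            n_records += 1
    return n_records


# Example call (hypothetical file names):
# fastq_to_fasta("reads.fastq.gz", "reads.fasta")
```

The output FASTA can then be compressed as usual before being shared.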

Roland
