bcgsc / LongStitch

Correct and scaffold assemblies using long reads
GNU General Public License v3.0

core dumped error in long-to-linked-pe step #63

Closed nirvana693 closed 1 year ago

nirvana693 commented 1 year ago

Hi Lauren,

I am testing the tool on a human genome with ULONT data (>100 kb reads). It ran without any errors on the test dataset, but on my own dataset I hit a core dumped error in the long-to-linked-pe step, and all subsequent steps failed with different error codes. Here is the error (the full log is attached):

bash: line 1: 1714306 Aborted                 (core dumped) /home/sarashettp/miniconda3/envs/WGS/bin/share/arcs-1.2.2-0/Examples//../src/long-to-linked-pe -l 250 -t 55 -m2000 I002C_ULONT_no_chim_100k.fq.gz
     1714307 Done                    | arcs --arks -v -D -B 20 -f hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds.renamed.fa -c 4 -m 8-10000 -r 0.05 -e 30000 -z 1000 -j 0.05 -k 20 -t 55 -d 0 --gap 100 -b hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds_c4_m8-10000_cut250_k20_r0.05_e30000_z1000 -u I002C_ULONT_no_chim_100k.barcode-multiplicity.tsv /dev/stdin
make[1]: *** [/home/sarashettp/miniconda3/envs/WGS/bin/share/arcs-1.2.2-0/Examples/arcs-make:267: hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds_c4_m8-10000_cut250_k20_r0.05_e30000_z1000_original.gv] Error 134
make[1]: *** Deleting file 'hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds_c4_m8-10000_cut250_k20_r0.05_e30000_z1000_original.gv'
make[1]: Leaving directory '/mnt/projects/miles/ARG/EBV_temp/temp/LongStitch'
make: *** [/home/sarashettp/miniconda3/envs/WGS/bin/share/longstitch-1.0.4-0/longstitch:301: hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds_c4_m8-10000_cut250_k20_r0.05_e30000_z1000_l4_a0.3.scaffolds.fa] Error 2

Despite the error, I do have a scaffold file softlinked to: hap1.scaffolds.fa -> hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds.fa

The command I used: longstitch tigmint-ntLink-arks draft=hap1_contigs reads=I002C_ULONT_no_chim_100k G=3100000000 w=150 k_ntLink=24 t=55 out_prefix=hap1

I am running the tool on a server with 2TB RAM. longstitch_log.txt
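For reference, the "Error 134" that make reports in the log above fits the usual shell convention of returning 128 plus the signal number when a process is killed by a signal: 134 - 128 = 6, which is SIGABRT on Linux, consistent with the "Aborted (core dumped)" message. A minimal sketch for decoding such a status follows; the helper name describe_exit_status is hypothetical and not part of LongStitch or ARCS.

```shell
# Hypothetical helper (not part of LongStitch/ARCS): interpret an exit status.
# Shells report 128 + N when a command was terminated by signal N.
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix
        printf 'killed by signal %s\n' "$(kill -l "$((status - 128))")"
    else
        printf 'exited with code %s\n' "$status"
    fi
}
```

Calling describe_exit_status 134 reports signal ABRT, which only says that the process aborted itself, not why it did so.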

lcoombe commented 1 year ago

Hi @nirvana693,

It looks like the error happened at the second (ARKS) scaffolding stage. The good news is that everything prior to that (Tigmint + ntLink) looks fine and should be available in hap1_contigs.k24.w150.tigmint-ntLink.longstitch-scaffolds.fa. It's odd that you're seeing the core dump, because long-to-linked-pe already ran successfully on the same input reads in the Tigmint-long stage.

Just to confirm that the issue is due to the ARKS long-to-linked-pe step, could you just run:

/home/sarashettp/miniconda3/envs/WGS/bin/share/arcs-1.2.2-0/Examples//../src/long-to-linked-pe -l 250 -t 55 -m2000 I002C_ULONT_no_chim_100k.fq.gz |pigz > 002C_ULONT_no_chim_100k.lr.fq.gz

This will only run that one step (I added the compressing and piping to file - change that as you see fit), and help with a sanity check that the core dump is from this, not from the ARKS pairing stage itself.

Thank you for your interest in LongStitch! Lauren

nirvana693 commented 1 year ago

Hi Lauren,

The long-to-linked-pe completed without any errors. The command used was:

/home/sarashettp/miniconda3/envs/WGS/bin/share/arcs-1.2.2-0/Examples//../src/long-to-linked-pe -l 250 -t 55 -m2000 I002C_ULONT_no_chim_100k.fq.gz |pigz > 002C_ULONT_no_chim_100k.lr.fq.gz

lcoombe commented 1 year ago

OK, that's good. And the file looks OK (i.e., if you look at a few lines, are they well-formatted FASTQ entries)?
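One possible way to check the file beyond eyeballing a few lines is sketched below. It assumes plain four-line FASTQ records (no wrapped sequence lines), and the function name check_fastq is hypothetical, not something shipped with ARCS or LongStitch.

```shell
# Hypothetical helper: sanity-check a gzipped FASTQ file.
# Assumes every record is exactly 4 lines: @header, sequence, +, qualities.
check_fastq() {
    gzip -dc "$1" | awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" {
            printf "line %d: header does not start with @\n", NR; bad = 1; exit
        }
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" {
            printf "line %d: separator does not start with +\n", NR; bad = 1; exit
        }
        NR % 4 == 0 && length($0) != seqlen {
            printf "line %d: quality length differs from sequence length\n", NR; bad = 1; exit
        }
        END {
            if (!bad && NR == 0)     { print "no records found"; bad = 1 }
            if (!bad && NR % 4 != 0) { print "last record is truncated"; bad = 1 }
            if (!bad) printf "OK: %d records\n", NR / 4
            exit bad   # non-zero exit status when a problem was found
        }'
}
```

For example: check_fastq 002C_ULONT_no_chim_100k.lr.fq.gz. This only checks the record layout, so it will not catch every problem that could upset a downstream tool.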

So now, you can try running the next half of the piped command:

arcs --arks -v -D -B 20 -f hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds.renamed.fa -c 4 -m 8-10000 -r 0.05 -e 30000 -z 1000 -j 0.05 -k 20 -t 55 -d 0 --gap 100 -b hap1_contigs.cut250.tigmint.fa.k24.w150.z1000.ntLink.scaffolds_c4_m8-10000_cut250_k20_r0.05_e30000_z1000 -u I002C_ULONT_no_chim_100k.barcode-multiplicity.tsv 002C_ULONT_no_chim_100k.lr.fq.gz

Let me know if that is successful, and if not, please send the full log.

nirvana693 commented 1 year ago

Hi Lauren,

The above arcs command was successfully completed.

lcoombe commented 1 year ago

Hi @nirvana693,

I'm glad to hear that, although unfortunately I'm now having trouble explaining your core dump, since I basically just asked you to run each individual step of the piped command that failed. Were you running everything in the identical environment?

Regardless, to finish the run now, you should be able to just run the same longstitch command as before; it should recognize the files already present and start at the correct step (after the ARCS step above). To check that it will commence at the expected step, you can first do a dry run (add -n to your longstitch command).

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your interest in LongStitch!