wtsi-hpag / Scaff10X

Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
MIT License

scaff_bwa segmentation fault #18

Open shannonekj opened 4 years ago

shannonekj commented 4 years ago

Hi there,

I am encountering a segmentation fault when running Scaff10X.

[main] Version: 0.7.17-r1188
[main] CMD: /home/sejoslin/miniconda3/envs/scaffold_10x/bin/Scaff10X/scaff-bin/bwa mem -t 16 tarseq.fastq /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/00-raw_data/data-10X_M/Male2_S63_L004_R1_001.fastq.gz /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/00-raw_data/data-10X_M/Male2_S63_L004_R2_001.fastq.gz
[main] Real time: 140745.891 sec; CPU: 2245179.580 sec
Segmentation fault
Error running command: /home/sejoslin/miniconda3/envs/scaffold_10x/bin/Scaff10X/scaff-bin/scaff_bwa -edge 50000 tarseq.tag align.dat align2.dat > try.out

Here is the stderr file associated with the error: test_assembly.j23957420.run_scaff10x.hi.err.txt

Here is the stdout file associated with the job: test_assembly.j23957420.run_scaff10x.hi.out.txt

I have the following files in the tmp_rununik_10443 directory:

-rw-rw-r-- 1 sejoslin millermrgrp    0 Jul  3 06:46 align2.dat
-rw-rw-r-- 1 sejoslin millermrgrp  23G Jul  3 06:46 align.dat
-rw-rw-r-- 1 sejoslin millermrgrp 930M Jul  1 15:31 tarseq.fastq
-rw-rw-r-- 1 sejoslin millermrgrp  257 Jul  1 15:38 tarseq.fastq.amb
-rw-rw-r-- 1 sejoslin millermrgrp 201K Jul  1 15:38 tarseq.fastq.ann
-rw-rw-r-- 1 sejoslin millermrgrp 465M Jul  1 15:38 tarseq.fastq.bwt
-rw-rw-r-- 1 sejoslin millermrgrp 117M Jul  1 15:38 tarseq.fastq.pac
-rw-rw-r-- 1 sejoslin millermrgrp 233M Jul  1 15:40 tarseq.fastq.sa
-rw-rw-r-- 1 sejoslin millermrgrp 187K Jul  1 15:31 tarseq.tag
-rw-rw-r-- 1 sejoslin millermrgrp    0 Jul  1 15:31 try.out

and my align.dat file looks like this:

(base) sejoslin@farm:tmp_rununik_10443$ head align.dat
A00351:291:HVMC5DSXX:4:1101:1307:1000 69 tarseq_648 132029 0
A00351:291:HVMC5DSXX:4:1101:1398:1000 99 tarseq_462 123666 60
A00351:291:HVMC5DSXX:4:1101:1524:1000 81 tarseq_236 424566 0
A00351:291:HVMC5DSXX:4:1101:1597:1000 97 tarseq_874 104495 0
A00351:291:HVMC5DSXX:4:1101:1687:1000 99 tarseq_16 418743 60
A00351:291:HVMC5DSXX:4:1101:1940:1000 97 tarseq_2149 50794 0
A00351:291:HVMC5DSXX:4:1101:2284:1000 99 tarseq_338 173397 60
A00351:291:HVMC5DSXX:4:1101:2302:1000 83 tarseq_302 99246 60
A00351:291:HVMC5DSXX:4:1101:2483:1000 83 tarseq_5085 17812 60
A00351:291:HVMC5DSXX:4:1101:2591:1000 99 tarseq_102 116777 41
(base) sejoslin@farm:tmp_rununik_10443$ tail align.dat
A00351:291:HVMC5DSXX:4:2678:13132:36949 99 tarseq_1383 26746 60
A00351:291:HVMC5DSXX:4:2678:13277:36949 65 tarseq_1446 66935 60
A00351:291:HVMC5DSXX:4:2678:13313:36949 99 tarseq_569 12813 60
A00351:291:HVMC5DSXX:4:2678:13639:36949 83 tarseq_1401 40168 27
A00351:291:HVMC5DSXX:4:2678:13747:36949 99 tarseq_84 92647 60
A00351:291:HVMC5DSXX:4:2678:14091:36949 99 tarseq_445 203003 60
A00351:291:HVMC5DSXX:4:2678:14561:36949 99 tarseq_512 71920 11
A00351:291:HVMC5DSXX:4:2678:14597:36949 83 tarseq_411 18587 60
A00351:291:HVMC5DSXX:4:2678:14868:36949 97 tarseq_2112 1850 10
A00351:291:HVMC5DSXX:4:2678:15157:36949 83 tarseq_613 158418 60
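A quick sanity check on align.dat could look like the sketch below. It assumes, based only on the head and tail output above, that every record should have exactly five whitespace-separated fields (read name, flag, contig, position, mapping quality). The helper name `count_malformed` is made up for illustration and is not part of Scaff10X, and a clean result would not rule out other causes of the segmentation fault.

```shell
# Hypothetical helper (not part of Scaff10X): count the records in an
# align.dat-style file that do not have exactly five fields.
# Assumption: five whitespace-separated fields per record, as in the
# head/tail output above. Blank lines are counted as malformed.
count_malformed() {
    awk 'NF != 5 { bad++ } END { print bad + 0 }' "$1"
}

# Example usage, run from inside the tmp directory:
# count_malformed align.dat
```

A non-zero count would point at truncated or corrupted records that might be worth inspecting before rerunning scaff_bwa.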

I ran Scaff10X with the following parameters:

#SBATCH -J hi_scf10
#SBATCH -e slurm/test_assembly.j%j.run_scaff10x.hi.err
#SBATCH -o slurm/test_assembly.j%j.run_scaff10x.hi.out
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --mem=480G
#SBATCH --time=06-10:08:07
#SBATCH -p bigmemh

and used the following command to run scaff10x:

${scaf_bin}/scaff10x \
    -nodes ${threads} \
    -align bwa \
    -matrix 2000 \
    -reads 12 \
    -link 10 \
    -plot barcode_length.png \
    ${asm} ${R1} ${R2} ${output}

I didn't think this step needed much memory, and indeed it fails at the same place if I use a partition with less memory (62000M/node).

Please advise! Thank you for your time :)

jensbast commented 3 years ago

Hi, I get exactly the same error with exactly the same files in the tmp (with file size 0 for both try.out and align2.dat). Was there a solution?

Best and thanks

zning-sanger commented 3 years ago

Many thanks for all your bug-reporting efforts. I spent some time on this and, sorry to say, I haven't found the bug. The VGP has used scaff10x a lot and doesn't seem to have had this issue, and I have never run into it myself. If any user who hit this problem could put the data (reads and assembly) somewhere I can download it, I would be really grateful: I need to reproduce the issue before I can fix the bug. For the time being, please use scaffolding_reads to decode the barcodes and then use the paired reads for scaffolding.

Best regards,

Zemin

shannonekj commented 3 years ago

Hi all,

Apologies for not posting when I got things up and running. I believe I "solved" this problem (or at least I no longer run into it) by supplying the input 10X FASTQ files through -data input.dat (see format below) and symlinking the reference genome into the working directory.

input.dat file format:

q1=/group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/00-raw_data/data-10X_M/Male2_S63_L004_R1_001.fastq.gz
q2=/group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/00-raw_data/data-10X_M/Male2_S63_L004_R2_001.fastq.gz

Run command:

scaff10x -nodes $((SLURM_CPUS_PER_TASK-2)) -longread 1 -gap 100 -matrix 2000 -reads 10 -link 8 -score 20 -edge 50000 -block 50000 -data input.dat sym-linked.reference.fasta output.scaff10x.fasta

which was run in the following working directory: /group/millermrgrp2/shannon/projects/assembly_genome_Hypomesus-transpacificus/03-assemblies/sandbox_hicanu/
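The setup described above could be scripted roughly as follows. This is only a sketch of the workaround, not something shipped with Scaff10X: the helper names `write_input_dat` and `link_reference` are made up for illustration, and the default file names simply mirror the ones used in the run command above.

```shell
# Hypothetical helper: write a two-line input.dat in the q1=/q2= format
# shown above, for use with scaff10x's -data option.
#   $1 = path to R1 FASTQ, $2 = path to R2 FASTQ, $3 = output file (optional)
write_input_dat() {
    printf 'q1=%s\nq2=%s\n' "$1" "$2" > "${3:-input.dat}"
}

# Hypothetical helper: symlink the assembly into the current working
# directory so scaff10x is given a local file name.
#   $1 = path to the assembly, $2 = link name (optional)
link_reference() {
    ln -s "$1" "${2:-sym-linked.reference.fasta}"
}

# Example usage, run from the intended working directory; R1, R2 and asm
# are assumed to hold full paths to the reads and the assembly:
# write_input_dat "$R1" "$R2"
# link_reference "$asm"
```

After these two steps, the scaff10x command above can be run unchanged from the same directory.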

As a note I installed scaff10x with conda.

@zning-sanger I'd be happy to upload the original files I used to a server if that would be useful. I haven't tried to reproduce the error since then, but maybe knowing how I worked around it helps?