wtsi-hpag / Scaff10X

Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
MIT License
scaff_reads: Segmentation fault #17

Open zj-wien opened 4 years ago

zj-wien commented 4 years ago

Hi there,

scaff_reads produces only genome-BC_1.fastq.gz, not genome-BC_2.fastq.gz. The details are listed below. Please help me fix it.

Thanks a lot

command line:

Scaff10X/src/scaff_reads -nodes 30 input.dat genome-BC_1.fastq.gz genome-BC_2.fastq.gz

error information:

58211 Segmentation fault Scaff10X/src/scaff-bin/scaff_BC-reads-2 MySample_S1_L001_R1_001.fastq.name MySample_S1_L001_R2_001.fastq MySample_S1_L001_R2_001.fastq.RC2 > try.out

cpus & mem

#SBATCH --cpus-per-task=30

#SBATCH --mem=500G

input.dat

q1=MySample_S1_L001_R1_001.fastq (base size: 150Gb)

q2=MySample_S1_L001_R2_001.fastq (base size: 150Gb)

fastq

@CL100073098L1C001R001_7
CCCAATGGGACAATGGCAGGGCTGCCTATGGGGGAACCGGCATTGCTGTGAGGGTCGGGGGGACTATTGTATCTGTAAAGGATCAGCCATGGCCAGAAGTAGGTTTCTGAGCTGAGCGGTGACAGACTGTGCCCTTTTCCTGGCAGGAGG
+
@:GFFDFFFFF9FFFCFGFFDFGEFFE@FFFFFDEF@FF:F?DFGFF@FGFFFEDFEFFFGF;=FFFFGFFGGFFEBFCGE5F>BCFFBFFFFFFCF1BFAGFCDEFEF7FEFF,FGFADF3DBD=FFC84FFFGBFF7GF:DFDFF=EF
@CL100073098L1C001R001_9
CTGCGTTTCGCGGCATGCTTTCTAGAAGCTTAAGTTGTCTGTTTTTCCACCCTCCAAATTGTCTGACCACTTGTTGATAGTAGCAATTCCATTTTAATACCTTATGTCATAAGTATTTTAAGCAACCAAAAGATTCCTTTATTTTTTGCA
+
FFFGFFFFFGGB;FFGGFEGEGEEGGCFGEEE=GFFGEGEGFFGGCEGFGDFGBFFBBGFEGDFEFEGBEFBBGGG:GGBDFDFDGGGF?ECF@F@GEAEAEEEEEGF>GDFDEEEECFF,GFFFE1FGGBEGCEG@EAC?DCGEAEB5@

zning-sanger commented 4 years ago

Hi,

Thanks for the email. When you run

/lustre/scratch117/sciops/team117/hpag/zn1/project/bird/hummingbird/QC/10x/bCalAnn1_S1_L001_R1_001.fastq.gz

this will produce a temporary directory which contains all the files. Could you do ls -lrt and send me the file list?

Best regards,

Zemin Ning

zj-wien commented 4 years ago

Oops~ I deleted them all.

I think scaff_reads can only handle a limited volume of reads, because it works after I split the giant fastq file into ~15 files (6Gb gzipped file).
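For anyone who wants to try the same workaround, here is a minimal Python sketch of the splitting step. It is not part of Scaff10X: the function name `split_paired_fastq`, the chunk file names, and the `reads_per_chunk` parameter are all made up for illustration. It assumes uncompressed input, four lines per FASTQ record, and R1/R2 records in the same order. It writes R1 and R2 chunks in step so the pairs stay matched.

```python
import gzip
from itertools import islice
from pathlib import Path


def split_paired_fastq(r1_path, r2_path, out_dir, reads_per_chunk):
    """Split paired FASTQ files into gzipped chunks, keeping R1/R2 in step.

    Hypothetical helper (not part of Scaff10X). Assumes 4-line FASTQ records.
    Returns a list of (r1_chunk_path, r2_chunk_path) tuples.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    out1 = out2 = None
    n_reads = 0
    try:
        with open(r1_path) as r1, open(r2_path) as r2:
            while True:
                # Read one 4-line record from each mate file.
                rec1 = list(islice(r1, 4))
                rec2 = list(islice(r2, 4))
                if not rec1 and not rec2:
                    break  # both files ended together
                if len(rec1) != 4 or len(rec2) != 4:
                    raise ValueError("R1/R2 out of step or truncated record")
                if n_reads % reads_per_chunk == 0:
                    # Start a new pair of chunk files.
                    if out1 is not None:
                        out1.close()
                        out2.close()
                    index = len(chunks) + 1
                    p1 = out_dir / f"chunk{index:03d}_R1.fastq.gz"
                    p2 = out_dir / f"chunk{index:03d}_R2.fastq.gz"
                    out1 = gzip.open(p1, "wt")
                    out2 = gzip.open(p2, "wt")
                    chunks.append((p1, p2))
                out1.writelines(rec1)
                out2.writelines(rec2)
                n_reads += 1
    finally:
        if out1 is not None:
            out1.close()
            out2.close()
    return chunks
```

Called as `split_paired_fastq("MySample_S1_L001_R1_001.fastq", "MySample_S1_L001_R2_001.fastq", "chunks", reads_per_chunk)`, it streams one record at a time, so memory use stays small regardless of input size. The right `reads_per_chunk` depends on the data, and each chunk pair would still need its own q1/q2 entry or run.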

-- Zongji Wang

zning-sanger commented 4 years ago

You can run scaff10x directly rather than running scaff_reads to get the two read files; basically, you don't need them. Using "-data file.dat" saves disk space.