fg6 / smis

SMIS: Single Molecular Integrative Scaffolding
GNU General Public License v3.0

Segmentation fault #4

Open timkahlke opened 6 years ago

timkahlke commented 6 years ago

Trying to run the new version, but I get a segmentation fault: $bindir/smis_shred -rlength $fakelen -step $step -minlen $minlen $fqfile fakemates_1.fastq fakemates_2.fastq >> $outp

fg6 commented 6 years ago

Hi, sorry you are having problems with SMIS. Could you send me a bit more information on what kind of data you are trying to run, and could you send me the whole output of the pipeline?

Thank you! Francesca

timkahlke commented 6 years ago

Hi Francesca,

I'm trying to run ~200k nanopore reads with average length ~6000 nucleotides to scaffold a 30MB draft genome.

Unfortunately, there is no further output about the error and, because smis_shred does not produce the two artificial fastq files, the rest of the pipeline complains about not having those files (see below).

I also tried to run smis_shred stand-alone on multiple files, always with the same result. I thought it might be a compiler version problem, but I had the same problem with gcc 4.2.1 (Mac), 4.4.7 and 4.9.4 (CentOS 6).
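For anyone reproducing the stand-alone run, a small wrapper can confirm that the process really died from a segmentation fault rather than exiting with an ordinary error code. This is only a sketch: `run_and_report` is a hypothetical helper name, not part of SMIS, and it relies on the POSIX convention that Python's `subprocess` reports a signal-terminated child as a negative return code.

```python
import signal
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and describe how it ended.

    On POSIX, a negative return code means the child was killed by that
    signal number, e.g. -11 for SIGSEGV.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode < 0:
        sig = signal.Signals(-result.returncode)
        status = f"killed by {sig.name}"
    else:
        status = f"exit code {result.returncode}"
    return status, result.stdout, result.stderr


# Self-contained demonstration: a child Python process that sends itself SIGSEGV.
demo_cmd = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
]
demo_status, _, _ = run_and_report(demo_cmd)
print(demo_status)
```

To apply it to this issue, pass the same smis_shred argument list that the pipeline script uses, with the shell variables replaced by their actual values.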

./mysmissv.sh: line 94: 134729 Segmentation fault      $bindir/smis_shred -rlength $fakelen -step $step -minlen $minlen $fqfile fakemates_1.fastq fakemates_2.fastq >> $outp
[bwa_index] Pack FASTA... 0.22 sec
[bwa_index] Construct BWT for the packed sequence...
[BWTIncCreate] textLength=64874730, availableWord=16564580
[BWTIncConstructFromPacked] 10 iterations done. 27323322 characters processed.
[BWTIncConstructFromPacked] 20 iterations done. 50475786 characters processed.
[bwt_gen] Finished constructing BWT in 27 iterations.
[bwa_index] 14.19 seconds elapse.
[bwa_index] Update BWT... 0.15 sec
[bwa_index] Pack forward-only FASTA... 0.13 sec
[bwa_index] Construct SA from BWT and Occ... 5.00 sec
[main] Version: 0.7.12-r1039
[main] CMD: /BWA_DIR/current/bwa index genome.fasta
[main] Real time: 20.695 sec; CPU: 19.696 sec
open: No such file or directory
[bam_sort_core] fail to open file bwa_sorted.bam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[gzclose] buffer error
[samopen] SAM header is present: 64 sequences.
[sam_read1] reference 'ID:bwa   PN:bwa  VN:0.7.12-r1039 CL:/BWA_DIR/current/bwa mem -t 25 -T 50 -A 2 -O -1 -E -1 -B -1 genome.fasta fakemates_1.fastq fakemates_2.fastq
' is recognized as '*'.
[main_samview] truncated file.
Starting at 1518640273
Could not open input BAM files.
wc: WORKING/DIR/smis_scaffolding/tempWork/genome-matepair-*: No such file or directory
in files.txt: smalt (bwa) data line should be <filename> <insert size> <standard deviation> <weight> <read length> <orientation, must = "in" | "out"> 
mv: cannot stat `spinner-sp2b.fasta': No such file or directory

 Scaffolds are in spinner_scaffolds.fasta
 Summary of parameters used are in  /WORKING_DIR/smis_scaffolding/logs/launchedas_1518640243.txt
 Log is in  /PATH/TO/LOG/output_1518640243.txt
fg6 commented 6 years ago

Hi, thanks for the details. In order to debug this problem, could you please reply to the questions below? I will try to fix the problem as quickly as possible.

  1. I am assuming you are running with standard parameters, or are you manually setting any?
  2. Could you tell me which is your shortest read and your longest read?
  3. What does a typical read name look like? (Just to check if the needed string length is longer than what is allowed now.)
  4. I understand smis_shred crashed, but did it start writing the fastq files fakemates_1.fastq and fakemates_2.fastq, or do they not even exist or are they empty?
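Questions 2 and 3 can be answered in one pass over the reads file. The following is a minimal sketch, assuming an uncompressed FASTQ with exactly four lines per record; `summarize_fastq` is a hypothetical helper name and is not part of SMIS.

```python
def summarize_fastq(path):
    """Return read count, shortest/longest read length and the longest read name.

    Assumes plain-text FASTQ with four lines per record:
    header, sequence, '+' separator, quality.
    """
    n_reads = 0
    min_len = None
    max_len = 0
    longest_name = ""
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines between records
            seq = handle.readline().strip()
            handle.readline()  # '+' separator line, ignored
            handle.readline()  # quality line, ignored
            # The read name is the first whitespace-delimited token after '@'.
            fields = header[1:].split()
            name = fields[0] if fields else ""
            n_reads += 1
            seq_len = len(seq)
            if min_len is None or seq_len < min_len:
                min_len = seq_len
            if seq_len > max_len:
                max_len = seq_len
            if len(name) > len(longest_name):
                longest_name = name
    return {
        "reads": n_reads,
        "min_len": min_len,
        "max_len": max_len,
        "longest_name": longest_name,
        "longest_name_len": len(longest_name),
    }
```

Calling `summarize_fastq` on the reads file passed to the pipeline returns the numbers asked for above; the length of the longest read name is reported separately because a name longer than the program expects is one of the suspected causes.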

Also, if you have not done this, download and compile the version in the Sanger organization: https://github.com/wtsi-hpag/smis . We can continue the discussion here though.

Thank you, Francesca

timkahlke commented 6 years ago
  1. Yep, all default parameters

  2. I tried it with two files: one with read lengths of 300-120,000 and another with read lengths of 5,000-120,000

  3. Read names are like this: 0b847c36-b4eb-46a3-a703-c5e44a7b75da

  4. It created the files but both are empty.

I initially tried the version you pointed to but had the same problem. Also, I couldn't add an issue on the other repo; that's why I came here :)

fg6 commented 6 years ago

Hi, sorry, I don't know why, but GitHub deleted my message from yesterday; maybe you received it? Anyway, I mentioned yesterday that I added a test example with E. coli data to the https://github.com/wtsi-hpag/smis repository. Can you please update your repo and try the test? Hopefully this will tell us whether there is a system issue or a data issue.

By the way, thanks for letting me know about the missing issues option on the organization repo; I think I fixed that.

Thank you

lakhujanivijay commented 6 years ago

Hi Francesca

I am working with the latest version and I am encountering the same issue. Could you please help? My command is:

smis_pipeline -nodes 55 sample.fq sample.contigs.fasta sample_scaffolds.fasta

sample.contigs.fasta was generated with the Canu genome assembler. I want to mention that I have been working with a couple of similar samples (same run) for which it worked without any issue. Any pointers would be helpful.

Here are the answers to your questions:

Ques I am assuming you are running with standard parameters, or are you manually setting any ?

Yes

Ques Could you tell me which is your shortest read and your longest read?

file       format  type  num_seqs  sum_len    min_len  avg_len  max_len
sample.fq  FASTQ   DNA   232986    700531577  51       3006.8   46662

Ques Which is the typical read name ? (Just to check if the needed string length is longer than allowed now)

Here are those

m54079_180523_193451/23462106/29326_32162           
m54079_180523_193451/23593208/35167_38003           
m54079_180523_193451/24707440/74529_77365           
m54079_180523_193451/25625293/61082_63918           
m54079_180523_193451/26542980/37738_40574           
m54079_180523_193451/28312032/37228_40064           
m54079_180523_193451/30147290/16309_19145           
m54079_180523_193451/30408936/11588_14424           
m54079_180523_193451/30737165/23904_26740       
m54079_180523_193451/31457850/23431_26267   

Ques I understand smis_shred crashed, but did it started writing the fastq files fakemates_1.fastq fakemates_2.fastq or they don't even exist/are empty?

I could not find those files. Where can I find them?
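The thread does not say which directory the pipeline writes these files to, so one option is a recursive search from the directory where the pipeline was launched. A minimal sketch, with `find_outputs` as a hypothetical helper name and the two file names taken from the smis_shred command quoted earlier in this issue:

```python
from pathlib import Path


def find_outputs(root, names=("fakemates_1.fastq", "fakemates_2.fastq")):
    """Recursively search root for the given file names.

    Returns a dict mapping each matching path to its size in bytes, so a
    size of 0 marks a file that was created but never written to.
    """
    found = {}
    for name in names:
        for path in Path(root).rglob(name):
            if path.is_file():
                found[str(path)] = path.stat().st_size
    return found
```

An empty result from `find_outputs(".")` means the files were never created under that directory; entries with size 0 match the "created but empty" case reported earlier in the thread.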

Looking forward to hearing from you soon.

Regards, Vijay Lakhujani