wtsi-hpag / Scaff10X

Pipeline for scaffolding and breaking a genome assembly using 10x genomics linked-reads
MIT License

another seg fault issue #14

Open macmanes opened 4 years ago

macmanes commented 4 years ago

sorry..

step 1. binary executes, results in help menu.

(scaff10x) [macmanes@premise genome]$ scaff10x
Program: scaff10x - Genome Scaffolding using 10X Chromium Data
Version: 4.2

Usage: scaff10x -nodes 30 -longread 1 -gap 100 -matrix 2000 -reads 10 -score 20 -edge 50000 -link 8 -block 50000 -plot barcode_lengtg.png <input_assembly_fasta/q_file> <Input_read_1>> <Input_read_2> <Output_scaffold_file>
       nodes    (30)    - number of CPUs requested
       matrix   (2000)  - relation matrix size
       reads    (10)    - step 1 and 2: minimum number of reads per barcode
       link     (8)     - step 1 and 2: minimum number of shared barcodes
       read-s1 (8)     - step 1: minimum number of reads per barcode
       read-s2 (10)    - step 2: minimum number of reads per barcode
       link-s1  (8)     - step 1: minimum number of shared barcodes
       link-s2  (10)    - step 2: minimum number of shared barcodes
       edge     (50000) - edge length to consider for scaffolding
       score    (20)    - minimum average mapping score on a barcode covered area
       block    (50000) - length to determine for nearest neighbours
       longread (1)     - contigs were produced using PacBio or ONT reads
                (0)     - contigs were produced from short reads such as Illumina
       file     (0)     - do not output the sam file in order to save disk space
                (1)     - sam file from bwa is saved
       gap      (100)   - gap size in building scaffold
       sam      ()      - previously aligned sam file by bwa
       bam      ()      - previously aligned bam file by longrange
       plot     (barcode_lengtg.png) - output image file on barcode length distributions

step 2. seg fault with dat file

(scaff10x) [macmanes@premise genome]$ scaff10x -data reads.dat Neotomodon_alstoni_10x_v6.fasta Neotomodon_alstoni_10x_v7.fasta
Segmentation fault

here is the dat file

q1=/mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/reads/NEAL_S1_L006_R1_001.fastq.gz
q2=/mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/reads/NEAL_S1_L006_R2_001.fastq.gz

step 3. segfault with passing reads on the command line

(scaff10x) [macmanes@premise genome]$ scaff10x Neotomodon_alstoni_10x_v6.fasta ../reads/NEAL_S1_L006_R1_001.fastq.gz ../reads/NEAL_S1_L006_R2_001.fastq.gz  Neotomodon_alstoni_10x_v7.fasta
Segmentation fault

zning-sanger commented 4 years ago

Hi Matt,

I have seen anything wrong here for the usage of the code. For any run of scaff10x, it produces a directory such as tmp_rununik_***** to process all the files. Could you tell me the contents of the directory, by doing "ls -lrt"?

Thanks,

Zemin

zning-sanger commented 4 years ago

Sorry I haven't seen anything wrong.

macmanes commented 4 years ago

It's empty:

(scaff10x) [macmanes@premise genome]$ ls -lthr tmp_rununik_165217
total 0

macmanes commented 4 years ago

@zning-sanger, do you have a small test dataset I could use to make sure this is not a problem with the installation?

macmanes commented 4 years ago

Scratch that. I've identified the bug:

/mnt/lustre/macmaneslab/macmanes/Scaff10X/src/scaff-bin/scaff10x  /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/genome/Neotomodon_alstoni_10x_v6a.fasta /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/reads/NEAL_S1_L006_R1_001.fastq.gz /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/reads/NEAL_S1_L006_R2_001.fastq.gz  Neotomodon_alstoni_10x_v234.fasta
File not in the working directory!
Input target assembly file1: /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/genome/Neotomodon_alstoni_10x_v6a.fasta
File not in the working directory!
Input read1 file: /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/reads/NEAL_S1_L006_R1_001.fastq.gz
File not in the working directory!
Input read2 file: /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/reads/NEAL_S1_L006_R2_001.fastq.gz
sh: /mnt/lustre/macmaneslab/macmanes/Scaff10X/src/scaff-bin/scaff-bin/scaff_fastq: No such file or directory
Error running command: /mnt/lustre/macmaneslab/macmanes/Scaff10X/src/scaff-bin/scaff-bin/scaff_fastq -name tarseq -len 10 /mnt/lustre/macmaneslab/shared/pero_genomes/Neotomodon_alstoni/genome/Neotomodon_alstoni_10x_v6a.fasta tarseq.fastq tarseq.tag > try.out

See the path for scaff_fastq? It thinks it's in /mnt/lustre/macmaneslab/macmanes/Scaff10X/src/scaff-bin/scaff-bin/, so there is an "extra" scaff-bin/ in the path. I'm guessing this is not intentional, but I can confirm that when I make that directory and copy all the executables into it, things seem to work. At least I no longer get the same seg fault.
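
For anyone hitting the same error, the workaround above can be sketched as a small shell function. This is only an illustration, assuming a POSIX shell: `link_nested_bin` is a hypothetical name and not part of Scaff10X, and it symlinks the executables rather than copying them. It works around the duplicated scaff-bin/ component, it does not fix how the path is built.

```shell
# Hypothetical helper (not part of Scaff10X): expose the executables of a
# bin directory under the nested path that the error message points at,
# e.g. .../scaff-bin/scaff-bin/scaff_fastq.
link_nested_bin() {
    # Resolve to an absolute path so the symlinks stay valid from anywhere.
    bin_dir=$(cd "$1" && pwd) || return 1
    nested="$bin_dir/$(basename "$bin_dir")"
    mkdir -p "$nested" || return 1
    for f in "$bin_dir"/*; do
        # Skip directories (including the nested one) and non-executable files.
        if [ -f "$f" ] && [ -x "$f" ]; then
            ln -sf "$f" "$nested/$(basename "$f")"
        fi
    done
}

# Example call, using the directory from the error output above:
# link_nested_bin /mnt/lustre/macmaneslab/macmanes/Scaff10X/src/scaff-bin
```

Symlinks keep the nested directory in sync if the package is rebuilt in place. Copying, as I did originally, works as well.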

zning-sanger commented 4 years ago

Thanks Matt for the information. I don't know how you installed the package. I just did a test:

$ git clone https://github.com/wtsi-hpag/Scaff10X.git
$ cd Scaff10X
$ ./install.sh

With CC= /software/gcc-4.9.2/bin/gcc set, it works on the test sample.

Let me know if you have further issues.

Thanks again,

Zemin