Open wheaton5 opened 5 years ago
Hi Haynes,
I had a quick look at your data and ran the dataset myself. It seems that your problem was due to the way you ran the pipeline. Normally 10X reads come as a number of files in a directory, and we suggest that users list them in a file such as file.dat:
q1=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S1_L001_R1_001.fastq.gz
q2=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S1_L001_R2_001.fastq.gz
q1=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S2_L001_R1_001.fastq.gz
q2=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S2_L001_R2_001.fastq.gz
q1=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S3_L001_R1_001.fastq.gz
q2=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S3_L001_R2_001.fastq.gz
q1=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S4_L001_R1_001.fastq.gz
q2=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S4_L001_R2_001.fastq.gz
You then run the code this way:
scaff10x -nodes 30 {options} -data file.dat funestus_redbean.purged.fa funestus_redbean.scaff10x.fa
You will get a few files: funestus_redbean.scaff10x.fa, funestus_redbean.scaff10x.fa.agp and funestus_redbean.scaff10x.fa.cov.
If you use -plot funestus_redbean-bclength.png, you get the barcode length distribution as well.
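The file.dat layout described above could be generated from a reads directory with something like the sketch below. It assumes one q1=/q2= entry per line and that mate files differ only by _R1_ versus _R2_ in the file name; make_dat is a hypothetical helper name, not part of Scaff10X.

```shell
# Sketch: write a scaff10x-style .dat file for a directory of paired 10X reads.
# Assumptions (not confirmed in this thread): one q1=/q2= entry per line, and
# mate files that differ only by _R1_ versus _R2_ in the file name.
# make_dat is a hypothetical helper name, not part of Scaff10X.
make_dat() (
    # The body runs in a subshell: variables stay local, exit only leaves the function.
    reads_dir=$(cd "$1" && pwd) || exit 1   # absolute path to the reads directory
    out=$2
    : > "$out"                              # start with an empty output file
    for r1 in "$reads_dir"/*_R1_*.fastq.gz; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        r2="$reads_dir/$(basename "$r1" | sed 's/_R1_/_R2_/')"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2
            exit 1
        fi
        printf 'q1=%s\nq2=%s\n' "$r1" "$r2" >> "$out"
    done
)
```

It would be called as `make_dat /full/or/relative/reads_dir file.dat`; the directory is resolved to an absolute path before the entries are written.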
Another way to run the code is the way you did:
scaff10x -nodes 30 funestus_redbean.purged.fa bamtofastq_S1_L000_R1_001.fastq.gz bamtofastq_S1_L000_R2_001.fastq.gz hybrid_redbean2.purged.scaff.fa
However, the files bamtofastq_S1_L000_R1_001.fastq.gz and bamtofastq_S1_L000_R2_001.fastq.gz first have to be processed with scaff_reads, so that the barcode tags are put at the end of the read name. It looks like you didn't do that.
As said before, please use -data file.dat.
Finally, the results of scaff10x are not good, see /lustre/scratch117/sciops/team117/hpag/zn1/project/worm
A lot of contigs have no read coverage at all. I don't know whether this is a sample issue or whether you didn't extract all the reads.
Zemin
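One rough way to check whether all reads were extracted is to count the records in each FASTQ file and confirm that the R1 and R2 files agree. This is only a sketch: it assumes the usual four lines per FASTQ record, and count_reads and same_read_count are hypothetical helper names.

```shell
# Sketch: count the records in a gzipped FASTQ file,
# assuming the usual 4 lines per read (no wrapped sequence lines).
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Succeeds only if both files of a pair hold the same number of reads.
same_read_count() {
    [ "$(count_reads "$1")" -eq "$(count_reads "$2")" ]
}
```

For example, `same_read_count bamtofastq_S1_L000_R1_001.fastq.gz bamtofastq_S1_L000_R2_001.fastq.gz || echo "pair differs"` would flag a pair whose counts do not match. It says nothing about reads that are missing from both files.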
I wanted to run it this way (with the .dat file) but got a segfault, and since the command-line usage (shown when no arguments are given) primarily shows the other way of running, I tried that next. Going back to the .dat way, I still get a segfault.
Segmentation fault
scaff10x -nodes 30 -plot barcode_length.png -data scaffinput.dat An_arab_wm_A_ref.fa
in directory /lustre/scratch118/malaria/team222/hh5/projects/analysis/assembly/arabiensis/falcon/haplomerger
Your input file:
q1=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R1_001.fastq.gz
q2=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R2_001.fastq.gz
q1=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R1_002.fastq.gz
q2=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R2_002.fastq.gz
q1=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R1_003.fastq.gz
q2=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R2_003.fastq.gz
q1=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R1_004.fastq.gz
q2=10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R2_004.fastq.gz
ls -l 10x_reads/longranger222_wgs_29692_BAdASS7880765_arabiensis_freebayes_LibraryNotSpecified_1_unknown_fc/bamtofastq_S1_L000_bamtofastq_S1_L000_R1_001.fastq.gz
ls: cannot access
You need to give the full path:
q1=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S1_L001_R1_001.fastq.gz
q2=/lustre/scratch117/sciops/team117/hpag/zn1/project/mammals/mBosTau1/reads_10x/mBosTau1_S1_L001_R2_001.fastq.gz
Zemin
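The full-path requirement can be checked before launching a run. The following is a minimal sketch, assuming the .dat file holds one q1=/q2= entry per line; check_dat is a hypothetical helper name, not a Scaff10X tool.

```shell
# Sketch: check that every q1=/q2= entry in a .dat file is a full path to an existing file.
# check_dat is a hypothetical helper name; the one-entry-per-line layout is an assumption.
check_dat() (
    # The body runs in a subshell: variables stay local, exit only leaves the function.
    status=0
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            q1=*|q2=*) path=${line#q?=} ;;   # strip the q1= or q2= prefix
            *) continue ;;                   # ignore anything that is not a read entry
        esac
        case $path in
            /*) ;;                           # starts with /, so it is a full path
            *) echo "not a full path: $path" >&2; status=1 ;;
        esac
        [ -e "$path" ] || { echo "cannot access: $path" >&2; status=1; }
    done < "$1"
    exit "$status"
)
```

Running `check_dat scaffinput.dat` returns a non-zero status and prints each offending entry on stderr if any path is relative or missing.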
I have a debug version for which you can use an existing align.dat file:
/nfs/users/nfs_z/zn1/src/Scaff10X/src/scaff10x -nodes 2 {options} -plot bclength.png -noalign /lustre/scratch117/sciops/team117/hpag/zn1/project/worm/tmp_rununik_57281/align.dat funestus_redbean.purged.fa funestus_redbean.scaff10x-t2.fa > try.out
Note: give the full path of align.dat
Zemin
Ok I'm using full paths for all inputs now
./scaffme.sh
File not in the working directory! File -data not found and please copy it to your working directory!
But it is in the current directory:
pwd
/lustre/scratch118/malaria/team222/hh5/projects/analysis/assembly/arabiensis/falcon/haplomerger
(base) haplomerger: ls scaffinput.dat
scaffinput.dat
This is what is in scaffme.sh:
cat scaffme.sh
/nfs/users/nfs_s/sm15/dev/Scaff10X-4.1/src/scaff10x -nodes 30 -plot barcode_lengtg.png -data /lustre/scratch118/malaria/team222/hh5/projects/analysis/assembly/arabiensis/falcon/haplomerger/scaffinput.dat /lustre/scratch118/malaria/team222/hh5/projects/analysis/assembly/arabiensis/falcon/haplomerger/An_arab_wm_A_ref.fa
Change the scaffme.sh file to this:
/nfs/users/nfs_s/sm15/dev/Scaff10X-4.1/src/scaff10x -nodes 30 -plot barcode_lengtg.png -data scaffinput.dat An_arab_wm_A_ref.fa An_arab_wm_A_ref-scaff10x.fa > try.out
./scaffme2.sh
(base) haplomerger: cat try.out
File not in the working directory! File -data not found and please copy it to your working directory!
(base) haplomerger: cat scaffme2.sh
/nfs/users/nfs_s/sm15/dev/Scaff10X-4.1/src/scaff10x -nodes 30 -plot barcode_lengtg.png -data scaffinput.dat An_arab_wm_A_ref.fa An_arab_wm_A_ref-scaff10x.fa > try.out
I don't know if Shane's copy has the recent "-plot" function. Please try to download and install the package yourself, or try
/nfs/users/nfs_z/zn1/src/Scaff10X/src/scaff10x
Sorry about all these problems!
Just tried /nfs/users/nfs_z/zn1/src/Scaff10X/src/scaff10x -nodes 52 -plot bclength.png -data file.dat funestus_redbean.purged.fa funestus_redbean.scaff10x-t2.fa >
and Shane's code is working!
Ok that seems to be working. Thanks for the help!
If I run without parameters, I get the usage as expected.
Then if I run with parameters:
scaff10x -nodes 30 funestus_redbean.purged.fa bamtofastq_S1_L000_R1_001.fastq.gz bamtofastq_S1_L000_R2_001.fastq.gz hybrid_redbean2.purged.scaff.fa
I get:
./scaffme2.sh: line 1: 31890 Segmentation fault
I'm on farm4 in directory /lustre/scratch118/malaria/team222/hh5/projects/analysis/assembly/arabiensis. I created the fastq files from a longranger run against a different assembly using 10x's bamtofastq, which should put the reads back in the same format they had as input to longranger.