Closed davidepisu closed 6 years ago
Normally, the generate-plots mode should create a file in the summary folder, so something is wrong there. It is really odd: don't you get any errors while running the generate-plots mode?
Can you provide the config.yaml file? I'm thinking maybe your datatype value is wrong. Is it SingleCell instead of singleCell?
I can't attach the file here. I copied the settings from here: https://github.com/Hoohm/dropSeqPipe/wiki/Create-config-files
Anyway this is my config file:
Samples:
    MLW15:
        fraction: 0.001
        expected_cells: 2000
GENOMEREF: /SSD/ref/genome.fa
REFFLAT: /SSD/ref/annotation.refFlat
RRNAINTERVALS: /SSD/ref/genome.rRNA.intervals
METAREF: /SSD/ref/STAR_INDEX_NO_GTF/
GTF: /SSD/ref/annotation.gtf
SPECIES:
Ok, so datatype has to be either bulk or singleCell. And it is case sensitive. I will put some checks in.
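For illustration, a check of this kind could look like the sketch below. It is not the check that was added to dropSeqPipe; the function name `check_data_type` and the constant `ALLOWED_DATA_TYPES` are made up for this example, and the value is passed in directly because the exact config key is not shown in this thread.

```python
# Hypothetical sketch: validate a case-sensitive data type value.
ALLOWED_DATA_TYPES = ("bulk", "singleCell")


def check_data_type(value):
    """Return value unchanged if it is valid, otherwise raise ValueError."""
    if value in ALLOWED_DATA_TYPES:
        return value
    # If only the letter case differs, point at the intended spelling.
    if isinstance(value, str):
        for allowed in ALLOWED_DATA_TYPES:
            if value.lower() == allowed.lower():
                raise ValueError(
                    f"data type {value!r} has the wrong case; "
                    f"did you mean {allowed!r}?"
                )
    raise ValueError(
        f"data type must be one of {ALLOWED_DATA_TYPES}, got {value!r}"
    )
```

With this sketch, `check_data_type("SingleCell")` raises a `ValueError` that suggests `singleCell`, instead of letting the pipeline run and fail later.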
Ok, I can try running the pipeline on another sample, setting singleCell instead of SingleCell in the config file.
Added a check for Data_type value. Please let me know if that fixed the issue.
Still getting the error at the fastqc step...
/programs/FastQC-0.11.5/ MLW4_R1.fastq.gz MLW4_R2.fastq.gz -t 2 -o logs --extract
/bin/bash: /programs/FastQC-0.11.5/: Is a directory
[Sat Oct 14 23:39:40 2017] Error in job fastqc while creating output file logs/MLW4_R1_fastqc.html.
[Sat Oct 14 23:39:40 2017] RuleException:
CalledProcessError in line 23 of /programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/fastqc.snake:
Command '/programs/FastQC-0.11.5/ MLW4_R1.fastq.gz MLW4_R2.fastq.gz -t 2 -o logs --extract' returned non-zero exit status 126.
File "/programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/fastqc.snake", line 23, in rule_fastqc
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 55, in run
[Sat Oct 14 23:39:40 2017] Will exit after finishing currently running jobs.
[Sat Oct 14 23:39:40 2017] Exiting because a job execution failed. Look above for error message
Traceback (most recent call last):
File "/programs/dropSeqPipe/bin/dropSeqPipe", line 11, in
Pipeline has been updated to 0.24
Oh, I see now. Your fastqc path is wrong. You probably used something like:
/path/to/fastqcFOLDER
You should have:
/path/to/fastqc
where fastqc is the executable itself.
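The log above shows the shell answering "Is a directory" with exit status 126 when the configured path points at the FastQC folder. A minimal sketch of a check that would catch this before the job starts is shown below; `check_executable` is a hypothetical helper for this example and is not part of dropSeqPipe.

```python
import os


def check_executable(path):
    """Return path if it is an executable file, otherwise raise an error."""
    if os.path.isdir(path):
        # This is the situation from the log: a folder was given.
        raise IsADirectoryError(
            f"{path!r} is a directory; point to the executable inside it"
        )
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path!r} does not exist")
    if not os.access(path, os.X_OK):
        raise PermissionError(f"{path!r} is not executable")
    return path
```

Calling it on a folder such as `/programs/FastQC-0.11.5/` would raise `IsADirectoryError` instead of failing inside the fastqc rule.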
Oh ok, now I get the following error:
Mode is generate-plots
Plotting knee plots
Error in file(con, "r") : cannot open the connection
Calls: yaml.load_file -> yaml.load -> paste -> readLines -> file
In addition: Warning message:
In file(con, "r") :
cannot open file '/SSD/MLW4config.yaml': No such file or directory
Execution halted
Traceback (most recent call last):
File "/programs/dropSeqPipe/bin/dropSeqPipe", line 11, in
Hello,
I know there is some error handling to do, but this one is actually pretty straightforward:
cannot open file '/SSD/MLW4config.yaml': No such file or directory
This means you forgot the slash at the end of your -f arg.
You should use -f /SSD/MLW4/
instead of -f /SSD/MLW4
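The path in the error message, /SSD/MLW4config.yaml, is what you get if the folder and the file name are glued together as plain strings. That is an inference from the message, not something confirmed from the pipeline source. The sketch below, with hypothetical function names, shows the difference between concatenation and a proper path join:

```python
import posixpath


def naive_config_path(folder):
    # Plain concatenation: only correct when folder ends with "/".
    return folder + "config.yaml"


def joined_config_path(folder):
    # posixpath.join inserts the separator only when it is missing.
    return posixpath.join(folder, "config.yaml")


print(naive_config_path("/SSD/MLW4"))    # /SSD/MLW4config.yaml
print(joined_config_path("/SSD/MLW4"))   # /SSD/MLW4/config.yaml
print(joined_config_path("/SSD/MLW4/"))  # /SSD/MLW4/config.yaml
```

With a join, the -f argument would work with or without the trailing slash.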
Gotcha, I think the problems are arising from a bad configuration file anyway. Now I get the following:
Mode is generate-plots
Plotting knee plots
Warning message:
In readLines(input, encoding = "UTF-8") :
incomplete final line found on '/SSD/MLW10/config.yaml'
Warning message:
Removed 1425492 rows containing missing values (geom_point).
Plotting base stats
Loading required package: magrittr
Warning message:
In readLines(input, encoding = "UTF-8") :
incomplete final line found on '/SSD/MLW10/config.yaml'
Error in mmm < each : comparison of these types is not implemented
Calls: plotRNAMetrics ... Reduce -> f -> rbind_gtable -> compare_unit -> unit -> comp
Execution halted
Traceback (most recent call last):
File "/programs/dropSeqPipe/bin/dropSeqPipe", line 11, in
My config.yaml is as follows:
Samples:
    MLW10:
        fraction: 0.001
        expected_cells: 2000
GENOMEREF: /SSD/ref/genome.fa
REFFLAT: /SSD/ref/annotation.refFlat
RRNAINTERVALS: /SSD/ref/genome.rRNA.intervals
METAREF: /SSD/ref/STAR_INDEX_NO_GTF/
GTF: /SSD/ref/annotation.gtf
SPECIES:
So I don't get which lines I'm missing...
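On the "incomplete final line" message: R's readLines prints it when a file does not end with a newline character, so no config lines are missing. In the log it is reported as a warning, not as the error that halts execution. A minimal sketch for checking and fixing the trailing newline follows; both function names are hypothetical.

```python
def ends_with_newline(path):
    """Return True if the file is empty or its last byte is a newline."""
    with open(path, "rb") as handle:
        data = handle.read()
    return data == b"" or data.endswith(b"\n")


def ensure_trailing_newline(path):
    """Append a newline to the file if its last line is unterminated."""
    if not ends_with_newline(path):
        with open(path, "ab") as handle:
            handle.write(b"\n")
```

Running `ensure_trailing_newline("/SSD/MLW10/config.yaml")` once would be enough to silence that warning; it does nothing if the file already ends with a newline.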
@davidepisu, the issue should be resolved thanks to @duyck. Did it fix it for you?
Hello @davidepisu, could you test it out on the new version and tell me if it's fixed?
No response so I'll close the issue.
Got this error while generating the Expression Matrix:
[Sun Sep 10 01:02:04 2017] Finished job 1.
[Sun Sep 10 01:02:04 2017] 4 of 5 steps (80%) done
[Sun Sep 10 01:02:04 2017]
[Sun Sep 10 01:02:04 2017] localrule all:
    input: logs/MLW12_hist_out_cell.txt
    log: logs/Dropseq_post_align.log
    jobid: 0
[Sun Sep 10 01:02:04 2017]
[Sun Sep 10 01:02:04 2017] Finished job 0.
[Sun Sep 10 01:02:04 2017] 5 of 5 steps (100%) done
Mode is generate-plots
Generating multiqc report
[INFO ] multiqc : This is MultiQC v1.2
[INFO ] multiqc : Template : default
[INFO ] multiqc : Searching '/SSD/MLW12/logs'
[INFO ] multiqc : Searching '/SSD/MLW12/summary'
Searching 62 files.. [####################################] 100%
[INFO ] star : Found 2 reports
[INFO ] fastqc : Found 2 reports
[INFO ] multiqc : Compressing plot data
[INFO ] multiqc : Report : MLW12/multiqc_report.html
[INFO ] multiqc : Data : MLW12/multiqc_data
[INFO ] multiqc : MultiQC complete
Extracting expression
[Sun Sep 10 01:02:43 2017] Provided cores: 20
[Sun Sep 10 01:02:43 2017] Rules claiming more threads will be scaled down.
[Sun Sep 10 01:02:43 2017] Job counts:
    count jobs
    1 all
    1 extract_expression
    1 extract_umi_per_gene
    1 gunzip
    4
[Sun Sep 10 01:02:43 2017]
[Sun Sep 10 01:02:43 2017] rule extract_umi_per_gene:
    input: MLW12_final.bam
    output: logs/MLW12_umi_per_gene.tsv
    jobid: 1
    wildcards: sample=MLW12
[Sun Sep 10 01:02:43 2017]
[Sun Sep 10 01:02:43 2017] /programs/Drop-seq_tools-1.12/GatherMolecularBarcodeDistributionByGene I=MLW12_final.bam O=logs/MLW12_umi_per_gene.tsv CELL_BC_FILE=summary/MLW12_barcodes.csv
[Sun Sep 10 01:02:43 2017] rule extract_expression:
    input: MLW12_final.bam
    output: summary/MLW12_expression_matrix.txt.gz
    jobid: 3
    wildcards: sample=MLW12
[Sun Sep 10 01:02:43 2017]
[Sun Sep 10 01:02:43 2017] /programs/Drop-seq_tools-1.12/DigitalExpression I=MLW12_final.bam O=summary/MLW12_expression_matrix.txt.gz SUMMARY=summary/MLW12_dge.summary.txt CELL_BC_FILE=summary/MLW12_barcodes.csv MIN_BC_READ_THRESHOLD=1
[Sun Sep 10 01:02:44 EDT 2017] org.broadinstitute.dropseqrna.barnyard.DigitalExpression SUMMARY=summary/MLW12_dge.summary.txt OUTPUT=summary/MLW12_expression_matrix.txt.gz INPUT=MLW12_final.bam MIN_BC_READ_THRESHOLD=1 CELL_BC_FILE=summary/MLW12_barcodes.csv OUTPUT_READS_INSTEAD=false CELL_BARCODE_TAG=XC MOLECULAR_BARCODE_TAG=XM GENE_EXON_TAG=GE STRAND_TAG=GS EDIT_DISTANCE=1 READ_MQ=10 USE_STRAND_INFO=true RARE_UMI_FILTER_THRESHOLD=0.0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Sun Sep 10 01:02:44 EDT 2017] org.broadinstitute.dropseqrna.barnyard.GatherMolecularBarcodeDistributionByGene OUTPUT=logs/MLW12_umi_per_gene.tsv INPUT=MLW12_final.bam CELL_BC_FILE=summary/MLW12_barcodes.csv CELL_BARCODE_TAG=XC MOLECULAR_BARCODE_TAG=XM GENE_EXON_TAG=GE STRAND_TAG=GS EDIT_DISTANCE=1 READ_MQ=10 MIN_BC_READ_THRESHOLD=0 USE_STRAND_INFO=true RARE_UMI_FILTER_THRESHOLD=0.0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Sun Sep 10 01:02:44 EDT 2017] Executing as sb929@cbsumm07.tc.cornell.edu on Linux 3.10.0-229.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13; Picard version: 1.12(d3aeea7_1452606774) IntelDeflater
[Sun Sep 10 01:02:44 EDT 2017] Executing as sb929@cbsumm07.tc.cornell.edu on Linux 3.10.0-229.el7.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13; Picard version: 1.12(d3aeea7_1452606774) IntelDeflater
[Sun Sep 10 01:02:44 EDT 2017] org.broadinstitute.dropseqrna.barnyard.DigitalExpression done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=2022178816
Exception in thread "main"
[Sun Sep 10 01:02:44 EDT 2017] org.broadinstitute.dropseqrna.barnyard.GatherMolecularBarcodeDistributionByGene done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=2022178816
Exception in thread "main" htsjdk.samtools.SAMException: Error opening file: MLW12_barcodes.csv
htsjdk.samtools.SAMException: Error opening file: MLW12_barcodes.csv
Caused by: java.io.FileNotFoundException: summary/MLW12_barcodes.csv (No such file or directory)
Caused by: java.io.FileNotFoundException: summary/MLW12_barcodes.csv (No such file or directory)
[Sun Sep 10 01:02:44 2017] Error in job extract_expression while creating output file summary/MLW12_expression_matrix.txt.gz.
[Sun Sep 10 01:02:44 2017] Error in job extract_umi_per_gene while creating output file logs/MLW12_umi_per_gene.tsv.
[Sun Sep 10 01:02:44 2017] RuleException:
CalledProcessError in line 21 of /programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/extract_expression_single.snake:
Command '/programs/Drop-seq_tools-1.12/DigitalExpression I=MLW12_final.bam O=summary/MLW12_expression_matrix.txt.gz SUMMARY=summary/MLW12_dge.summary.txt CELL_BC_FILE=summary/MLW12_barcodes.csv MIN_BC_READ_THRESHOLD=1' returned non-zero exit status 1.
File "/programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/extract_expression_single.snake", line 21, in rule_extract_expression
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 55, in run
[Sun Sep 10 01:02:44 2017] RuleException:
CalledProcessError in line 34 of /programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/extract_expression_single.snake:
Command '/programs/Drop-seq_tools-1.12/GatherMolecularBarcodeDistributionByGene I=MLW12_final.bam O=logs/MLW12_umi_per_gene.tsv CELL_BC_FILE=summary/MLW12_barcodes.csv' returned non-zero exit status 1.
File "/programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/extract_expression_single.snake", line 34, in rule_extract_umi_per_gene
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 55, in run
[Sun Sep 10 01:02:44 2017] Removing output files of failed job extract_umi_per_gene since they might be corrupted:
logs/MLW12_umi_per_gene.tsv
[Sun Sep 10 01:02:44 2017] Will exit after finishing currently running jobs.
[Sun Sep 10 01:02:44 2017] Exiting because a job execution failed. Look above for error message
Traceback (most recent call last):
File "/programs/dropSeqPipe/bin/dropSeqPipe", line 11, in
load_entry_point('dropSeqPipe==0.23a0', 'console_scripts', 'dropSeqPipe')()
File "/programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/main.py", line 223, in main
shell(extract_expression_single)
File "/usr/local/lib/python3.6/site-packages/snakemake/shell.py", line 88, in new
raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'snakemake -s /programs/dropSeqPipe/lib/python3.6/site-packages/dropSeqPipe/Snakefiles/singleCell/extract_expression_single.snake --cores 20 -pT -d /SSD/MLW12 --configfile /SSD/local.yaml ' returned non-zero exit status 1.
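Both failing jobs in this trace stop on the same java.io.FileNotFoundException: summary/MLW12_barcodes.csv does not exist when DigitalExpression and GatherMolecularBarcodeDistributionByGene are started. A minimal sketch of a pre-flight check that reports missing input files before the tools are launched is given below; `missing_inputs` is a hypothetical helper, not something dropSeqPipe provides.

```python
import os


def missing_inputs(workdir, relative_paths):
    """Return the entries of relative_paths that are not files under workdir."""
    return [
        rel for rel in relative_paths
        if not os.path.isfile(os.path.join(workdir, rel))
    ]


if __name__ == "__main__":
    # File names taken from the trace above.
    needed = ["MLW12_final.bam", "summary/MLW12_barcodes.csv"]
    for rel in missing_inputs("/SSD/MLW12", needed):
        print(f"missing input: {rel}")
```

Run against the working directory from the trace, such a check would list summary/MLW12_barcodes.csv up front rather than failing inside the Java tools.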