gaynora7 closed 11 months ago
Hi, it could be that the count matrix is empty. Can you open the file /vcu_gpfs2/home/gaynora/DROP/udx_1015/Output/processed_data/aberrant_expression/v104/outrider/outrider/total_counts.Rds? What does it look like? Does it have full rows or columns with all 0s?
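Once the matrix has been exported from R to a plain table, one quick way to answer this is to list the samples whose whole column is zero. The sketch below assumes the counts are available as a list of rows with one column per sample; the function name, sample IDs, and toy values are hypothetical and not part of DROP.

```python
def all_zero_columns(counts, sample_ids):
    """Return the sample IDs whose count column contains only zeros."""
    if not counts:
        # An empty matrix holds no reads for any sample.
        return list(sample_ids)
    flagged = []
    for col, sample in enumerate(sample_ids):
        if all(row[col] == 0 for row in counts):
            flagged.append(sample)
    return flagged


# Toy matrix: 3 genes (rows) x 3 samples (columns); all IDs are made up.
toy_counts = [
    [12, 0, 7],
    [3, 0, 0],
    [0, 0, 25],
]
toy_samples = ["lab_1", "SRR_example", "lab_2"]
print(all_zero_columns(toy_counts, toy_samples))
```

If only a subset of samples is flagged, that points at sample-specific settings (strand, paired-end, genome build) rather than at the annotation as a whole.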
Hi Vicente,
Thanks for your quick response. When I opened "total_counts.Rds", it appeared to have columns with all 0s.
Also I think for some reason this issue has to do with the SRA BAM files ("SRR" files in sample annotation) I included in this run.
When I ran `snakemake aberrantExpression` only with BAM files sequenced through our lab (so no SRA BAMs)-- the pipeline ran successfully.
Unsure why the SRA BAM files seem to be throwing an error :( Maybe some kind of incompatibility from SRA Run Selector?
The SRA BAM files seem to be fine to me-- they are sorted, indexed, and I ran them through `samtools quickcheck` - it all passed.
Thank you so much for your help and consideration!
can you verify that the genome build, paired-end and strand configurations of the SRA samples are indeed the ones you indicated in the config and sample annotation?
Yes! I just checked with `salmon quant` to verify everything was correct-- and paired-end and strandedness were entered accurately in the config and sample annotation files.
weird.. can you try counting one of your SRR samples following the steps of this script: https://github.com/gagneurlab/drop/blob/master/drop/modules/aberrant-expression-pipeline/Counting/countReads.R?
the count_ranges is located under: root/processed_data/aberrant_expression/{annotation}/count_ranges.Rds
the different parameters come from either the sample annotation or the config file
Hi Vicente,
Thanks again for getting back to me. Just wanted to update-- I am getting this error when I try to run ONLY the SRA files with `snakemake aberrantExpression`:
FileNotFoundError in file /home/DROP/test/Snakefile, line 12:
File mapping is empty. Please check that all files in your sample annotation exist.
File "/home/DROP/test/Snakefile", line 12, in
I have checked my sample annotation file-- all paths to the SRA BAM files are correct. They are also intact-- they all passed `samtools quickcheck`.
can you execute `snakemake --cores 1 sampleAnnotation`?
so if you combine SRA with your samples, it does work?
Hi Vicente,
When I ran `snakemake --cores 1 sampleAnnotation` I got this error:
FileNotFoundError in file /vcu_gpfs2/home/gaynora/DROP/test_NRB/Snakefile, line 12:
File mapping is empty. Please check that all files in your sample annotation exist.
File "/vcu_gpfs2/home/gaynora/DROP/test_NRB/Snakefile", line 12, in
And this was when I was running ONLY the SRA samples. I've attached the updated sample annotation file with only SRA samples.
It seems like there's an issue where the SRA BAMs are not making it into file_mapping.csv. The BAMs from my lab are automatically added there when they are run with `snakemake`, but the SRA BAMs are not.
I tried to manually add the SRA BAM paths to file_mapping.csv, but when I reran `snakemake --cores 1 sampleAnnotation` I still got the same error.
Maybe it's an issue with the sample nomenclature that is messing up recognition of the file?
Thanks again so much!
There seems to be a space between the values of RNA_BAM_FILE and DROP_GROUP in your sample annotation. Can you please remove it and try again?
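Stray whitespace of this kind is easy to miss by eye. As a minimal sketch, assuming the sample annotation is a tab-separated text file with an `RNA_BAM_FILE` column, a short script can flag padded cells and paths that do not exist on disk. The function name, the demo sample ID, and the demo path are hypothetical; this is not how DROP itself builds its file mapping.

```python
import csv
import io
import os


def check_sample_annotation(handle, path_column="RNA_BAM_FILE"):
    """Return (line_number, column, problem) tuples for suspicious cells."""
    problems = []
    reader = csv.DictReader(handle, delimiter="\t")
    for name in reader.fieldnames or []:
        if name != name.strip():
            problems.append((1, name, "whitespace around column name"))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        for column, value in row.items():
            if column is None or value is None:
                # DictReader uses None for surplus or missing fields.
                problems.append((line_number, str(column), "wrong number of fields"))
                continue
            if value != value.strip():
                problems.append((line_number, column, "leading or trailing whitespace"))
            if column == path_column and value.strip() and not os.path.isfile(value.strip()):
                problems.append((line_number, column, "file not found"))
    return problems


# Demo row with a trailing space after a made-up BAM path.
demo = io.StringIO(
    "RNA_ID\tRNA_BAM_FILE\tDROP_GROUP\n"
    "SRR_example\t/hypothetical/path/SRR_example.bam \toutrider\n"
)
for problem in check_sample_annotation(demo):
    print(problem)
```

To check a real file, pass an open handle, for example `check_sample_annotation(open("sample_annotation.tsv", newline=""))`, where the file name is a placeholder.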
Hi Vicente,
I tried both of the above things with the SRA samples (reducing space between columns in sample annotation file, and adding sex column) and both still gave me this error when I ran `snakemake --cores 1 sampleAnnotation`:
FileNotFoundError in file /vcu_gpfs2/home/gaynora/DROP/test_NRB/Snakefile, line 12:
File mapping is empty. Please check that all files in your sample annotation exist.
File "/vcu_gpfs2/home/gaynora/DROP/test_NRB/Snakefile", line 12, in
File "/vcu_gpfs2/home/gaynora/mambaforge/envs/drop_env/lib/python3.11/site-packages/drop/config/DropConfig.py", line 50, in __init__
File "/vcu_gpfs2/home/gaynora/mambaforge/envs/drop_env/lib/python3.11/site-packages/drop/config/SampleAnnotation.py", line 29, in __init__
File "/vcu_gpfs2/home/gaynora/mambaforge/envs/drop_env/lib/python3.11/site-packages/drop/config/SampleAnnotation.py", line 108, in createSampleFileMapping
Something about the SRA BAM files is precluding DROP's ability to recognize them. I am going to try to run OUTRIDER separately from the pipeline, hopefully it will work!
that's weird. Where exactly did you download those BAM files from? I could try downloading them and testing DROP on my side.
Sure! They are from here: https://www.ncbi.nlm.nih.gov/Traces/study/?query_key=1&WebEnv=MCID_646f8aeb476027005f43ae6f&o=acc_s%3Aa
I used sra-tools `prefetch` followed by the `fasterq-dump` command to extract the fastqs. Then I aligned them to Hg38 via STAR 2.7.9, in the exact same way I aligned the samples sequenced from my lab (which worked successfully in the DROP pipeline).
Thanks so much for your time and energy, much appreciated!
Also just wanted to note: I tried using DROP with other BAMs I got from different SRA accessions than the one I linked above, and they all failed. So it most likely is an incompatibility with SRA, and not this individual publication!
UPDATE:
I was able to get `snakemake aberrantExpression` to work with my SRA samples. Unfortunately I misinterpreted the `salmon quant` results from my BAM files-- and I entered the wrong `STRAND` notation in the sample annotation file. I then looked at more examples of `salmon quant` output and realized my mistake. Whoops- but glad it works now! I also updated DROP to the `dev` branch and that helped a ton too.
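For anyone hitting the same mix-up, here is a small sketch of how a salmon library-type code might be translated into a `STRAND` value. The mapping is an assumption based on the usual conventions (unstranded codes such as `IU` map to `no`, `ISF`/`SF` to `yes`, `ISR`/`SR` to `reverse`), and the helper name is hypothetical. Please confirm the accepted `STRAND` values against the DROP documentation before relying on it.

```python
def strand_from_salmon(lib_type):
    """Translate a salmon library type such as 'ISR' into a STRAND value.

    Assumed mapping, not taken from DROP itself:
      codes ending in U  -> "no"      (unstranded)
      codes ending in SF -> "yes"     (read 1 on the feature strand)
      codes ending in SR -> "reverse" (read 1 on the opposite strand)
    """
    code = lib_type.strip().upper()
    if code.endswith("U"):
        return "no"
    if code.endswith("SF"):
        return "yes"
    if code.endswith("SR"):
        return "reverse"
    raise ValueError(f"unrecognised salmon library type: {lib_type!r}")


for example in ("IU", "ISF", "ISR"):
    print(example, "->", strand_from_salmon(example))
```

A wrong value here does not make counting fail outright, which is why the symptom in this thread was a count matrix full of zeros rather than a clear error.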
great to know! can we close the issue or is there something pending?
We can close it, thank you!
Hello,
I was attempting to run `snakemake aberrantSplicing --cores 7`
and I received the error (full error log attached):
Warning messages:
1: In DESeqDataSet(se, design = ~1, ...) : all genes have equal values for all samples. will not be able to perform differential analysis
2: In OutriderDataSet(counts) : No sampleID was specified. We will generate a generic one.
Error in colSums(cutoffPassedMatrix) : 'x' must be an array of at least two dimensions
Calls: filterExpression ... filterExp -> computeExpressedGenes -> data.table -> colSums
Execution halted
[Mon May 15 17:30:52 2023]
Error in rule AberrantExpression_pipeline_Counting_filterCounts_R:
jobid: 3
input: /home/DROP/udx_1015/Output/processed_data/aberrant_expression/v104/outrider/outrider/total_counts.Rds, /home/DROP/udx_1015/Output/processed_data/preprocess/v104/txdb.db, Scripts/AberrantExpression/pipeline/Counting/filterCounts.R
output: /home/DROP/udx_1015/Output/processed_results/aberrant_expression/v104/outrider/outrider/ods_unfitted.Rds
log: /home/DROP/udx_1015/.drop/tmp/AE/v104/outrider/filter.Rds (check log file(s) for error details)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
I am not quite sure what to do-- any suggestions? Also just a note- I am using drop 1.3.3 so I am unsure why it says "Update drop version for /home/DROP/udx_1015 to version 1.3.3" at the beginning of the error!
My sample annotation file and config.yaml are attached.
Thank you!
error.drop.5.15.23.txt test_sampleannotation.xlsx config.yaml.txt