Closed: mdrnao closed this issue 2 years ago.
Further tests today!
Trying with your test FASTQ files and elongating my own file names, today I consistently get stuck at a different error message:
Running QC stats collection
+ python3 /recount-unify/log_qc/parse_logs_for_qc.py --incoming-dir links --sample-mapping /container-mounts/working/ids.tsv --intron-sums intron_counts_summed.tsv
++ cat qc_1.tsv
++ wc -l
+ num_samples_qc=1
+ num_samples_qc=0
+ [[ 0 -ne 1 ]]
+ echo 'FAILURE in pump output QC stats collection (post gene/exon unify), # QC rows (0) != # samples (1)'
FAILURE in pump output QC stats collection (post gene/exon unify), # QC rows (0) != # samples (1)
+ exit -1
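For context on what that check does: the unifier counts the lines in qc_1.tsv, subtracts the header, and compares the result to the number of samples, so a header-only qc_1.tsv yields the 0 != 1 failure. A minimal standalone sketch of the same logic (the file contents and sample count here are illustrative, not taken from the pipeline):

```shell
# Reproduce the unifier's QC row-count check with a header-only qc_1.tsv.
tmp=$(mktemp -d) && cd "$tmp"
printf 'study\tsample\tbc_auc.all_%%\n' > qc_1.tsv   # header row only, no per-sample rows
num_samples=1                                        # one sample expected (as in ids.tsv)
num_samples_qc=$(cat qc_1.tsv | wc -l)               # counts 1 line (the header)...
num_samples_qc=$((num_samples_qc - 1))               # ...minus the header: 0 data rows
if [[ $num_samples_qc -ne $num_samples ]]; then
  echo "# QC rows ($num_samples_qc) != # samples ($num_samples)"
fi
```

This mirrors the two consecutive num_samples_qc= lines in the trace: the first is the raw wc -l count, the second is after the header is subtracted.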
hi @mdrnao
In your first post, it'd be good to see the contents of the following files: /container-mounts/working/ids.tsv and /container-mounts/working/sample_metadata.tsv (you may have posted the latter already, but I need to make sure).
but yes, I suggest elongating your sample names to at least 6 characters and re-running pump with the updated names and then unifier.
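A hypothetical pre-flight check along these lines (not part of monorail; the file name matches the thread, but the short sample name "ab12" is made up for illustration) could catch short names before a run:

```shell
# Flag any sample_id in sample_metadata.tsv shorter than 6 characters.
tmp=$(mktemp -d) && cd "$tmp"
printf 'study_id\tsample_id\ntestset\tab12\n' > sample_metadata.tsv  # "ab12" is too short
short=$(awk -F'\t' 'NR>1 && length($2)<6 {print $2}' sample_metadata.tsv)
if [ -n "$short" ]; then
  echo "sample names shorter than 6 characters: $short"
fi
```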
In your last post, to clarify, you are using these FASTQs(?)
http://snaptron.cs.jhu.edu/data/temp/SRR390728_1.fastq.gz
http://snaptron.cs.jhu.edu/data/temp/SRR390728_2.fastq.gz
also, what is the exact command you are running in both posts?
thanks, Chris
Hi Chris,
Thanks so much for getting back to me, I have a feeling this might be user error!
For simplicity, I'll display everything using the fastq files you provided.
I call pump like so, which ends with success:
/bin/bash ~/<full path>/recount/monorail-external/singularity/run_recount_pump.sh \
~/<full path>/recount/recount-rs5_latest.sif \
SRR390728 \
local \
grcm38 \
20 \
~/<full path>/recount/references \
~/<full path>/recount/test_set/SRR390728_1.fastq.gz \
~/<full path>/recount/test_set/SRR390728_2.fastq.gz \
SRR390728
I then call unify with:
/bin/bash ~/<full path>/recount/monorail-external/singularity/run_recount_unify.sh \
recount-unify_1.0.9.sif \
grcm38 \
~/<full path>/recount/references \
~/<full path>/recount/test_set/unify/ \
~/<full path>/recount/test_set/pump/pump_output \
~/<full path>/recount/test_set/unify/sample_metadata.tsv \
20 \
examplefiles:110
in which the end of the nohup.out file displays:
Running QC stats collection
+ python3 /recount-unify/log_qc/parse_logs_for_qc.py --incoming-dir links --sample-mapping /container-mounts/working/ids.tsv --intron-sums intron_counts_summed.tsv
++ cat qc_1.tsv
++ wc -l
+ num_samples_qc=1
+ num_samples_qc=0
+ [[ 0 -ne 1 ]]
+ echo 'FAILURE in pump output QC stats collection (post gene/exon unify), # QC rows (0) != # samples (1)'
FAILURE in pump output QC stats collection (post gene/exon unify), # QC rows (0) != # samples (1)
+ exit -1
INFO: Cleaning up image...
my sample_metadata.tsv file looks like so:
study_id sample_id
testset SRR390728
and the ids.tsv file:
testset SRR390728 904784
I also have a patroller_find_done.tsv file:
testset links/et/testset/28/SRR390728/SRR390728_in2 0 links/et/testset/28/SRR390728/SRR390728_in2_att0 SRR390728!testset!grcm38!local.manifest SRR390728_in2_att0
and for completeness, qc_1.tsv looks like so:
study sample bc_auc.all_% bc_auc.unique_% exon_fc.all_% exon_fc.unique_% gene_fc.all_% gene_fc.unique_% intron_sum intron_sum_% star.all_mapped_reads star.all_mapped_reads2
I think this should satisfy the 6-character suggestion.
Thank you for the help, Holly
Hi @mdrnao,
Thanks for the additional information and your patience.
So in the pump command, the last argument SRR390728 should be testset to match what you're running in the unifier. That is, when running samples, Monorail needs to know at least 3 things: 1) sample ID/name, 2) study ID/name, and 3) project name and ID (project_name:project_id, aka the datasource). These need to be consistent throughout the process.
While you could try to switch the unifier to use SRR390728 as the study to match what you already did in pump, I'm not sure what will happen with both the sample and the study having the same ID, so it's probably best to re-run pump with the proper study name and not mix sample and study names.
The second thing is, when running/debugging the test dataset, it's probably best to stick with the suggested default project_name:project_id of sra:101 instead of examplefiles:110 in the unifier command.
Hi @ChristopherWilks ,
Thanks for getting back to me! I re-ran everything with the above changes and I'm afraid we have the same error.
+ echo 'Running QC stats collection'
Running QC stats collection
+ python3 /recount-unify/log_qc/parse_logs_for_qc.py --incoming-dir links --sample-mapping /container-mounts/working/ids.tsv --intron-sums intron_counts_summed.tsv
++ cat qc_1.tsv
++ wc -l
+ num_samples_qc=1
+ num_samples_qc=0
+ [[ 0 -ne 1 ]]
+ echo 'FAILURE in pump output QC stats collection (post gene/exon unify), # QC rows (0) != # samples (1)'
FAILURE in pump output QC stats collection (post gene/exon unify), # QC rows (0) != # samples (1)
+ exit -1
INFO: Cleaning up image...
Hello,
I'm trying a test set of two local samples and failing at the final steps of the Unify pipeline!
The image is pulled by
singularity pull docker://quay.io/broadsword/recount-unify:latest
on singularity-3.8.5
Last few lines of running:
[Fri Feb 4 16:23:44 2022]
Finished job 3.
22 of 24 steps (92%) done
+ sample_id_file=/container-mounts/working/ids.tsv
+ sample_original_metadata_file=/container-mounts/working/sample_metadata.tsv
+ cat /container-mounts/working/sample_metadata.tsv
+ perl -ne 'BEGIN { open(HDR,">/container-mounts/working/ids.tsv.new_header"); open(IN,"</container-mounts/working/ids.tsv"); @ids=<IN>; close(IN); chomp(@ids); %idmap=(); map { ($study,$sample,$rid)=split(/\t/,$_); $idmap{$study."|".$sample}=$rid; } @ids; } chomp; $i++; $f=$_; @f=split(/\t/,$f); $study=shift(@f); $sample=shift(@f); $f=join("\t",@f); $f="\t$f" if(scalar @f > 0); $rail_id=$idmap{"$study|$sample"}; if(!$rail_id) { if($i==1) { print HDR "rail_id\t$sample\t$study"."$f\n"; next; } print STDERR "failed to map $study\t$sample\t$f to a rail_id, terminating\n"; close(HDR); exit(-1); } print "$rail_id\t$sample\t$study"."$f\n"; END { close(HDR); }'
failed t to a rail_id, terminating
[Fri Feb 4 16:23:44 2022]
Error in rule annotate_all_sjs:
    jobid: 2
    output: junctions.bgz, junctions.bgz.tbi, junctions.sqlite, samples.tsv, lucene_indexed_numeric_types.tsv, lucene_full_standard, lucene_full_ws, samples.fields.tsv

RuleException:
CalledProcessError in line 299 of /recount-unify/Snakefile.study_jxs:
Command ' set -euo pipefail;
if [[ "True" == "True" ]]; then
rm -f jx_sqlite_import
mkfifo jx_sqlite_import
sqlite3 junctions.sqlite < /recount-unify/snaptron/deploy/snaptron_schema.sql
cat all.sjs.motifs.merged.tsv | pypy /recount-unify/annotate/annotate_sjs.py --compiled-annotations /container-mounts/ref/annotated_junctions.tsv.gz --motif-correct --compilation-id 101 | tee jx_sqlite_import | bgzip -@ 7 > junctions.bgz &
sqlite3 junctions.sqlite -cmd '.separator " "' ".import ./jx_sqlite_import intron"
sqlite3 junctions.sqlite < /recount-unify/snaptron/deploy/snaptron_schema_index.sql
else
cat all.sjs.motifs.merged.tsv | pypy /recount-unify/annotate/annotate_sjs.py --compiled-annotations /container-mounts/ref/annotated_junctions.tsv.gz --motif-correct --compilation-id 101 | bgzip -@ 7 > junctions.bgz
touch junctions.sqlite
fi
tabix -s2 -b3 -e4 junctions.bgz
if [[ "True" == "True" ]]; then
/recount-unify/scripts/join_railID_to_sample_metadata.sh /container-mounts/working/ids.tsv /container-mounts/working/sample_metadata.tsv | sort -t' ' -k1,1n > sorted_samples.tsv
cat /container-mounts/working/ids.tsv.new_header sorted_samples.tsv > samples.tsv
/recount-unify/snaptron/deploy/build_lucene_indexes.sh samples.tsv all
else
touch samples.tsv lucene_indexed_numeric_types.tsv samples.fields.tsv
mkdir -p lucene_full_standard lucene_full_ws
fi ' returned non-zero exit status 255.
File "/recount-unify/Snakefile.study_jxs", line 299, in __rule_annotate_all_sjs
File "/opt/conda/envs/recount-unify/lib/python3.9/concurrent/futures/thread.py", line 52, in run
Removing output files of failed job annotate_all_sjs since they might be corrupted:
junctions.bgz, junctions.bgz.tbi, junctions.sqlite
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Sample metadata file as:
study_id sample_id
mirytest DP9
mirytest DP8
What I did notice is that the ids.tsv.new_header file looks like the below, with a line break ("return") between sample_id and study_id:
rail_id sample_id
study_id
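One plausible way a stray "return" gets into ids.tsv.new_header is carriage returns (CRLF line endings) in sample_metadata.tsv: perl's chomp strips only \n, so the \r stays attached to the last field and the study|sample key no longer matches ids.tsv. This is only a guess at the corruption; a minimal sketch of the mismatch, with file contents mimicking the earlier testset/SRR390728 example from this thread:

```shell
# Sketch: a \r left on the metadata line breaks the study|sample -> rail_id lookup.
tmp=$(mktemp -d) && cd "$tmp"
printf 'testset\tSRR390728\t904784\n' > ids.tsv                               # study, sample, rail_id
printf 'study_id\tsample_id\r\ntestset\tSRR390728\r\n' > sample_metadata.tsv  # CRLF line endings
# Build the same "study|sample" keys the unifier's perl one-liner uses:
key_ids=$(awk -F'\t' 'NR==1{print $1 "|" $2}' ids.tsv)
key_meta=$(awk -F'\t' 'NR==2{print $1 "|" $2}' sample_metadata.tsv)           # $2 keeps the \r
if [ "$key_ids" != "$key_meta" ]; then
  echo "no rail_id found for '$(printf %s "$key_meta" | cat -v)'"
fi
```

Running cat -v (or file) on the real sample_metadata.tsv, or re-saving it with Unix line endings (e.g. via dos2unix), would confirm or rule this out.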
Run folder looks like this at the end with the sums etc populated:
all.exon_bw_count.pasted.gz
blank_exon_sums
ids.input
M023.gene_sums.snaptron_header.tsv
setup_links.run
all.exon_bw_count.pasted.gz.pre_existing
ERCC.gene_sums.tsv
ids.input.group_counters
M023.gene_sums.tsv
SIRV.gene_sums.tsv
all.exon_bw_count.pasted.gz.samples_header
ERCC.gene_sums.tsv.gz
ids.tsv
M023.gene_sums.tsv.gz
SIRV.gene_sums.tsv.gz
all.exon_counts.rejoined.tsv.gz
exon_checks.tsv
ids.tsv.new_header
patroller_find_done.tsv
sorted_samples.tsv
all.exon_counts.rejoined.tsv.gz.accession_header
exon_sums.annotation_splits.jobs
ids.tsv.num_samples_per_study.tsv
qc_1.tsv
staging
all.exon_counts.rejoined.tsv.gz.coords
exon_sums_per_study
ids.tsv.studies
qc.err
staging_jxs
all.gene_counts.rejoined.tsv.gz
exon_sums.study_splits.jobs
input_from_pump
recount-unify_latest.sif
unique.exon_bw_count.pasted.gz
all.intron_counts.rejoined.tsv.gz
gene_checks.tsv
intron_counts_summed.tsv
recount-unify.output.jxs.txt
unique.exon_bw_count.pasted.gz.pre_existing
all_logs_and_tsvs.pkl
gene_sums_per_study
intron_counts_summed.tsv.err
recount-unify.output.sums.txt
all.logs.tar.gz
gene_sums.splits.ERCC.jobs
junction_counts_per_study
recount-unify.sums.stats.json
all.sjs.motifs.merged.tsv
gene_sums.splits.M023.jobs
jx_sqlite_import
run_unify.sh
assign_compilation_ids.py.errs
gene_sums.splits.SIRV.jobs
links
sample_metadata.tsv
Other than the above observation - a little bit lost as to how to fix this. Thank you!
Hi there, quick question: what happened to this issue, did you solve it or did it just disappear? We are getting the same error and are kind of stuck, especially since I'm uncertain whether this is caused by a bug or by the students not executing the code properly.
Cheers
Hi @CallMeMisterOwl and @mdrnao
I don't think this is a bug as others have been successful at running the test sample.
Another thing to change in your command line, which I missed earlier, is the reference: it should be hg38 rather than grcm38, as SRR390728 is a human sample. It's also possible Singularity 3.8.5 is an issue; most of our runs were done on Singularity 2.6, though that's a long shot. Beyond that I'm not sure what the problem is.
As far as further troubleshooting your runs, I can't promise anything as there's no ongoing, dedicated support for monorail-external. I'm no longer at Hopkins and there's no replacement person to maintain this currently, though that may change in the future.
Hi @ChristopherWilks
So, I deleted everything, re-downloaded fresh, and tried again, carefully correcting as per your suggestions, but I'm stuck on the same error. I appreciate the attempts to help up to this point, despite you being in a new post! My group have just filled a software engineer post, so I'll hand this off to them when they arrive as a trial by fire. I'll update you if we get to a solution. We're really keen to get this working!
@CallMeMisterOwl Considering I'm stuck, I wouldn't pay much notice as to how I got around the first error! If you're still stuck and want me to - I can go through my logs to check?
All the best, Holly
Hi @mdrnao, I think that won't be necessary. We fixed it; it seems the sample_metadata.tsv was somehow corrupted. Thank you for the offer though, and good luck with your issue, hope you can solve it.
Thanks for your help @ChristopherWilks, I've fixed the problem.
It looks like alias issues when collating the QC stats files (and any alias that goes near a CIFS-mounted drive, even with the symlinks flag).
All the best, Holly
great, glad to hear it's working now, thanks for posting the follow-up @mdrnao