Open pabloacera opened 7 months ago
Thanks for the report, @pabloacera. This looks like an inconsistency introduced by additional checks I added a while after I added support for the SKIP_SUMS flag. I'll take a closer look and see what the fix should be.
Alright, thanks a lot! Please let me know of any updates.
OK, @pabloacera, please try this quick-fix update to the unifier image, 1.1.2rc:
https://quay.io/repository/broadsword/recount-unify?tab=tags
Hi,
Thanks for the quick response. I got a slightly different error at the end as well:
command:
export SKIP_SUMS=1 && sudo -E /bin/bash ./singularity/run_recount_unify.sh /mnt/data/paceramateos/monorail-external-master/recount-unify_1.1.2rc.sif hg38 /mnt/data/paceramateos/monorail-external-master /mnt/data/paceramateos/monorail-external-master/unify_SMC_out/ /mnt/data/paceramateos/monorail-external-master/output /mnt/data/paceramateos/monorail-external-master/metadata.tsv 2 unify_SMC_out:101
[Mon Feb 12 06:26:59 2024]
Finished job 0.
409 of 409 steps (100%) done
Complete log: /container-mounts/working/.snakemake/log/2024-02-12T061619.075216.snakemake.log
++ fgrep 'steps (100%) done' recount-unify.output.jxs.txt
+ done='407 of 409 steps (100%) done
408 of 409 steps (100%) done
409 of 409 steps (100%) done'
+ [[ -z 407 of 409 steps (100%) done
408 of 409 steps (100%) done
409 of 409 steps (100%) done ]]
+ [[ ! -z 1 ]]
+ mkdir -p temp_jxs
+ mv junction_counts_per_study/02 junction_counts_per_study/05 junction_counts_per_study/08 junction_counts_per_study/09 junction_counts_per_study/10 junction_counts_per_study/11 junction_counts_per_study/12 junction_counts_per_study/13 junction_counts_per_study/25 junction_counts_per_study/32 junction_counts_per_study/34 junction_counts_per_study/56 junction_counts_per_study/57 junction_counts_per_study/58 junction_counts_per_study/59 junction_counts_per_study/60 junction_counts_per_study/61 junction_counts_per_study/62 junction_counts_per_study/64 junction_counts_per_study/67 junction_counts_per_study/70 junction_counts_per_study/73 junction_counts_per_study/76 junction_counts_per_study/79 junction_counts_per_study/82 junction_counts_per_study/85 temp_jxs/
+ mv junction_counts_per_study junction_counts_per_study_run_files
+ mv temp_jxs junction_counts_per_study
+ num_expected=162
++ find junction_counts_per_study -name '*.gz' -size +0c
++ wc -l
+ num_jx_files=162
+ [[ 162 -ne 162 ]]
+ cat qc_1.tsv
+ perl /recount-unify/log_qc/add_jx_stats2qc.pl samples.tsv
cat: qc_1.tsv: No such file or directory
These are the output files in the output folder, just in case it helps
(base) paceramateos@ausy3presana01:/mnt/data/paceramateos/monorail-external-master/unify_SMC_out$ ll
total 3001876
drwxr-xr-x 10 root root 12288 Feb 12 06:26 ./
drwxr-xr-x 20 root root 4096 Feb 12 06:08 ../
-rw-r--r-- 1 root root 102152995 Feb 12 06:26 all.sjs.motifs.merged.tsv
-rw-r--r-- 1 root root 0 Feb 12 06:16 assign_compilation_ids.py.errs
-rw-r--r-- 1 root root 4316278 Feb 12 06:16 blank_exon_sums
-rw-r--r-- 1 root root 648 Feb 12 06:16 ids.input
-rw-r--r-- 1 root root 405 Feb 12 06:16 ids.input.group_counters
-rw-r--r-- 1 root root 837 Feb 12 06:16 ids.tsv
-rw-r--r-- 1 root root 28 Feb 12 06:26 ids.tsv.new_header
-rw-r--r-- 1 root root 378 Feb 12 06:16 ids.tsv.num_samples_per_study.tsv
-rw-r--r-- 1 root root 324 Feb 12 06:16 ids.tsv.studies
drwxr-xr-x 28 root root 4096 Feb 12 06:16 input_from_pump/
drwxr-xr-x 28 root root 4096 Feb 12 06:26 junction_counts_per_study/
drwxr-xr-x 2 root root 32768 Feb 12 06:26 junction_counts_per_study_run_files/
-rw-r--r-- 1 root root 38522177 Feb 12 06:26 junctions.bgz
-rw-r--r-- 1 root root 240231 Feb 12 06:26 junctions.bgz.tbi
-rw-r--r-- 1 root root 287346688 Feb 12 06:26 junctions.sqlite
prw-r--r-- 1 root root 0 Feb 12 06:26 jx_sqlite_import|
-rw-r--r-- 1 root root 1130 Feb 12 06:26 jx_stats_per_sample.tsv
drwxr-xr-x 28 root root 4096 Feb 12 06:16 links/
drwxr-xr-x 2 root root 4096 Feb 12 06:26 lucene_full_standard/
drwxr-xr-x 2 root root 4096 Feb 12 06:26 lucene_full_ws/
-rw-r--r-- 1 root root 101 Feb 12 06:26 lucene_indexed_numeric_types.tsv
-rw-r--r-- 1 root root 8789 Feb 12 06:26 lucene.indexer.run
-rw-r--r-- 1 root root 668 Feb 11 09:02 metadata.tsv
-rw-r--r-- 1 root root 0 Feb 12 06:26 qc_2.tsv
-rw-r--r-- 1 root root 0 Feb 12 06:26 qc.err
-rw-r--r-- 1 root root 252098 Feb 12 06:26 recount-unify.jxs.stats.json
-rw-r--r-- 1 root root 325020 Feb 12 06:26 recount-unify.output.jxs.txt
-rw-r--r-- 1 root root 95 Feb 12 06:26 samples.fields.tsv
-rw-r--r-- 1 root root 1798 Feb 12 06:26 samples.tsv
-rw-r--r-- 1 root root 58 Feb 12 06:26 samples.tsv.inferred
-rw-r--r-- 1 root root 1533 Feb 12 06:16 setup_links.run
drwxr-xr-x 9 root root 4096 Feb 12 06:16 .snakemake/
-rw-r--r-- 1 root root 837 Feb 12 06:26 sorted_samples.tsv
-rw-r--r-- 1 root root 2521489 Feb 12 06:19 SRR11085164.all.mm
-rw-r--r-- 1 root root 43443959 Feb 12 06:19 SRR11085164.all.RR
-rw-r--r-- 1 root root 2516681 Feb 12 06:18 SRR11085164.unique.mm
-rw-r--r-- 1 root root 43443959 Feb 12 06:18 SRR11085164.unique.RR
-rw-r--r-- 1 root root 2476586 Feb 12 06:17 SRR11085167.all.mm
-rw-r--r-- 1 root root 42568172 Feb 12 06:17 SRR11085167.all.RR
-rw-r--r-- 1 root root 2472201 Feb 12 06:19 SRR11085167.unique.mm
-rw-r--r-- 1 root root 42568172 Feb 12 06:19 SRR11085167.unique.RR
-rw-r--r-- 1 root root 3990150 Feb 12 06:17 SRR11085170.all.mm
-rw-r--r-- 1 root root 61038154 Feb 12 06:17 SRR11085170.all.RR
-rw-r--r-- 1 root root 3981491 Feb 12 06:18 SRR11085170.unique.mm
-rw-r--r-- 1 root root 61038154 Feb 12 06:18 SRR11085170.unique.RR
-rw-r--r-- 1 root root 9254 Feb 12 06:26 SRR11085173.all.mm
-rw-r--r-- 1 root root 252960 Feb 12 06:26 SRR11085173.all.RR
-rw-r--r-- 1 root root 9254 Feb 12 06:26 SRR11085173.unique.mm
-rw-r--r-- 1 root root 252960 Feb 12 06:26 SRR11085173.unique.RR
-rw-r--r-- 1 root root 2478331 Feb 12 06:22 SRR11085176.all.mm
-rw-r--r-- 1 root root 42782559 Feb 12 06:22 SRR11085176.all.RR
-rw-r--r-- 1 root root 2473584 Feb 12 06:22 SRR11085176.unique.mm
-rw-r--r-- 1 root root 42782559 Feb 12 06:22 SRR11085176.unique.RR
-rw-r--r-- 1 root root 2534572 Feb 12 06:21 SRR11085179.all.mm
-rw-r--r-- 1 root root 43292606 Feb 12 06:21 SRR11085179.all.RR
-rw-r--r-- 1 root root 2530012 Feb 12 06:21 SRR11085179.unique.mm
-rw-r--r-- 1 root root 43292606 Feb 12 06:21 SRR11085179.unique.RR
-rw-r--r-- 1 root root 2489279 Feb 12 06:23 SRR11085182.all.mm
-rw-r--r-- 1 root root 43099373 Feb 12 06:23 SRR11085182.all.RR
-rw-r--r-- 1 root root 2484557 Feb 12 06:23 SRR11085182.unique.mm
-rw-r--r-- 1 root root 43099373 Feb 12 06:23 SRR11085182.unique.RR
-rw-r--r-- 1 root root 2463244 Feb 12 06:25 SRR11085185.all.mm
-rw-r--r-- 1 root root 42547589 Feb 12 06:25 SRR11085185.all.RR
-rw-r--r-- 1 root root 2458856 Feb 12 06:24 SRR11085185.unique.mm
-rw-r--r-- 1 root root 42547589 Feb 12 06:24 SRR11085185.unique.RR
-rw-r--r-- 1 root root 2280029 Feb 12 06:19 SRR12765356.all.mm
-rw-r--r-- 1 root root 40502777 Feb 12 06:19 SRR12765356.all.RR
-rw-r--r-- 1 root root 2274507 Feb 12 06:18 SRR12765356.unique.mm
-rw-r--r-- 1 root root 40502777 Feb 12 06:18 SRR12765356.unique.RR
-rw-r--r-- 1 root root 2158840 Feb 12 06:20 SRR12765361.all.mm
-rw-r--r-- 1 root root 38748336 Feb 12 06:20 SRR12765361.all.RR
-rw-r--r-- 1 root root 2153779 Feb 12 06:20 SRR12765361.unique.mm
-rw-r--r-- 1 root root 38748336 Feb 12 06:20 SRR12765361.unique.RR
-rw-r--r-- 1 root root 3125219 Feb 12 06:22 SRR13209902.all.mm
-rw-r--r-- 1 root root 50752180 Feb 12 06:22 SRR13209902.all.RR
-rw-r--r-- 1 root root 3120832 Feb 12 06:22 SRR13209902.unique.mm
-rw-r--r-- 1 root root 50752180 Feb 12 06:22 SRR13209902.unique.RR
-rw-r--r-- 1 root root 3273711 Feb 12 06:20 SRR13209905.all.mm
-rw-r--r-- 1 root root 52575472 Feb 12 06:20 SRR13209905.all.RR
-rw-r--r-- 1 root root 3268902 Feb 12 06:20 SRR13209905.unique.mm
-rw-r--r-- 1 root root 52575472 Feb 12 06:20 SRR13209905.unique.RR
-rw-r--r-- 1 root root 3447740 Feb 12 06:23 SRR13209908.all.mm
-rw-r--r-- 1 root root 54653228 Feb 12 06:23 SRR13209908.all.RR
-rw-r--r-- 1 root root 3442726 Feb 12 06:22 SRR13209908.unique.mm
-rw-r--r-- 1 root root 54653228 Feb 12 06:22 SRR13209908.unique.RR
-rw-r--r-- 1 root root 3356915 Feb 12 06:19 SRR13209909.all.mm
-rw-r--r-- 1 root root 53503805 Feb 12 06:19 SRR13209909.all.RR
-rw-r--r-- 1 root root 3352033 Feb 12 06:20 SRR13209909.unique.mm
-rw-r--r-- 1 root root 53503805 Feb 12 06:20 SRR13209909.unique.RR
-rw-r--r-- 1 root root 2897777 Feb 12 06:19 SRR13209910.all.mm
-rw-r--r-- 1 root root 48977776 Feb 12 06:19 SRR13209910.all.RR
-rw-r--r-- 1 root root 2893954 Feb 12 06:19 SRR13209910.unique.mm
-rw-r--r-- 1 root root 48977776 Feb 12 06:19 SRR13209910.unique.RR
-rw-r--r-- 1 root root 3207252 Feb 12 06:24 SRR13209911.all.mm
-rw-r--r-- 1 root root 52763266 Feb 12 06:24 SRR13209911.all.RR
-rw-r--r-- 1 root root 3202924 Feb 12 06:25 SRR13209911.unique.mm
-rw-r--r-- 1 root root 52763266 Feb 12 06:25 SRR13209911.unique.RR
-rw-r--r-- 1 root root 4637836 Feb 12 06:21 SRR13209912.all.mm
-rw-r--r-- 1 root root 70401722 Feb 12 06:21 SRR13209912.all.RR
-rw-r--r-- 1 root root 4632239 Feb 12 06:21 SRR13209912.unique.mm
-rw-r--r-- 1 root root 70401722 Feb 12 06:21 SRR13209912.unique.RR
-rw-r--r-- 1 root root 4495647 Feb 12 06:24 SRR13209913.all.mm
-rw-r--r-- 1 root root 68532094 Feb 12 06:24 SRR13209913.all.RR
-rw-r--r-- 1 root root 4490317 Feb 12 06:23 SRR13209913.unique.mm
-rw-r--r-- 1 root root 68532094 Feb 12 06:23 SRR13209913.unique.RR
-rw-r--r-- 1 root root 2494118 Feb 12 06:25 SRR21607157.all.mm
-rw-r--r-- 1 root root 42828910 Feb 12 06:25 SRR21607157.all.RR
-rw-r--r-- 1 root root 2489096 Feb 12 06:24 SRR21607157.unique.mm
-rw-r--r-- 1 root root 42828910 Feb 12 06:24 SRR21607157.unique.RR
-rw-r--r-- 1 root root 2228605 Feb 12 06:21 SRR21607158.all.mm
-rw-r--r-- 1 root root 39339541 Feb 12 06:21 SRR21607158.all.RR
-rw-r--r-- 1 root root 2224411 Feb 12 06:21 SRR21607158.unique.mm
-rw-r--r-- 1 root root 39339541 Feb 12 06:21 SRR21607158.unique.RR
-rw-r--r-- 1 root root 2287170 Feb 12 06:23 SRR21607159.all.mm
-rw-r--r-- 1 root root 40170022 Feb 12 06:23 SRR21607159.all.RR
-rw-r--r-- 1 root root 2282867 Feb 12 06:23 SRR21607159.unique.mm
-rw-r--r-- 1 root root 40170022 Feb 12 06:23 SRR21607159.unique.RR
-rw-r--r-- 1 root root 2105274 Feb 12 06:17 SRR21607160.all.mm
-rw-r--r-- 1 root root 38016996 Feb 12 06:17 SRR21607160.all.RR
-rw-r--r-- 1 root root 2101079 Feb 12 06:17 SRR21607160.unique.mm
-rw-r--r-- 1 root root 38016996 Feb 12 06:17 SRR21607160.unique.RR
-rw-r--r-- 1 root root 1946402 Feb 12 06:24 SRR21607161.all.mm
-rw-r--r-- 1 root root 35711495 Feb 12 06:24 SRR21607161.all.RR
-rw-r--r-- 1 root root 1942727 Feb 12 06:25 SRR21607161.unique.mm
-rw-r--r-- 1 root root 35711495 Feb 12 06:25 SRR21607161.unique.RR
-rw-r--r-- 1 root root 2024157 Feb 12 06:16 SRR21607162.all.mm
-rw-r--r-- 1 root root 36911131 Feb 12 06:16 SRR21607162.all.RR
-rw-r--r-- 1 root root 2020296 Feb 12 06:16 SRR21607162.unique.mm
-rw-r--r-- 1 root root 36911131 Feb 12 06:16 SRR21607162.unique.RR
-rw-r--r-- 1 root root 3663563 Feb 12 06:17 SRR22909625.all.mm
-rw-r--r-- 1 root root 55704549 Feb 12 06:17 SRR22909625.all.RR
-rw-r--r-- 1 root root 3657287 Feb 12 06:18 SRR22909625.unique.mm
-rw-r--r-- 1 root root 55704549 Feb 12 06:18 SRR22909625.unique.RR
-rw-r--r-- 1 root root 3382954 Feb 12 06:25 SRR22909632.all.mm
-rw-r--r-- 1 root root 52737201 Feb 12 06:25 SRR22909632.all.RR
-rw-r--r-- 1 root root 3377004 Feb 12 06:25 SRR22909632.unique.mm
-rw-r--r-- 1 root root 52737201 Feb 12 06:25 SRR22909632.unique.RR
-rw-r--r-- 1 root root 3425936 Feb 12 06:18 SRR22909634.all.mm
-rw-r--r-- 1 root root 52876009 Feb 12 06:18 SRR22909634.all.RR
-rw-r--r-- 1 root root 3419784 Feb 12 06:17 SRR22909634.unique.mm
-rw-r--r-- 1 root root 52876009 Feb 12 06:17 SRR22909634.unique.RR
drwxr-xr-x 2 root root 36864 Feb 12 06:26 staging_jxs/
A separate question: do you think I can still use the outputs? They seem to be all there. Is all the computation done? Thanks!
Yeah, that's not too surprising: the QC file(s) it's referencing are generated as part of the sums part of the workflow, which gets skipped, so I need to do some more work to make that consistent. To your second question: yes, the data files themselves are fine, though they are not named exactly as recount3 expects. You'd at least want to run these two find commands to rename them properly:
https://github.com/langmead-lab/recount-unify/blob/e00439ed677e262701fad2e011300c4b5763c545/workflow.bash#L322
That said, can you remind me what your overall goal is with running just the jxns (are you interested in recount3-ready jxns, Snaptron jxns, or something else)?
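On the "is all the computation done?" part, the trace above already shows the check the workflow runs on the junction files: it counts the non-empty `*.gz` files under `junction_counts_per_study` and compares that with the expected number (162 in that run). A minimal sketch for repeating that check by hand, assuming a hypothetical helper name `check_jx_files` and that you supply the expected count yourself:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of recount-unify): repeat the non-empty
# junction file count seen in the trace.
# usage: check_jx_files <unifier_output_dir> <expected_count>
check_jx_files() {
  local out_dir="$1" expected="$2" found
  # Same test as in the trace: non-empty *.gz files under junction_counts_per_study
  found=$(find "$out_dir/junction_counts_per_study" -name '*.gz' -size +0c | wc -l)
  found=$((found))  # normalize: some wc implementations pad the count with spaces
  if [[ "$found" -ne "$expected" ]]; then
    echo "expected $expected junction files, found $found" >&2
    return 1
  fi
  echo "all $found junction files present and non-empty"
}
```

For the run above this would be called as `check_jx_files /mnt/data/paceramateos/monorail-external-master/unify_SMC_out 162`. It only confirms the file count, not the contents of the files.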
Thanks for the commands! My goal is to generate the Snaptron jxns.
Hi @pabloacera
Just a follow-up on your last comment (sorry, I've been quite busy with other things recently): for Snaptron you should only need these files from your output above (you don't need the rest, which is only for recount3):
-rw-r--r-- 1 root root 38522177 Feb 12 06:26 junctions.bgz
-rw-r--r-- 1 root root 240231 Feb 12 06:26 junctions.bgz.tbi
-rw-r--r-- 1 root root 287346688 Feb 12 06:26 junctions.sqlite
-rw-r--r-- 1 root root 1130 Feb 12 06:26 jx_stats_per_sample.tsv
drwxr-xr-x 2 root root 4096 Feb 12 06:26 lucene_full_standard/
drwxr-xr-x 2 root root 4096 Feb 12 06:26 lucene_full_ws/
-rw-r--r-- 1 root root 101 Feb 12 06:26 lucene_indexed_numeric_types.tsv
-rw-r--r-- 1 root root 95 Feb 12 06:26 samples.fields.tsv
-rw-r--r-- 1 root root 1798 Feb 12 06:26 samples.tsv
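If it helps, the files listed above can be copied into a separate directory in one go. This is only a sketch: the function name `collect_snaptron_files` and the destination directory are hypothetical, and the file list is taken directly from the listing above rather than from any official Snaptron tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of recount-unify or Snaptron): copy only the
# Snaptron-relevant unifier outputs into a separate directory.
# usage: collect_snaptron_files <unifier_output_dir> <destination_dir>
collect_snaptron_files() {
  local src="$1" dest="$2" name missing=0
  # Names taken from the listing above; the two lucene_* entries are directories
  local wanted=(
    junctions.bgz
    junctions.bgz.tbi
    junctions.sqlite
    jx_stats_per_sample.tsv
    lucene_full_standard
    lucene_full_ws
    lucene_indexed_numeric_types.tsv
    samples.fields.tsv
    samples.tsv
  )
  mkdir -p "$dest" || return 1
  for name in "${wanted[@]}"; do
    if [[ -e "$src/$name" ]]; then
      cp -r "$src/$name" "$dest/" || return 1
    else
      # Report every missing entry instead of stopping at the first one
      echo "missing: $src/$name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A call would look like `collect_snaptron_files /path/to/unify_SMC_out /path/to/snaptron_input` (both paths are placeholders). The function returns non-zero if any of the listed entries is absent, so a partial copy does not go unnoticed.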
Hi,
I am running the unifier with junction counts only. It seems that everything goes fine, and all the Snakemake jobs run without a problem, but right at the end there is an error. I can see all the outputs in the folder, but I am not sure whether I am missing something or whether the issue only pops up in the clean-up part. Here is the command I used and the error. Thanks a lot!!
export SKIP_SUMS=1 && sudo -E /bin/bash ./singularity/run_recount_unify.sh /mnt/data/paceramateos/monorail-external-master/recount-unify_1.1.1.sif hg38 /mnt/data/paceramateos/monorail-external-master /mnt/data/paceramateos/monorail-external-master/test_SMC_output /mnt/data/paceramateos/monorail-external-master/output /mnt/data/paceramateos/monorail-external-master/metadata.tsv 2 test_SMC_output:101