langmead-lab / monorail-external

examples to run monorail externally
MIT License

Unifier failing due to missing junction files #12

Closed beeCwright closed 2 years ago

beeCwright commented 2 years ago

I'm running the recount-unifier on the provided sample data, but the pipeline is failing with the following error in the log:

[image: screenshot of the error log]

The pump ran correctly.

Images:
- quay.io/broadsword/recount-unify 1.0.9
- quay.io/benlangmead/recount-rs5 1.0.6

ids.tsv: SRP020237 SRR390728 831056

ids.input: SRP020237 SRR390728

The step it is failing on takes the following file as input. The file exists but is empty: junction_counts_per_study/SRP020237.unique.sj.merged.motifs.annotated

Any suggestions on what could be causing this? I'm using the example fastq data files.
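A quick way to confirm the symptom described above is to scan the output directory for zero-byte files. The following is only a minimal sketch: the function name `find_empty_outputs` and the `*.annotated` glob pattern are illustrative assumptions, not part of monorail or the unifier.

```python
from pathlib import Path


def find_empty_outputs(directory, pattern="*.annotated"):
    """Return sorted paths in `directory` matching `pattern` that are zero bytes.

    Hypothetical helper for illustration; not part of the unifier itself.
    """
    return sorted(
        p for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


# Example usage (directory name taken from the path quoted in this issue):
#   for path in find_empty_outputs("junction_counts_per_study"):
#       print(f"empty: {path}")
```

An empty junction file here suggests the problem is upstream (no junctions were produced), rather than in the aggregation step that reads it.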

ChristopherWilks commented 2 years ago

Hi @beeCwright, thanks for your patience. I finally got a chance to run this myself (again) and realized that there had been some major changes to the unifier image since I last tested the main README.

Anyway, the tl;dr version is that the test case was too small (only ~125 paired reads) to generate any splice junctions. That caused the 2nd part of the unifier, where junctions are aggregated, to fail, since there are 0 junctions in its input.
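Since the root cause was input size, it can help to count the reads in a FASTQ file before running the pump. This is a rough sketch, assuming a gzipped FASTQ with the standard four lines per record; `count_fastq_reads` is a hypothetical helper and not part of monorail. No minimum read count is defined in this thread, so deciding what counts as "too small" is left to the reader.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a gzipped FASTQ file.

    Assumes the standard 4-lines-per-record layout (no wrapped sequences).
    """
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


# Example usage with one of the sample files, once downloaded locally:
#   print(count_fastq_reads("SRR390728_1.fastq.gz"))
```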

I've updated the README to use a larger sample. I've just tested it with the versions of the images you're using, and it worked, since it produces a number of splice junctions.

I suggest you update to the new sample, re-run the pump and unifier, and see if that works:

http://snaptron.cs.jhu.edu/data/temp/SRR390728_1.fastq.gz
http://snaptron.cs.jhu.edu/data/temp/SRR390728_2.fastq.gz

beeCwright commented 2 years ago

Thanks! Yep, those do indeed work; it must have just been the previous sample data.