AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

Dataset request SRP051309 #2190

Open jaclyn-taroni opened 4 years ago

jaclyn-taroni commented 4 years ago

Context

A user requested SRP051309

Problem or idea

SRP051309 is a human RNA-seq dataset comprising 2 samples. No downloadable samples are available from refine.bio.

Solution or next step

First step - what is the failure reason?

arielsvn commented 4 years ago

Looks like Salmon succeeded for both samples, but we haven't executed tximport on the experiment.

FYI: clicking on a sample accession code displays a page with all jobs that were executed on that sample and their failure reasons.

jaclyn-taroni commented 4 years ago

Note to self / other data science team members: the job list is below the sample details table, collapsed under Debug Information.

Looking here: https://www.refine.bio/samples/SRR1737731

There are failures 5 months after the most recent instance of success:

2229363 | SALMON (32768) | 1 | no |   | 04/19/2019 15:18 | (no start_time) (no end_time)
Sample has a good computed file, it must have been processed, so it doesn't need to be downloaded! Aborting!

5 months later


4660124 | SALMON (12288) | 0 | no | v1.26.10-hotfix | 09/11/2019 20:35 | 09/11/2019 20:37 (8 minutes)
Shell call to salmon failed because:
### salmon (mapping-based) v0.13.1
### [ program ] => salmon
### [ command ] => quant
### [ libType ] => { A }
### [ index ] => { /home/user/data_store/TRANSCRIPTOME_INDEX/HOMO_SAPIENS/short }
### [ mates1 ] => { /tmp/alpha }
### [ mates2 ] => { /tmp/beta }
### [ threads ] => { 16 }
### [ output ] => { /home/user/data_store/processor_job_4660124/SRR1737731_output/ }
### [ seqBias ] => { }
### [ dumpEq ] => { }
### [ writeUnmappedNames ] => { }
Logs will be written to /home/user/data_store/processor_job_4660124/SRR1737731_output/logs
[2019-09-12 00:43:03.441] [jointLog] [info] Fragment incompatibility prior below threshold. Incompatible fragments will be ignored.
[2019-09-12 00:43:03.441] [jointLog] [warning] NOTE: It appears you are running salmon without the `--validateMappings` option. Mapping validation can generally improve both the sensitivity and specificity of mapping, with only a moderate increase in use of computational resources. Mapping validation is planned to become a default option (i.e. turned on by default) in the next release of salmon. Unless there is a specific reason to do this (e.g. testing on clean simulated data), `--validateMappings` is generally recommended.
[2019-09-12 00:43:03.441] [jointLog] [info] parsing read library format
[2019-09-12 00:43:03.441] [jointLog] [info] There is 1 library.
[2019-09-12 00:43:03.532] [jointLog] [info] Loading Quasi index
[2019-09-12 00:43:03.532] [jointLog] [info] Loading 32-bit quasi index
[2019-09-12 00:43:03.532] [stderrLog] [info] Loading Suffix Array
[2019-09-12 00:43:04.722] [stderrLog] [info] Loading Transcript Info
[2019-09-12 00:43:05.068] [stderrLog] [info] Loading Rank-Select Bit Array
[2019-09-12 00:43:05.281] [stderrLog] [info] There were 189440 set bits in the bit array
[2019-09-12 00:43:05.479] [stderrLog] [info] Computing transcript lengths
[2019-09-12 00:43:05.479] [stderrLog] [info] Waiting to finish loading hash
[2019-09-12 00:43:15.546] [stderrLog] [info] Done loading index
[2019-09-12 00:43:15.546] [jointLog] [info] done
[2019-09-12 00:43:15.546] [jointLog] [info] Index contained 189440 targets
[2019-09-12 00:43:15.927] [jointLog] [warning] salmon was only able to assign 0 fragments to transcripts in the index, but the minimum number of required assigned fragments (--minAssignedFrags) was 10. This could be indicative of a mismatch between the reference and sample, or a very bad sample. You can change the --minAssignedFrags parameter to force salmon to quantify with fewer assigned fragments (must have at least 1).

Could this be related to reprocessing using a later version of Salmon?
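
For anyone triaging similar failures: the key line in the log above is the final warning, where salmon reports 0 assigned fragments against a `--minAssignedFrags` threshold of 10. Below is a rough sketch of pulling those two numbers out of a job's failure reason so that samples can be compared. The function name `parse_assigned_fragments` is hypothetical (it is not part of refine.bio or salmon), and the pattern is based only on the wording of the message quoted in this issue; other salmon versions may phrase it differently.

```python
import re

# Pattern built from the warning text quoted in this issue (salmon v0.13.1).
# Assumption: the wording is the same in the logs being scanned.
ASSIGNED_RE = re.compile(
    r"salmon was only able to assign (\d+) fragments to transcripts in the index, "
    r"but the minimum number of required assigned fragments "
    r"\(--minAssignedFrags\) was (\d+)"
)


def parse_assigned_fragments(log_text):
    """Return (assigned, required) from a salmon log, or None if the warning is absent.

    Hypothetical helper for illustration only.
    """
    # Collapse runs of whitespace so wrapped or double-spaced log text still matches.
    normalized = " ".join(log_text.split())
    match = ASSIGNED_RE.search(normalized)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# The warning line from job 4660124, with the irregular spacing seen in the raw log.
sample_log = (
    "[2019-09-12 00:43:15.927] [jointLog] [warning] salmon was only able to  "
    "assign 0 fragments to transcripts in the index, but the minimum number  "
    "of required assigned fragments (--minAssignedFrags) was 10."
)

print(parse_assigned_fragments(sample_log))  # (0, 10)
```

A result of `(0, 10)` for a sample that previously quantified successfully hints that the inputs or index differ between the two runs, rather than that the sample itself is bad, though that still needs to be confirmed.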