AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/

QN target for EQUUS_CABALLUS is missing s3_key and s3_bucket #1931

Open kurtwheeler opened 4 years ago

kurtwheeler commented 4 years ago

Context

A compendia job failed because of:

2019-11-20 10:15:47,107 i-0c6846d48289f7c37 [volume: -1] data_refinery_workers.processors.utils ERROR [processor_job: 29458137] [no_retry: False]: Unhandled exception caught while running processor function _perform_imputation in pipeline:
Traceback (most recent call last):
  File "/home/user/data_refinery_workers/processors/utils.py", line 369, in run_pipeline
    last_result = processor(last_result)
  File "/home/user/data_refinery_workers/processors/create_compendia.py", line 364, in _perform_imputation
    job_context = smashing_utils.quantile_normalize(job_context, ks_check=False)
  File "/home/user/data_refinery_workers/processors/smashing_utils.py", line 492, in quantile_normalize
    qn_target_path = organism.qn_target.computedfile_set.latest().sync_from_s3()
  File "/usr/local/lib/python3.5/dist-packages/data_refinery_common/models/models.py", line 1085, in sync_from_s3
    raise ValueError('Tried to download a computed file with no s3_bucket or s3_key')
ValueError: Tried to download a computed file with no s3_bucket or s3_key

Problem or idea

That's coming from https://github.com/AlexsLemonade/refinebio/blob/dev/common/data_refinery_common/models/models.py#L1085, which means that the file never got uploaded to S3. (This is different from an older issue we had, where some files had s3_key and s3_bucket set but the file itself was missing from S3.)
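To make the failure mode concrete, here is a minimal sketch of the kind of guard that produces this error. `ComputedFileStub` and `is_uploaded` are hypothetical stand-ins written for illustration; they are not the real `ComputedFile` model, and only the error message and the `s3_bucket` / `s3_key` field names are taken from the traceback above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputedFileStub:
    """Hypothetical stand-in for a computed file record (not the real model)."""

    filename: str
    s3_bucket: Optional[str] = None
    s3_key: Optional[str] = None

    def is_uploaded(self) -> bool:
        # A record counts as uploaded only if both S3 fields are set.
        return bool(self.s3_bucket) and bool(self.s3_key)

    def sync_from_s3(self) -> str:
        # Mirrors the check in the traceback: refuse to download when the
        # record has no S3 location at all.
        if not self.is_uploaded():
            raise ValueError(
                "Tried to download a computed file with no s3_bucket or s3_key"
            )
        # The real method downloads the file; this sketch only returns the URI.
        return f"s3://{self.s3_bucket}/{self.s3_key}"
```

A record created without an upload ever succeeding would have both fields empty, so calling `sync_from_s3()` on it raises the same `ValueError` seen in the log.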

Solution or next step

I think the resolution to this is probably to just recreate the QN target and retrigger the compendia for EQUUS_CABALLUS. However, if that doesn't Just Work :tm: then additional digging may be required.

cgreene commented 4 years ago

Also make sure the compendia job failed if the upload failed.
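One way to read that suggestion is sketched below, assuming the job tracks its state in a `job_context` dict with a `success` flag. The helper name `upload_or_fail`, the `failure_reason` key, and the `upload` callable are all hypothetical and only illustrate the idea of marking the job as failed whenever the upload does not succeed.

```python
from typing import Any, Callable, Dict


def upload_or_fail(
    job_context: Dict[str, Any], upload: Callable[[], bool]
) -> Dict[str, Any]:
    """Run `upload` and mark the job as failed if it does not succeed."""
    try:
        uploaded = bool(upload())
        reason = "Upload reported failure."
    except Exception as exc:
        # An exception during upload is treated the same as a failed upload.
        uploaded = False
        reason = f"Upload raised {type(exc).__name__}: {exc}"

    if not uploaded:
        # Do not let the job finish as successful without a file in S3.
        job_context["success"] = False
        job_context["failure_reason"] = reason
    return job_context
```

With something like this in place, a job whose upload silently fails would end up flagged as failed instead of leaving behind a record with no s3_key or s3_bucket.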