HumanCellAtlas / metadata-schema

This repo is for the metadata schemas associated with the HCA
Apache License 2.0

log_file.json schema for analysis_process.json #228

Closed by daniwelter 5 years ago

daniwelter commented 6 years ago

This is an action item (AI) identified during the analysis schema breakout at the March DCP meeting in Hinxton.

  1. Which schema needs to be changed? analysis_process.json, log_file.json

  2. What field(s) in that schema need to be changed/added? Add a new log_file.json schema and reference it from analysis_process.json to allow green box to add log files to the analysis bundle.

  3. What should the change/addition be? New schema and updated fields.

  4. Why is the change requested? Fields are currently referred to in strings, which is not consistent with the file schema approach.

malloryfreeberg commented 6 years ago

@kbergin @samanehsan @dshiga Can you weigh in on whether you guys need a log_file.json for the pilot launch? If so, we will prioritize this with our pilot launch changes. If not, we will lower the priority for this change.

dshiga commented 6 years ago

No, I think we should probably just tar up the log files we want to include in the bundle and submit that tar file as an output of the analysis, which wouldn't require a schema change.
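
For illustration, a minimal sketch of the tar approach using only Python's standard `tarfile` module. The function names (`bundle_logs`, `list_logs`) and the file names in the usage note are hypothetical, not taken from the pipeline code; how the resulting archive is registered as an analysis output is left out.

```python
import tarfile
from pathlib import Path


def bundle_logs(log_paths, archive_path):
    """Pack the given log files into one gzip-compressed tar archive.

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for log in map(Path, log_paths):
            # Store each file under its base name so the archive carries
            # no machine-specific directory structure.
            tar.add(log, arcname=log.name)
    return archive_path


def list_logs(archive_path):
    """Return the sorted member names stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

Something like `bundle_logs(["stdout.log", "stderr.log"], "logs.tar.gz")` would then yield a single file to submit alongside the other outputs. Because members are stored by base name, two logs with the same file name in different directories would produce duplicate entries, so a real version would need to disambiguate them.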

dshiga commented 6 years ago

That said, I think the log_out and log_err fields in the analysis json will be redundant if we do it that way.

malloryfreeberg commented 5 years ago

I have not heard that this is required anymore from the pipelines team. As of now, log files are output just like other analysis files, and there are no additional metadata fields needed to describe log files. If this ever changes in the future, a new ticket can be opened.