HumanCellAtlas / ingest-central

Ingest Central is the hub repository for the ingest service
Apache License 2.0

Throw warning when spreadsheet metadata does not get converted to JSON #484

Open malloryfreeberg opened 5 years ago

malloryfreeberg commented 5 years ago

Please discuss with a member of the ingest development team before submitting a suggestion

  1. What problem does the suggested enhancement solve? Please describe.

It is dangerous when metadata in a spreadsheet does not make it into the JSON and the submitter is not made aware of this. For example, for a dataset recently ingested in production, the Publications tab was converted to JSON but not incorporated into the project.json document, because the tab was named "Project - Publication" rather than "Project - Publications", the name ingest expects.

As a result, the publication metadata was not submitted to the prod data store, and the submitter (wrangler) was not told that it had been omitted.

  2. What type of enhancement is this?

Performance, usability

  3. How much benefit do you estimate this enhancement will provide? (High, Medium, Low)

High (we shouldn't omit metadata!)

  4. Please describe a solution

I do not have a technical solution, but I would like to see some sort of warning if any metadata from the spreadsheet does not end up in the final JSON to be submitted.
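To make the request concrete, here is a minimal sketch of such a check. It is not taken from the ingest code: the function names, the shape of `sheets` (tab name mapped to a list of row dicts), and the idea of comparing cell values against the leaf values of the final JSON are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterator, List, Set


def leaf_values(node: Any) -> Iterator[str]:
    """Yield every scalar value in a nested JSON-like structure, as a string."""
    if isinstance(node, dict):
        for value in node.values():
            yield from leaf_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_values(item)
    elif node is not None:
        yield str(node)


def find_dropped_cells(
    sheets: Dict[str, List[Dict[str, Any]]], final_json: Any
) -> List[str]:
    """Return one warning per non-empty cell whose value is absent from final_json.

    Hypothetical helper: `sheets` maps a tab name to its rows, and each row
    maps a column header to a cell value.
    """
    present: Set[str] = set(leaf_values(final_json))
    messages: List[str] = []
    for tab_name, rows in sheets.items():
        for row_number, row in enumerate(rows, start=1):
            for column, value in row.items():
                if value is not None and str(value) not in present:
                    messages.append(
                        f"Tab '{tab_name}', row {row_number}, column '{column}': "
                        f"value {value!r} is missing from the final JSON"
                    )
    return messages


# Illustrative data only: the mislabelled tab never reaches the final JSON.
sheets = {
    "Project": [{"project.project_core.project_title": "Demo project"}],
    "Project - Publication": [{"project.publications.title": "A demo paper"}],
}
final_json = {"project_core": {"project_title": "Demo project"}}
dropped = find_dropped_cells(sheets, final_json)
```

A value-based comparison like this is deliberately crude (it ignores type conversions and repeated values), but it would be enough to flag a whole tab going missing.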

rdgoite commented 5 years ago

The importer, at the time of writing, works by removing any fields that don't match the information provided in the module worksheet title. This is done because worksheets are processed generically: all worksheets, whether they're for submittable or module types, are processed as if they were submittable. As a result, module JSON initially contains elements that don't really belong in it, such as the schema URL in the describedBy field.
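A rough sketch of that filtering step, to show how a mislabelled tab can vanish without any error. This is not the importer's actual code: the function names, the "Parent - Module" title convention used to derive the field name, and the example schema URL are assumptions.

```python
from typing import Any, Dict, Optional


def module_field_from_title(title: str) -> Optional[str]:
    """Derive the module field name from a worksheet title.

    Assumed convention: 'Project - Publications' -> 'publications'.
    Returns None for a title with no ' - ' separator (a submittable worksheet).
    """
    if " - " not in title:
        return None
    return title.split(" - ", 1)[1].strip().lower().replace(" ", "_")


def extract_module(generic_json: Dict[str, Any], worksheet_title: str) -> Dict[str, Any]:
    """Keep only the field named by the worksheet title; drop everything else."""
    field = module_field_from_title(worksheet_title)
    if field is None:
        # Submittable worksheet: nothing to strip.
        return dict(generic_json)
    return {key: value for key, value in generic_json.items() if key == field}


# JSON produced by processing the module worksheet as if it were submittable.
generic = {
    "describedBy": "https://schema.example.org/project",  # placeholder URL
    "publications": [{"title": "A demo paper"}],
}

kept = extract_module(generic, "Project - Publications")
lost = extract_module(generic, "Project - Publication")
```

With the expected title, `kept` holds only the `publications` field. With the misspelled title nothing matches, so `lost` is an empty dict and the metadata disappears silently, which is the behaviour this ticket asks to surface.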

justincc commented 5 years ago

This has come up before in #35.

justincc commented 5 years ago

The wrong tab name "Project - Publication" may be due to a bug in generating spreadsheet templates.

MightyAx commented 5 years ago

If the spreadsheet parser could display warnings as well as errors, we could report a warning for any data object that isn't transferred into JSON.
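One possible shape for this, as a sketch only: a report object that collects messages with a severity, so that warnings can be shown without blocking the submission. The class and method names here are hypothetical and do not come from the spreadsheet parser.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Severity(Enum):
    WARNING = "warning"
    ERROR = "error"


@dataclass
class ParseMessage:
    severity: Severity
    location: str  # e.g. a worksheet title
    text: str


@dataclass
class ParseReport:
    """Collects parser messages; only errors make the spreadsheet invalid."""

    messages: List[ParseMessage] = field(default_factory=list)

    def warn(self, location: str, text: str) -> None:
        self.messages.append(ParseMessage(Severity.WARNING, location, text))

    def error(self, location: str, text: str) -> None:
        self.messages.append(ParseMessage(Severity.ERROR, location, text))

    @property
    def warnings(self) -> List[ParseMessage]:
        return [m for m in self.messages if m.severity is Severity.WARNING]

    @property
    def errors(self) -> List[ParseMessage]:
        return [m for m in self.messages if m.severity is Severity.ERROR]

    @property
    def is_valid(self) -> bool:
        return not self.errors


report = ParseReport()
report.warn("Project - Publication", "Worksheet was not transferred into JSON.")
```

Keeping both severities in one list preserves the order in which problems were found, which could matter if the UI later groups them per tab.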

MightyAx commented 5 years ago

As of ingest-client#52 we now throw errors when metadata does not get converted to JSON. I will keep this ticket open to suggest that some errors should actually be warnings, but this should wait until the UI can display errors (and warnings) in a more meaningful way than a list at the top of the page (perhaps an error tab).