Open malloryfreeberg opened 6 years ago
Please clarify.
@justincc issue filled out. @gabsie can provide more specific information on formatting when needed.
I think we discussed this to be part of the second data recruitment call. Shall we discuss this tomorrow at the meeting? @morrisonnorman @justincc
Sure can do. First data recruitment is the one that got us 14 datasets and 2nd is going to start imminently?
Yes
In discussion with Tony today this may be regarded as not actually critical for GA. Please let's discuss further if you disagree.
Again, data recruitment is starting in the new year. Being able to send out improved spreadsheets has a significant impact on our contributors' and wranglers' ability to contribute and wrangle data.
A slick process to generate the improved styled spreadsheets isn't needed
A process to generate the improved styled spreadsheets is
I think this was already implemented by Dani and Simon, and it was really quick to do. It might be in another ticket, but I think it was done, and I now need to review it when they send me the result.
I just ran the spreadsheet template generator from the master branch in ingest-client, and @gabsie's suggested changes are not incorporated. I don't see an open PR or ~branch~ (maybe it's local_spreadsheet_builder?) for them, either.
Although I agree that updating this is not a hard requirement for GA, I am strongly advocating for getting this done as soon as possible given we already de-prioritized it for the cbeta. If we are going to continue to improve the data contribution process, we need to have some feedback on the new spreadsheet style as the previous style was found to be confusing and needing much improvement.
As of today, there are 2 groups that need spreadsheets ASAP, with more on the way as we hear back from contributors after the holiday break.
I believe it will not take a lot of developer time to incorporate Gab's changes. We need things like changing the order of spreadsheet rows, adding guidelines (similar to how examples are already added), improved coloring, etc. I'm happy to take a stab at the updates, given Dani is out until Monday and I don't know when Simon is back.
Talking with @malloryfreeberg, it sounds like this could be chiefly or even solely a metadata team task rather than one for the rest of ingest dev. Mallory is going to bring this up with @simonjupp and @daniwelter in sprint planning next Tues 8th Jan.
Remaining important issues:
.text or .ontology or .ontology_label fields in the ontology schema. I think we want the actual field values here.
.text field is required if the field is used. Required field annotation should depend on whether the field itself is required, not based on the .text field being required.
Seems most of the major features still needed are ontology handling-related...
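To make the "required" rule above concrete, here is a minimal sketch of how the annotation could be derived. It is not the ingest-client implementation: the schema fragments, the `organ_ontology` reference name, and the `column_requirements` helper are all hypothetical and heavily simplified. It only illustrates the idea that a `.text` column is marked required when the parent field itself is required, not merely because the ontology module lists `text` as required.

```python
# Hypothetical, simplified schema fragments for illustration only;
# these are NOT the real HCA metadata schemas.
PARENT_SCHEMA = {
    "required": ["biomaterial_id"],
    "properties": {
        "biomaterial_id": {"type": "string"},
        "organ": {"$ref": "organ_ontology"},  # optional ontology field
    },
}

ONTOLOGY_SCHEMAS = {
    "organ_ontology": {
        "required": ["text"],  # text is required *if* the field is used
        "properties": {
            "text": {"type": "string"},
            "ontology": {"type": "string"},
            "ontology_label": {"type": "string"},
        },
    },
}


def column_requirements(parent, modules):
    """Return {column_name: is_required} for a spreadsheet tab.

    Assumed rule: a sub-field column such as 'organ.text' is required
    only when the parent field is required in the parent schema AND the
    module marks that sub-field as required.
    """
    required_in_parent = set(parent.get("required", []))
    columns = {}
    for name, spec in parent["properties"].items():
        field_required = name in required_in_parent
        ref = spec.get("$ref")
        if ref is None:
            # Plain field: required status comes straight from the parent.
            columns[name] = field_required
            continue
        module = modules[ref]
        module_required = set(module.get("required", []))
        for sub in module["properties"]:
            columns[f"{name}.{sub}"] = field_required and sub in module_required
    return columns


if __name__ == "__main__":
    for column, required in column_requirements(PARENT_SCHEMA, ONTOLOGY_SCHEMAS).items():
        print(column, "required" if required else "optional")
```

With this sketch, `organ.text` comes out optional because `organ` is not in the parent's required list; adding `organ` to that list would make `organ.text` required while leaving `organ.ontology` and `organ.ontology_label` optional.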
Points 1-3 are now done. Points 4 + 5 require patch schema changes.
I think points 4 and 5 are going to be needed for the metadata TSV too, to unblock the beta-2 phase. DataBiosphere/azul#784
Currently being tracked by HumanCellAtlas/metadata-schema#899
@simonjupp If you can tell me what the change needs to be, we can probably get those patch updates in dev quickly.
What problem does the suggested enhancement solve? Please describe.
The current version of the template metadata spreadsheet is not user-friendly.
What type of enhancement is this?
Performance, usability
How will this enhancement benefit wranglers and/or end-users?
Data contributors will have better direction when filling in a metadata spreadsheet.
How complex to implement do you estimate this enhancement will be? (High, Medium, Low)
Low
How much benefit do you estimate this enhancement will provide? (High, Medium, Low)
Medium
How urgent is this and is there a specific date it needs to be done by? (High, Medium, Low)
Medium (not needed for the beta, but needed by end of calendar year)
Describe your preferred solution
Update the template spreadsheet generator script to include the features/formats supplied by @gabsie. An example of the new format is attached in HCA_metadata_template.xlsx.
Describe alternatives you've considered
None
Additional context
Example new formatting: HCA_metadata_template.xlsx