Closed muhanadz closed 2 years ago
Thanks @muhanadz. This is now staged for review at https://github.com/ome/omero-metadata/blob/0cc977f9c4d2309b8712f2ef20ad93cba68295f2/README.rst. As mentioned above, next step is to get a sign off from the team and decide on the versioning/roadmap.
Tested the examples listed for the automatic column detection (Project, Dataset, Screen and Image (adding roi table)). Worked fine. Happy to test the new @muhanadz 's examples too.
Reading the latest version of the README:
# header
row allowing to override the default behavior by encoding the target column types in the CSV directly# header
row with a note mentioning that removing this row should result in the same outcomeAt least to me, this reads like we are giving mixed messages about how users should structure their data.
How do people feel about revert the paradigm and remove the # header
row from the CSV snippets? This means the examples would describe the outcome of the default behavior. In addition, each example could include a note stating that the same behavior could be achieved by adding the custom # header ...
row at the top of the CSV.
How do people feel about revert the paradigm and remove the # header row from the CSV snippets?
The downside of this is that then we will need examples showing how the # header
should look like. Unfortunately, this is not trivial and the users kept asking about it. I think a very simple example listed on the top could show the behavour with and without # header
, elucidating the (rare?) situations where the # header
might be wanted/needed, rather than summarily removing all the # header
examples ?
I think we should go with @sbesson suggestion: "each example could include a note stating that the same behavior could be achieved by adding the custom # header ... row at the top of the CSV". Then we don't lose that info.
I'm not sure we have any use cases where we think the header is needed, except "you're not using the latest version of the omero-metadata
" which is a possibility. Maybe that should be mentioned in the README?
We need to update the Populate_Metadata.py
script, to take advantage of the auto header detection and/or warn if not using a version that supports it.
I'm not sure we have any use cases where we think the header is needed, except "you're not using the latest version of the omero-metadata" which is a possibility.
It is certainly my hope that @muhanadz's work is making the manual column annotation unnecessary in the majority of the cases. But I would not discard special cases where the column types need to be specified manually.
Once this PR is approved and merged, my vote is to get a 0.11.0
release out in order for both ourselves and the community to give the new functionality a try and provide feedback.
I followed @sbesson 's suggestion by reverting the paradigm for the examples. I do believe this read better but happy to revert the changes if necessary.
We will probably have to revisit the README and change the wording when more changes are made on the paths of the code.
Couple of suggestions, but looks good otherwise.
Final read-through, looking good - couple of comments. Also just noticed Copyright: "2018-2021, The Open Microscopy Environment" could be updated.
Added final changes from comments and suggestions (thank you all). Made some changes to the object-types table to address the conversation here: https://github.com/ome/omero-metadata/pull/71#discussion_r887379390
Sorry for the delay. Like the text, just one missing dot https://github.com/ome/omero-metadata/pull/71/files#r902463366 spotted. After accepting the suggestion https://github.com/ome/omero-metadata/pull/71/files#r902463366 all looks good to me here.
Thanks all for the time spent on this readme. Lots of back n forth but i think in addition to introducing the new detection, the readme now clarifies several behaviors of the plugin.
From my side the only blocker to releasing 0.11.0 is the scenario of csv files with sparse numerical data which is currently broken when using the default command (see #76). #77 proposes a fix for this regression by adding a new parameter in the HeaderDetection API. @will-moore @muhanadz could you take a look at this and see if the proposal makes sense?
Updating README.rst to include the changes to the default behavior from https://github.com/ome/omero-metadata/pull/67.