Updating README - Githubissues

muhanadz commented 2 years ago

Updating README.rst to include the changes to the default behavior from https://github.com/ome/omero-metadata/pull/67.

sbesson commented 2 years ago

Thanks @muhanadz. This is now staged for review at https://github.com/ome/omero-metadata/blob/0cc977f9c4d2309b8712f2ef20ad93cba68295f2/README.rst. As mentioned above, next step is to get a sign off from the team and decide on the versioning/roadmap.

pwalczysko commented 2 years ago

Tested the examples listed for the automatic column detection (Project, Dataset, Screen and Image (adding roi table)). Worked fine. Happy to test the new @muhanadz 's examples too.

sbesson commented 2 years ago

Reading the latest version of the README:

the section starts with a thorough explanation of the default behavior which is a combined of name-based detection + pandas-based detection. The addition of tables is very welcome and a great way to capture the different scenarios.
then it defines the semantics of the # header row allowing to override the default behavior by encoding the target column types in the CSV directly
finally several examples are given. Interestingly, all examples make use the # header row with a note mentioning that removing this row should result in the same outcome

At least to me, this reads like we are giving mixed messages about how users should structure their data. How do people feel about revert the paradigm and remove the # header row from the CSV snippets? This means the examples would describe the outcome of the default behavior. In addition, each example could include a note stating that the same behavior could be achieved by adding the custom # header ... row at the top of the CSV.

pwalczysko commented 2 years ago

How do people feel about revert the paradigm and remove the # header row from the CSV snippets?

The downside of this is that then we will need examples showing how the # header should look like. Unfortunately, this is not trivial and the users kept asking about it. I think a very simple example listed on the top could show the behavour with and without # header, elucidating the (rare?) situations where the # header might be wanted/needed, rather than summarily removing all the # header examples ?

will-moore commented 2 years ago

I think we should go with @sbesson suggestion: "each example could include a note stating that the same behavior could be achieved by adding the custom # header ... row at the top of the CSV". Then we don't lose that info. I'm not sure we have any use cases where we think the header is needed, except "you're not using the latest version of the omero-metadata" which is a possibility. Maybe that should be mentioned in the README?

We need to update the Populate_Metadata.py script, to take advantage of the auto header detection and/or warn if not using a version that supports it.

sbesson commented 2 years ago

I'm not sure we have any use cases where we think the header is needed, except "you're not using the latest version of the omero-metadata" which is a possibility.

It is certainly my hope that @muhanadz's work is making the manual column annotation unnecessary in the majority of the cases. But I would not discard special cases where the column types need to be specified manually. Once this PR is approved and merged, my vote is to get a 0.11.0 release out in order for both ourselves and the community to give the new functionality a try and provide feedback.

muhanadz commented 2 years ago

I followed @sbesson 's suggestion by reverting the paradigm for the examples. I do believe this read better but happy to revert the changes if necessary.

We will probably have to revisit the README and change the wording when more changes are made on the paths of the code.

pwalczysko commented 2 years ago

Couple of suggestions, but looks good otherwise.

will-moore commented 2 years ago

Final read-through, looking good - couple of comments. Also just noticed Copyright: "2018-2021, The Open Microscopy Environment" could be updated.

muhanadz commented 2 years ago

Added final changes from comments and suggestions (thank you all). Made some changes to the object-types table to address the conversation here: https://github.com/ome/omero-metadata/pull/71#discussion_r887379390

pwalczysko commented 2 years ago

Sorry for the delay. Like the text, just one missing dot https://github.com/ome/omero-metadata/pull/71/files#r902463366 spotted. After accepting the suggestion https://github.com/ome/omero-metadata/pull/71/files#r902463366 all looks good to me here.

sbesson commented 2 years ago

Thanks all for the time spent on this readme. Lots of back n forth but i think in addition to introducing the new detection, the readme now clarifies several behaviors of the plugin.

From my side the only blocker to releasing 0.11.0 is the scenario of csv files with sparse numerical data which is currently broken when using the default command (see #76). #77 proposes a fix for this regression by adding a new parameter in the HeaderDetection API. @will-moore @muhanadz could you take a look at this and see if the proposal makes sense?

ome / omero-metadata

Updating README #71