ome / omero-metadata

OMERO plugin for metadata manipulation https://www.openmicroscopy.org/omero/
GNU General Public License v2.0

Only perform header detection when --file is specified #80

Closed sbesson closed 1 year ago

sbesson commented 1 year ago

This PR attempts to fix a regression introduced in https://github.com/ome/omero-metadata/pull/67, which was primarily tested with the omero metadata populate command.

To reproduce, run a workflow composed of a table population followed by a bulk annotation -> map annotation population, for instance as described with https://omero-guides.readthedocs.io/en/latest/upload/docs/metadata.html.

With the current release of omero-metadata (0.11.1), the second step fails with an error of the following type:

$ omero metadata populate --context bulkmap --cfg simple-annotation-bulkmap-config.yml Dataset:601
Using session for import.user@localhost:4064. Idle timeout: 10 min. Current group: Demo Group
Traceback (most recent call last):
  File "/opt/omero/OMERO.venv/bin/omero", line 8, in <module>
    sys.exit(main())
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/omero/main.py", line 126, in main
    rv = omero.cli.argv()
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/omero/cli.py", line 1787, in argv
    cli.invoke(args[1:])
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/omero/cli.py", line 1225, in invoke
    stop = self.onecmd(line, previous_args)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/omero/cli.py", line 1302, in onecmd
    self.execute(line, previous_args)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/omero/cli.py", line 1384, in execute
    args.func(args)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/omero_metadata/cli.py", line 578, in populate
    first_row = pd.read_csv(args.file, nrows=1, header=None)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 605, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1442, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1735, in _make_engine
    self.handles = get_handle(
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/io/common.py", line 713, in get_handle
    ioargs = _get_filepath_or_buffer(
  File "/opt/omero/OMERO.venv/lib/python3.8/site-packages/pandas/io/common.py", line 451, in _get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'NoneType'>

With this PR included, both population steps should successfully complete.
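
As a rough illustration of the guard named in the PR title, here is a minimal sketch. It is not the plugin's actual code: `read_first_row` is a hypothetical helper, and the standard-library `csv` module stands in for the `pd.read_csv(args.file, nrows=1, header=None)` call seen in the traceback so that the example has no dependencies. The idea is simply to skip reading the first row when no file path was supplied, which is the case for the bulkmap context above.

```python
import csv
from typing import List, Optional


def read_first_row(file_path: Optional[str]) -> Optional[List[str]]:
    """Return the first row of a CSV file, or None when no file was given.

    Hypothetical helper: in the plugin, the equivalent read happens inside
    the populate command and uses pandas rather than the csv module.
    """
    if file_path is None:
        # No --file argument (e.g. --context bulkmap with only --cfg):
        # there is nothing to inspect, so header detection is skipped.
        return None
    with open(file_path, newline="") as handle:
        # next(..., None) also yields None for an empty file.
        return next(csv.reader(handle), None)
```

With a guard of this kind, a call without a file returns early instead of passing `None` to the CSV reader, while the table population step (which does supply a file) still gets its first row for header detection.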

Given the regression, I propose scheduling this for an upcoming patch release, 0.11.2 (possibly together with #79).

pwalczysko commented 1 year ago

Confirming the error with 0.11.2, see below

```
omero metadata populate --context bulkmap --cfg ~/Downloads/simple-annotation-bulkmap-config.yml --batch 100 Dataset:117
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: group1
Traceback (most recent call last):
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/bin/omero", line 10, in <module>
    sys.exit(main())
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/omero/main.py", line 126, in main
    rv = omero.cli.argv()
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/omero/cli.py", line 1787, in argv
    cli.invoke(args[1:])
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/omero/cli.py", line 1225, in invoke
    stop = self.onecmd(line, previous_args)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/omero/cli.py", line 1302, in onecmd
    self.execute(line, previous_args)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/omero/cli.py", line 1384, in execute
    args.func(args)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/omero_metadata/cli.py", line 578, in populate
    first_row = pd.read_csv(args.file, nrows=1, header=None)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 605, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1442, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/io/parsers/readers.py", line 1735, in _make_engine
    self.handles = get_handle(
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/io/common.py", line 713, in get_handle
    ioargs = _get_filepath_or_buffer(
  File "/Users/pwalczysko/opt/anaconda3/envs/cli/lib/python3.8/site-packages/pandas/io/common.py", line 451, in _get_filepath_or_buffer
    raise ValueError(msg)
ValueError: Invalid file path or buffer object type: <class 'NoneType'>
```

pwalczysko commented 1 year ago

With this PR, I get

omero metadata populate --context bulkmap --cfg ~/Downloads/simple-annotation-bulkmap-config.yml --batch 100 Dataset:117
Using session for user-1@localhost:4064. Idle timeout: 10 min. Current group: group1
INFO:omero_metadata.populate:Created/linked 100 MapAnnotations (total 100)
INFO:omero_metadata.populate:Created/linked 65 MapAnnotations (total 165)

and indeed, the MapAnnotations are created as expected:

[Screenshot 2023-01-05 at 15.12.35]

LGTM

sbesson commented 1 year ago

Pushed another commit with the changelog entry if we are happy with an immediate 0.11.2 @jburel @joshmoore

sbesson commented 1 year ago

@jburel do you have any timeline for making a release of this plugin (and/or would you like me to do it)? I think it would be valuable to upgrade the version of the plugin deployed on prod114 so that we can test the whole annotation workflow with the new pandas functionality on the next studies. /cc @dominikl

jburel commented 1 year ago

@sbesson your suggestion to deploy the plugin on prod114 makes sense. If you can take care of the release, that would be great.