ome / omero-py

Python project containing Ice remoting code for OMERO
https://www.openmicroscopy.org/omero
GNU General Public License v2.0

Fix UnicodeDecodeError #378

Closed ehrenfeu closed 1 week ago

ehrenfeu commented 11 months ago

Log files such as Blitz-0.log contain references to files uploaded by users, so no reasonable assumption can be made about what people will put into their file names.

Parsing the logs, e.g. via "omero admin diagnostics", fails with a UnicodeDecodeError when it hits lines referring to such file names.

Force-setting the encoding to "utf-8" here fixes this problem for us.

ehrenfeu commented 11 months ago

For the record, the byte sequence in the file name that triggered this issue this time was \xe2\x80\x8e (the UTF-8 encoding of U+200E, LEFT-TO-RIGHT MARK), but we have seen this repeatedly before.
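
To see what those three bytes are, here is a minimal Python sketch. The file name is made up for illustration; only the bytes \xe2\x80\x8e come from this ticket.

```python
import unicodedata

# Hypothetical file name containing the three bytes reported above.
raw = b"image\xe2\x80\x8e.tif"

# Strict ASCII decoding rejects the first byte >= 0x80.
ascii_error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# As UTF-8, the three bytes form a single invisible character.
decoded = raw.decode("utf-8")
mark = decoded[5]
name = unicodedata.name(mark)

print(ascii_error)
print(hex(ord(mark)), name)
```

Because the character is invisible, it can easily end up in a file name through copy and paste without the user noticing.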

sbesson commented 11 months ago

Thanks @ehrenfeu, do you have by any chance the full stack trace of the original exception? For reference, https://github.com/ome/omero-py/pull/236 (released in OMERO.py 5.8.0) aimed to deal with a very similar issue. The original fix also forced the encoding to utf-8, but as we tried to cover additional scenarios, the code eventually moved to using the surrogateescape error handler to decode unsupported bytes.
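
For readers unfamiliar with that handler, here is a small sketch of what `surrogateescape` does in plain Python. It shows standard-library behaviour only, not the omero-py code itself, and the sample bytes are made up.

```python
# Made-up sample: valid UTF-8 (\xe2\x80\x8e) plus one byte (\xff) that is
# never valid in UTF-8.
raw = b"name\xe2\x80\x8e.tif and a stray \xff byte"

# With surrogateescape, undecodable bytes are mapped to lone surrogates
# (U+DC80..U+DCFF) instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", errors="surrogateescape")

has_mark = "\u200e" in text      # valid sequence decoded normally
has_escape = "\udcff" in text    # invalid byte kept as a surrogate

# Encoding with the same handler restores the original bytes exactly.
round_trip = text.encode("utf-8", errors="surrogateescape")

print(has_mark, has_escape, round_trip == raw)
```

The round trip is the reason this handler is attractive for log parsing: nothing is lost, even when the input is not valid in the chosen encoding.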

ehrenfeu commented 11 months ago

Thanks @sbesson - unfortunately I am already off now, but I might be able to produce a stack trace after returning from vacation. Feel free to remind me if you don't get one by the end of August.

sbesson commented 10 months ago

@ehrenfeu ping

ehrenfeu commented 2 weeks ago

Reminder (to myself): next time please put the reference to an image ID in this ticket as well, so it can be used to create an MRE log file.

Sigh. 😞

Will try to reproduce / check in the coming days...

ehrenfeu commented 1 week ago

Alright, I finally have an MRE log file. For the record (or unit tests...), here it is:

Blitz-0-problematic-utf8.log

Results from running omero admin diagnostics:

Thanks for the patience, I'm closing the issue!

❤️ Niko

ehrenfeu commented 1 week ago

And for the sake of completeness, as this was requested earlier, here is the full stack trace of the failing run:

Log files:  Blitz-0.log
Traceback (most recent call last):
  File "/opt/omero/server/venv3/bin/omero", line 8, in <module>
    sys.exit(main())
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/main.py", line 125, in main
    rv = omero.cli.argv()
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/cli.py", line 1785, in argv
    cli.invoke(args[1:])
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/cli.py", line 1223, in invoke
    stop = self.onecmd(line, previous_args)
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/cli.py", line 1300, in onecmd
    self.execute(line, previous_args)
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/cli.py", line 1382, in execute
    args.func(args)
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/install/windows_warning.py", line 26, in wrapper
    return func(self, *args, **kwargs)
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/plugins/prefs.py", line 79, in open_and_close_config
    return func(*args, **kwargs)
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/plugins/admin.py", line 1445, in diagnostics
    parse_logs()
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/plugins/admin.py", line 1399, in parse_logs
    self._exists(old_div(log_dir, x))
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero/cli.py", line 1121, in _exists
    for l in p.lines(errors="surrogateescape"):
  File "/opt/omero/server/venv3/lib64/python3.6/site-packages/omero_ext/path.py", line 938, in lines
    return f.readlines()
  File "/usr/lib64/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 199: ordinal not in range(128)
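
The traceback ends in encodings/ascii.py. This suggests the log was opened with an ASCII default encoding (which Python 3.6 picks under a C/POSIX locale) and with strict error handling, even though errors="surrogateescape" was passed to lines(). Below is a minimal sketch of that situation outside OMERO. The helper `read_lines` and the log line are hypothetical, and the locale explanation is an assumption about the affected server.

```python
import os
import tempfile


def read_lines(path, encoding, errors="strict"):
    """Hypothetical helper: read all lines with an explicit codec and handler."""
    with open(path, "r", encoding=encoding, errors=errors) as f:
        return f.readlines()


# Write a made-up log line containing the bytes from this ticket.
fd, log_path = tempfile.mkstemp(suffix=".log")
with os.fdopen(fd, "wb") as f:
    f.write(b"INFO upload: image\xe2\x80\x8e.tif\n")

# 1) ASCII with strict errors: fails, as in the traceback above.
try:
    read_lines(log_path, "ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# 2) Forcing UTF-8, as proposed in this issue: decodes cleanly.
utf8_lines = read_lines(log_path, "utf-8")

# 3) ASCII with surrogateescape: no error, bytes kept as surrogates.
escaped_lines = read_lines(log_path, "ascii", errors="surrogateescape")

os.remove(log_path)
print(strict_failed, len(utf8_lines), len(escaped_lines))
```

Either option 2 or option 3 avoids the exception; which one fits best depends on whether the logs can be assumed to be UTF-8.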