bgilbert / anonymize-slide

Delete the label from a whole-slide image
GNU General Public License v2.0
57 stars 45 forks source link

Python3 #5

Open r3m0chop opened 4 years ago

jetic83 commented 4 years ago

When using this code with an mrxs file, I get following error:

C:\>python "anonymize-slide_py3.py" "test.mrxs" anonymize-slide_py3.py:285: DeprecationWarning: This method will be removed in future versions. Use 'parser.read_file()' instead. self._dat.readfp(fh) test.mrxs: startswith first arg must be bytes or a tuple of bytes, not str

The same slide test.mrxs is anonymized correctly with python 2 and the original anonymizing code.

Does mrxs, svs and ndpi work for you?

r3m0chop commented 4 years ago

Thank you very much for your feedback @jetic83 ! Unfortunately, I have only had access to SVS files to test so far, and should have emphasized that my testing was therefore limited as such.

I will look into this issue. At first glance, it does seem to require but a single line change at: https://github.com/bgilbert/anonymize-slide/blob/759eecc713add621766e3cc2e22a76d9b57e5738/anonymize-slide.py#L257 ...but I would want to further test as well since there might be subsequent changes required after all.

In the meantime, please feel free to add or suggest any modifications that you find helpful or necessary.

r3m0chop commented 4 years ago

Test data found at http://openslide.cs.cmu.edu/download/openslide-testdata/ and I have replicated your reported issue @jetic83 . Working on it now...

UPDATE: looks like each of the MRXS file handles need to be wrapped in an io.TextIOWrapper(), as per https://stackoverflow.com/questions/55118890/python3-configparser-startswith-first-arg-must-be-bytes-or-a-tuple-of-bytes-not/55119695#55119695

r3m0chop commented 4 years ago

@jetic83 - I have made a first pass update of the Python3 support for the MRXS format as well, in the above bb091b4 commit. It has only been tested with the Mirax2-Fluorescence-1 dataset, but I wanted to get this to you sooner rather than later in order to get your initial feedback. Thank you very much in advance.

jetic83 commented 4 years ago

Great, thx, I will check it out. In the meanwhile, is NDPI tested as well?

r3m0chop commented 4 years ago

While I haven't done anything specifically with NDPI, but if it is using do_hamamatsu_ndpi() (https://github.com/bgilbert/anonymize-slide/blob/master/anonymize-slide.py#L492-L505), there doesn't appear to be much of anything to do there at all. That said, it does indeed still need a test set thrown at it.

markemus commented 4 years ago

There are test images on the openslide website here: https://openslide.org/formats/ventana/ see the bottom of the page. You can navigate to the other formats for their test images.

Also wanted to let you know that I just submitted a similar PR, so we should probably coordinate our work. This update is definitely needed especially now that py2 is obsolete.

Tomatenbiss commented 3 years ago

@r3m0chop, @jetic83, I tested this script (just like the script of @markemus) with a couple of slides from my institute. I am testing the results with QuPath and 3DHistech Case Viewer where you can access the label images. In case of a successful anonymization the label image cannot be found and the macro image is cropped so the label is not visible anymore (in the original version of the script) or totally removed (in your python3 scripts).

When I use the script on .ndpi files the anonymization works fine.

The script did not work with .mrsx files (old format). An Error is encountered and the slides still contain the label. This is the output (anonymized slide names):

??????.mrxs: File contains no section headers.
file: Slidedat.ini', line: 1
'[GENERAL]\n'

However, in the next days I will have access to .mrxs files in the new format and will report the results.