Available spacing and downsampling factors of input TCGA files

yann-rdgz commented 4 months ago

Hello! I tried to run prediction on TCGA cohorts and I get this warning:

/workspace/env/hooknet/lib/python3.9/site-packages/wholeslidedata/image/wholeslideimage.py:78: UserWarning: spacing 0.5 outside margin (0.3%) for [0.2277, 0.9108, 3.6432, 14.5728], returning closest spacing: 0.2277

As a result the algorithm won't work properly, because it runs at mpp ~0.25 instead of 0.5.
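To illustrate what the warning means, here is a minimal sketch of a "closest spacing within a margin" lookup. It is not the wholeslidedata implementation: the function name `pick_spacing` and the interpretation of the margin as a relative tolerance of 0.3 are assumptions made for illustration only.

```python
import warnings


def pick_spacing(available, target, margin=0.3):
    """Return the available spacing closest to `target`.

    `margin` is treated as a relative tolerance (an assumption for this
    sketch; the library's exact rule may differ). A warning is issued when
    even the closest spacing falls outside that tolerance.
    """
    closest = min(available, key=lambda s: abs(s - target))
    if abs(closest - target) > margin * target:
        warnings.warn(
            f"spacing {target} outside margin for {available}, "
            f"returning closest spacing: {closest}"
        )
    return closest


# Spacings reported in the warning for the original .svs file
svs_spacings = [0.2277, 0.9108, 3.6432, 14.5728]
chosen = pick_spacing(svs_spacings, 0.5)
print(chosen)
```

With these spacings, neither 0.2277 nor 0.9108 is within the assumed tolerance of 0.5, so the lookup falls back to 0.2277, which is the behaviour the warning describes.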

I tested the prediction on the same slide as the one provided in the example, but using the original .svs file from TCGA-LUSC:

example slide: TCGA-18-3406-01Z-00-DX1.tif
slide tested: TCGA-18-3406-01Z-00-DX1.8D07F006-425C-4724-BBB3-5BA099401234.svs

Using the original .svs slide, I got this error:

  File "/workspace/pathology-hooknet-tls/hooknettls/__main__.py", line 8, in <module>
    objects = build_config(config_reader.read()["default"])
  File "/workspace/env/hooknet/lib/python3.9/site-packages/dicfg/factory.py", line 124, in build_config
    return _ObjectFactory(deepcopy(config)).build_config()
  File "/workspace/env/hooknet/lib/python3.9/site-packages/dicfg/factory.py", line 26, in build_config
    return self._build(self._configuration)
  File "/workspace/env/hooknet/lib/python3.9/functools.py", line 938, in _method
    return method.__get__(obj, cls)(*args, **kwargs)
  File "/workspace/env/hooknet/lib/python3.9/site-packages/dicfg/factory.py", line 38, in _build_dict
    config[key] = self._build_object(value)
  File "/workspace/env/hooknet/lib/python3.9/site-packages/dicfg/factory.py", line 67, in _build_object
    return attribute(*args, **kwargs)
  File "/workspace/env/hooknet/lib/python3.9/site-packages/wholeslidedata/iterators/patchiterator.py", line 46, in create_patch_iterator
    commander = commander_class(
  File "/workspace/env/hooknet/lib/python3.9/site-packages/wholeslidedata/buffer/patchcommander.py", line 41, in __init__
    wsi = WholeSlideImage(image_path, backend=backend)
  File "/workspace/env/hooknet/lib/python3.9/site-packages/wholeslidedata/image/wholeslideimage.py", line 35, in __init__
    self._backend = get_backend(backend)(path=self._path)
  File "/workspace/env/hooknet/lib/python3.9/site-packages/wholeslidedata/interoperability/asap/backend.py", line 15, in __init__
    raise ValueError(f"cant open image {path}")
ValueError: cant open image /opt/storage/nfs/raw/TCGA_BLCA/histology/raw/parafine/d09680b0-88fa-4814-99a7-c034d7fc43d5/TCGA-2F-A9KO-01Z-00-DX1.195576CF-B739-4BD9-B15B-4A70AE287D3E.svs

So I changed the backend from asap to openslide to be able to open the file, and now I get the spacing warning shown above!

I compared the metadata of your .tif with the original .svs, and they seem to have different downsampling factors, with level 0 at a different mpp (spacing):

TCGA-18-3406-01Z-00-DX1.tif:

mpp level 0: ~0.5?
downsampling factors: (1.0, 2.0, 4.000091157702826, 8.000182315405652, 16.002821490498217, 32.005642980996434, 64.03464895027463, 128.16295479217007)

TCGA-18-3406-01Z-00-DX1.8D07F006-425C-4724-BBB3-5BA099401234.svs

mpp level 0: 0.2277
downsampling factors: (1.0, 4.000209951974013, 16.003504577559966, 32.01767357112908)
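For reference, the spacing available at each pyramid level can be estimated as the level-0 mpp multiplied by that level's downsampling factor. The sketch below applies this to the two sets of values listed above. The helper names `level_spacings` and `has_spacing`, the 5% tolerance, and the level-0 mpp of 0.5 for the .tif (reported above only as "~0.5?") are assumptions. With OpenSlide, the inputs would typically come from the `openslide.mpp-x` property and `level_downsamples`.

```python
def level_spacings(mpp_level0, downsampling_factors):
    """Estimate the spacing (um/px) of every level from the level-0 mpp."""
    return [mpp_level0 * factor for factor in downsampling_factors]


def has_spacing(spacings, target, rel_tol=0.05):
    """True if some level is within `rel_tol` (assumed 5%) of `target`."""
    return any(abs(s - target) <= rel_tol * target for s in spacings)


# Values copied from the metadata above; 0.5 for the .tif is an assumption.
tif = level_spacings(
    0.5,
    (1.0, 2.0, 4.000091157702826, 8.000182315405652, 16.002821490498217,
     32.005642980996434, 64.03464895027463, 128.16295479217007),
)
svs = level_spacings(
    0.2277,
    (1.0, 4.000209951974013, 16.003504577559966, 32.01767357112908),
)

for name, spacings in (("tif", tif), ("svs", svs)):
    print(name, [round(s, 4) for s in spacings],
          "0.5 available:", has_spacing(spacings, 0.5),
          "2.0 available:", has_spacing(spacings, 2.0))
```

Under these assumptions the .tif exposes levels near both 0.5 and 2.0, while the .svs has no level near either value.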

So my question is: to be able to predict, did you convert all the TCGA (BLCA, KIRC, LUSC) input .svs slides to .tif files with different downsampling factors, so that mpp=0.5 is available?

Thanks a lot for your help! 🙏

martvanrijthoven commented 4 months ago

Dear Yann,

Yes, indeed, we resampled all the slides such that ~0.5 µm/px and ~2.0 µm/px become available, which are the mpps the algorithm requires for accurate predictions. Unfortunately, I haven't yet developed an on-the-fly resampling feature. However, I am actively working on this and hope to have it available shortly. In the interim, these specific mpps need to be available within the slides. I recommend using tools like pyvips or ASAP to convert the slides to TIFF format, resampling them so that the required mpps are included. Please let me know if I can be of any assistance here.
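As a rough illustration of the pyvips route, here is a minimal sketch that resamples level 0 to ~0.5 µm/px and writes a tiled pyramidal TIFF. It is not the conversion script used for the published slides: the function names, tile size, JPEG quality, and white background are assumptions, and it assumes a libvips build that can read .svs files. Since a TIFF pyramid halves the resolution at each level, levels at ~1.0 and ~2.0 µm/px should follow from a 0.5 µm/px base.

```python
def resample_factor(source_mpp, target_mpp=0.5):
    """Scale factor for pixel dimensions; values below 1 shrink the image."""
    return source_mpp / target_mpp


def convert_to_pyramidal_tiff(src_path, dst_path, source_mpp, target_mpp=0.5):
    """Resample a slide to `target_mpp` and save it as a pyramidal TIFF."""
    # Imported lazily so resample_factor() is usable without pyvips installed.
    import pyvips

    image = pyvips.Image.new_from_file(src_path)  # loads the base level
    if image.hasalpha():
        # Drop the alpha channel (JPEG cannot store it); white is assumed.
        image = image.flatten(background=[255, 255, 255])
    image = image.resize(resample_factor(source_mpp, target_mpp))

    # TIFF resolution is given in pixels per millimetre: 1000 um / (um/px).
    pixels_per_mm = 1000.0 / target_mpp
    image.tiffsave(
        dst_path,
        tile=True,
        tile_width=512,
        tile_height=512,
        pyramid=True,
        bigtiff=True,
        compression="jpeg",
        Q=80,
        xres=pixels_per_mm,
        yres=pixels_per_mm,
    )


# For the .svs above (level-0 mpp 0.2277), the image is scaled by ~0.455.
print(resample_factor(0.2277))
```

A call would look like `convert_to_pyramidal_tiff("slide.svs", "slide.tif", source_mpp=0.2277)`, where both paths are placeholders. It is worth checking the spacings of the output file afterwards, since how the resolution tags are read depends on the backend.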

Best wishes, Mart

yann-rdgz commented 4 months ago

Thank you for the clarification, it helps a lot! Thank you for the support!

DIAGNijmegen / pathology-hooknet-tls, issue #3