baidu-research / NCRF

Cancer metastasis detection with neural conditional random field (NCRF)
Apache License 2.0

slide/mask mismatch #6

Closed udion closed 6 years ago

udion commented 6 years ago

If I follow the testing instructions given in the README, I get the following error:

Traceback (most recent call last):
  File "wsi/bin/probs_map.py", line 163, in <module>
    main()
  File "wsi/bin/probs_map.py", line 159, in main
    run(args)
  File "wsi/bin/probs_map.py", line 115, in run
    args, cfg, flip='NONE', rotate='NONE')
  File "wsi/bin/probs_map.py", line 87, in make_dataloader
    flip=flip, rotate=rotate),
  File "/media/udion/a2c5c487-f939-4b82-a348-86b3d1bdb024/udion_home/Projects/NCRF/wsi/bin/../../wsi/data/wsi_producer.py", line 42, in __init__
    self._preprocess()
  File "/media/udion/a2c5c487-f939-4b82-a348-86b3d1bdb024/udion_home/Projects/NCRF/wsi/bin/../../wsi/data/wsi_producer.py", line 55, in _preprocess
    .format(X_slide, X_mask, Y_slide, Y_mask))
Exception: Slide/Mask dimension does not match , X_slide / X_mask : 98304 / 1536, Y_slide / Y_mask : 103936 / 2048

what's the issue?
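
For readers following along, the numbers in the message can be checked by hand. The snippet below is only a sketch of the kind of consistency check the message suggests, not a copy of `wsi_producer.py`; the function names are hypothetical.

```python
def slide_to_mask_ratios(slide_dims, mask_shape):
    """Return the (x, y) downsampling ratios between a slide and its mask."""
    x_slide, y_slide = slide_dims
    x_mask, y_mask = mask_shape
    return x_slide / x_mask, y_slide / y_mask


def ratios_match(slide_dims, mask_shape):
    """True when both axes are downsampled by the same factor."""
    ratio_x, ratio_y = slide_to_mask_ratios(slide_dims, mask_shape)
    return ratio_x == ratio_y


# Values taken from the error message: level-0 slide size and mask shape.
print(slide_to_mask_ratios((98304, 103936), (1536, 2048)))  # (64.0, 50.75)
print(ratios_match((98304, 103936), (1536, 2048)))          # False
```

The X axis is downsampled by 64 (that is, 2**6), but the Y axis by 50.75, so the mask cannot be a plain level-6 view of this slide.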

sketchplanet commented 6 years ago

Just resize the mask with the proper height-to-width ratio. The slide's level_dimensions do not keep a consistent ratio across levels.

udion commented 6 years ago

The mask is in an .npy file. Do you mean using numpy's resize? It repeats elements; won't that disturb the mask?
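
On the `numpy.resize` concern: that function works on the flattened array (repeating or truncating it), so it does not preserve the spatial layout of a mask. A minimal sketch of a nearest-neighbour alternative follows; `resize_mask_nearest` is a hypothetical helper, not part of NCRF, and whether resizing the mask is the right fix at all is a separate question.

```python
import numpy as np


def resize_mask_nearest(mask, new_shape):
    """Resize a 2-D mask to new_shape by nearest-neighbour index sampling."""
    # For each output row/column, pick the source index it maps back to.
    rows = (np.arange(new_shape[0]) * mask.shape[0]) // new_shape[0]
    cols = (np.arange(new_shape[1]) * mask.shape[1]) // new_shape[1]
    return mask[np.ix_(rows, cols)]


# Toy mask: left half background, right half tissue.
mask = np.zeros((4, 8), dtype=bool)
mask[:, 4:] = True

small = resize_mask_nearest(mask, (4, 4))  # still left/right halves
scrambled = np.resize(mask, (4, 4))        # flattened and re-cut: layout lost
```

Index sampling keeps the mask boolean and never invents intermediate values, which is why it is preferred over interpolation for label-like arrays.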

udion commented 6 years ago

I split the TIFF file and processed level 6. It works.

yil8 commented 6 years ago

@sketchplanet can you point me to the slide that caused the error, e.g. Test_XXX.tif? I'd be interested in taking a look.

udion commented 6 years ago

@yil8 awesome work.

I think it will be great to specify that, during test time

or

I personally found the second method to work, because the first method ran out of memory on a Titan X 12 GB card.

yil8 commented 6 years ago

@udion Thanks for your suggestion. By default, we first have to generate a tissue mask before obtaining the probability heat map. This tissue mask is supposed to be generated at a higher level, level 6 by default, so it should be much smaller than the raw WSI at level 0. The probability heat map is then computed from the tissue mask. I don't remember running into level_dimensions with inconsistent ratios, so I was curious which test slide has this issue.
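
To make the relation between the level-6 mask and the level-0 slide concrete, here is a small sketch of how a mask pixel could be mapped back to level-0 coordinates. It assumes each level halves the previous one, so level 6 corresponds to a factor of 2**6 = 64; the function name and the choice of the pixel centre are illustrative assumptions, not taken from the NCRF code.

```python
def mask_pixel_to_level0(x_mask, y_mask, level=6):
    """Map the centre of a mask pixel at `level` to level-0 coordinates."""
    scale = 2 ** level  # assumed downsampling factor between level 0 and `level`
    return int((x_mask + 0.5) * scale), int((y_mask + 0.5) * scale)


print(mask_pixel_to_level0(0, 0))        # (32, 32)
print(mask_pixel_to_level0(1535, 1623))  # (98272, 103904)
```

This mapping only makes sense when the mask and the slide share the same factor on both axes, which is what the failing check is guarding.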

udion commented 6 years ago

If I follow the readme as of now exactly,

test_026 gives the error as shown above.

yil8 commented 6 years ago

@udion I just ran the command again:

python NCRF/wsi/bin/tissue_mask.py /WSI_PATH/Test_026.tif /MASK_PATH/Test_026.npy

It produced a tissue mask in numpy format of shape (1536, 1624), while in your error message, your tissue mask shape seems to be (1536, 2048). I also checked the level_dimensions of Test_026.tif, which showed:

In [2]: slide = openslide.OpenSlide('Test_026.tif')

In [3]: slide.level_dimensions
Out[3]: 
((98304, 103936),
 (49152, 51968),
 (24576, 25984),
 (12288, 12992),
 (6144, 6496),
 (3072, 3248),
 (1536, 1624),
 (768, 812),
 (384, 406))

where level 6 dimension is indeed (1536, 1624). Not sure why you obtained a shape of (1536, 2048) on Test_026.tif at level 6?

sketchplanet commented 6 years ago

@yil8 The inconsistent ratio occurs in the training dataset (almost all slides). Tumor026: (97280, 220672) (48640, 110592) (24576, 55296) (12288, 27648) (6144, 13824) (3072, 7168) (1536, 3584) (1024, 2048) (512, 1024) (512, 512). Here is my solution in tissue_mask.py:

    slide = openslide.OpenSlide(args.wsi_path)
    dims_tuple =  (int(slide.level_dimensions[0][0]/(2**args.level)), int(slide.level_dimensions[0][1]/(2**args.level)))
    #print(dims_tuple)
    # note the shape of img_RGB is the transpose of slide.level_dimensions
    img_RGB = np.transpose(np.array(slide.read_region((0, 0),
                           args.level,
                           slide.level_dimensions[args.level]).convert('RGB')),
                           axes=[1, 0, 2])
    #print((img_RGB.shape))
    img_RGB = cv2.resize(img_RGB,(dims_tuple[1],dims_tuple[0]))

yil8 commented 6 years ago

@sketchplanet my level_dimensions for Tumor_026.tif are quite different from yours:

In [2]: slide = openslide.OpenSlide('./Tumor_026.tif')

In [3]: slide.level_dimensions
Out[3]: 
((97280, 220672),
 (48640, 110336),
 (24320, 55168),
 (12160, 27584),
 (6080, 13792),
 (3040, 6896),
 (1520, 3448),
 (760, 1724),
 (380, 862),
 (190, 431))

By definition, adjacent levels should differ in size by a factor of 2, as in my case but not in yours. Not sure what happened in your case?
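
The factor-of-2 rule can be checked mechanically against the two Tumor_026 listings posted in this thread. The sketch below is illustrative only: `inconsistent_levels` is a hypothetical helper, and it assumes level i should be exactly the level-0 size integer-divided by 2**i.

```python
def inconsistent_levels(level_dimensions):
    """Return the indices of levels that are not level 0 divided by 2**i."""
    width0, height0 = level_dimensions[0]
    bad = []
    for i, dims in enumerate(level_dimensions):
        expected = (width0 // 2 ** i, height0 // 2 ** i)
        if tuple(dims) != expected:
            bad.append(i)
    return bad


# Tumor_026 as reported by yil8.
dims_yil8 = [(97280, 220672), (48640, 110336), (24320, 55168), (12160, 27584),
             (6080, 13792), (3040, 6896), (1520, 3448), (760, 1724),
             (380, 862), (190, 431)]
# Tumor026 as reported by sketchplanet.
dims_sketchplanet = [(97280, 220672), (48640, 110592), (24576, 55296),
                     (12288, 27648), (6144, 13824), (3072, 7168),
                     (1536, 3584), (1024, 2048), (512, 1024), (512, 512)]

print(inconsistent_levels(dims_yil8))          # []
print(inconsistent_levels(dims_sketchplanet))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With a real slide, the same function could be fed `slide.level_dimensions` from openslide.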

udion commented 6 years ago

Maybe it has something to do with the source of the data?

I downloaded it using the Google Drive link; maybe you, @yil8, are using Baidu Pan?

yil8 commented 6 years ago

@udion I was using the data from Google Drive. BTW, there seems to be an application-free download link from GigaDB, and I just downloaded test_026.tif, which gave me the correct level_dimensions:

In [1]: import openslide

In [2]: slide = openslide.OpenSlide('./test_026.tif')

In [3]: slide.level_dimensions
Out[3]: 
((98304, 103936),
 (49152, 51968),
 (24576, 25984),
 (12288, 12992),
 (6144, 6496),
 (3072, 3248),
 (1536, 1624),
 (768, 812),
 (384, 406))

udion commented 6 years ago

@sketchplanet, @yil8 I got it to work by splitting the tif files and then processing the level 6 files (specifying --level 0). Is anything wrong with that?

yil8 commented 6 years ago

@udion maybe the splitting part is causing the trouble? I have never split the tif file and actually I don't even know how to split... Can you try to process the whole tif file at level 6?

meetps commented 6 years ago

I'm splitting the file using tiffsplit which created one tif file for each level:

tiffsplit <input_multilevel_tiff>

udion commented 6 years ago

@yil8 I downloaded the sample from the GigaDB link you shared and again ran the following commands:

python ./wsi/bin/tissue_mask.py ./sample_run/test_026.tif ./sample_run/test_026.npy

and then

python ./wsi/bin/probs_map.py ./sample_run/test_026.tif ./ckpt/resnet18_crf.ckpt ./configs/resnet18_crf.json ./sample_run/test_026_mask.npy ./sample_run/test_026_pmap.npy

I again got the following error:

Traceback (most recent call last):
  File "./wsi/bin/probs_map.py", line 163, in <module>
    main()
  File "./wsi/bin/probs_map.py", line 159, in main
    run(args)
  File "./wsi/bin/probs_map.py", line 115, in run
    args, cfg, flip='NONE', rotate='NONE')
  File "./wsi/bin/probs_map.py", line 87, in make_dataloader
    flip=flip, rotate=rotate),
  File "/media/udion/a2c5c487-f939-4b82-a348-86b3d1bdb024/udion_home/Projects/NCRF/wsi/bin/../../wsi/data/wsi_producer.py", line 42, in __init__
    self._preprocess()
  File "/media/udion/a2c5c487-f939-4b82-a348-86b3d1bdb024/udion_home/Projects/NCRF/wsi/bin/../../wsi/data/wsi_producer.py", line 55, in _preprocess
    .format(X_slide, X_mask, Y_slide, Y_mask))
Exception: Slide/Mask dimension does not match , X_slide / X_mask : 98304 / 1536, Y_slide / Y_mask : 103936 / 2048

I am using PyTorch 0.3.1 and Python 3.6, and used requirements.txt to set up the environment. Are you following the same steps as written in the README?

(Also, although I got it to run by splitting the tif file as @meetshah1995 pointed out, the output probability maps are not so great, as noted in another issue.)

suggestions?

yil8 commented 6 years ago

@udion Before you run tissue_mask.py and probs_map.py, can you simply check that the level dimensions of test_026.tif are the same as mine, without any preprocessing such as splitting?

In [1]: import openslide

In [2]: slide = openslide.OpenSlide('./test_026.tif')

In [3]: slide.level_dimensions
Out[3]: 
((98304, 103936),
 (49152, 51968),
 (24576, 25984),
 (12288, 12992),
 (6144, 6496),
 (3072, 3248),
 (1536, 1624),
 (768, 812),
 (384, 406))

It just seems to me that the level dimensions of your test_026.tif are different than mine. If that's the case, then tissue_mask.py and probs_map.py won't make any sense.

meetps commented 6 years ago

@yil8 I downloaded the test_026.tif from the GigaDB link

I get this without any preprocessing:

In [1]: import openslide

In [2]: slide = openslide.OpenSlide('./test_026.tif')

In [3]: slide.level_dimensions
Out[3]: 
((98304, 103936),
 (49152, 52224),
 (24576, 26112),
 (12288, 13312),
 (6144, 6656),
 (3072, 3584),
 (1536, 2048),
 (1024, 1024),
 (512, 512))

udion commented 6 years ago

@yil8 @meetshah1995 I got the same output as meet's with test_026.tif (from the GigaDB link).

udion commented 6 years ago

@yil8 I am using resnet18_crf, that's fine right?

yil8 commented 6 years ago

@udion @meetshah1995 I still haven't figured out why you two obtained level dimensions different from mine. By definition, adjacent OpenSlide pyramid levels should always differ by a factor of 2, and the Evaluation_FROC.py released by the Camelyon16 organizers relies on this in its code. My level dimensions (confirmed by another user) follow this rule but yours do not, so I tend to guess there may be an error on your side. Would you mind posting which openslide version you used, just as a sanity check? Unless the level_dimensions issue is fixed, I don't think any downstream analysis is meaningful.

udion commented 6 years ago

@yil8 I made a new environment (with python=3.6) and installed everything using requirements.txt (except pytorch, which I installed manually using pip install http://download.pytorch.org/whl/cu90/torch-0.3.1-cp36-cp36m-linux_x86_64.whl)

so my openslide version is 1.1.0

I am still getting the dimensions which @meetshah1995 has posted above.

yil8 commented 6 years ago

@udion @meetshah1995 inspired by another issue, #7, I'm wondering if you could also plot test_026.tif at level 6 to see if there is a white margin there as well?

In [1]: import openslide

In [2]: slide = openslide.OpenSlide('./test_026.tif')

In [3]: slide.read_region((0, 0), 6, (slide.level_dimensions[6])).show()

My plot of test_026.tif at level 6 looks like this

[Screenshot: test_026.tif at level 6]

meetps commented 6 years ago

@yil8 - Thanks for your prompt response. I tried plotting test_026.tif at level 6. My plot has a considerable white margin at the bottom.

In [1]: import openslide

In [2]: slide = openslide.OpenSlide('./test_026.tif')

In [3]: slide.read_region((0, 0), 6, (slide.level_dimensions[6])).show()

[Image: test_026.tif at level 6, with a white margin at the bottom]
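
A margin like this can also be measured rather than eyeballed. The following is a rough sketch that counts fully white rows at the bottom of an RGB array; it assumes the padding is near-white (all channels at or above a threshold of 250, an arbitrary choice), and `trailing_white_rows` is a hypothetical helper. With a real slide, the array could come from `np.array(region.convert('RGB'))`, where `region` is the image returned by `read_region`.

```python
import numpy as np


def trailing_white_rows(rgb, threshold=250):
    """Count consecutive near-white rows at the bottom of an (H, W, 3) array."""
    # A row is "white" when every channel of every pixel is >= threshold.
    white_row = (rgb >= threshold).all(axis=(1, 2))
    non_white = np.flatnonzero(~white_row)
    if non_white.size == 0:
        return int(rgb.shape[0])  # the whole image is white
    return int(rgb.shape[0] - 1 - non_white[-1])


# Toy image: 6 rows of grey "tissue" followed by 4 rows of white padding.
image = np.full((10, 4, 3), 255, dtype=np.uint8)
image[:6] = 100
print(trailing_white_rows(image))  # 4
```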

Since openslide-python is a wrapper to the original OpenSlide binary, could you post the binary version you're using?

openslide-show-properties --version

openslide-show-properties 3.4.0, using OpenSlide 3.4.0
Copyright (C) 2007-2014 Carnegie Mellon University and others

yil8 commented 6 years ago

@meetshah1995 my binary version:

openslide-show-properties --version
openslide-show-properties 3.4.1, using OpenSlide 3.4.1
Copyright (C) 2007-2015 Carnegie Mellon University and others

OpenSlide is free software: you can redistribute it and/or modify it under
the terms of the GNU Lesser General Public License, version 2.1.
<http://gnu.org/licenses/lgpl-2.1.html>

OpenSlide comes with NO WARRANTY, to the extent permitted by law.  See the
GNU Lesser General Public License for more details.

meetps commented 6 years ago

@yil8 @udion

I upgraded my OpenSlide binary to 3.4.1 and the slides have the same dimension as mentioned by @yil8:

In [1]: import openslide

In [2]: slide = openslide.OpenSlide('./test_026.tif')

In [3]: slide.level_dimensions
Out[3]: 
((98304, 103936),
 (49152, 51968),
 (24576, 25984),
 (12288, 12992),
 (6144, 6496),
 (3072, 3248),
 (1536, 1624),
 (768, 812),
 (384, 406))

yil8 commented 6 years ago

@meetshah1995 Thanks so much for spotting this potential issue! I'll add this into README.
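
Since the fix came down to the OpenSlide binary version, a guard along these lines could catch the problem early. This is only a sketch: it parses the text printed by `openslide-show-properties --version` as quoted in this thread, and the minimum of 3.4.1 is simply the version that worked here, not an official requirement. The helper names are hypothetical.

```python
import re

# Version reported as working in this thread (an assumption, not a spec).
MIN_OPENSLIDE = (3, 4, 1)


def parse_openslide_version(text):
    """Extract (major, minor, patch) from '... using OpenSlide X.Y.Z'."""
    match = re.search(r"OpenSlide (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no OpenSlide version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def is_new_enough(text, minimum=MIN_OPENSLIDE):
    """True when the reported OpenSlide version is at least `minimum`."""
    return parse_openslide_version(text) >= minimum


old = "openslide-show-properties 3.4.0, using OpenSlide 3.4.0"
new = "openslide-show-properties 3.4.1, using OpenSlide 3.4.1"
print(is_new_enough(old), is_new_enough(new))  # False True
```

From Python, the same information should be available through the openslide-python package's `openslide.__library_version__` attribute, which avoids shelling out to the command-line tool.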

udion commented 6 years ago

thanks!

wuyi1983 commented 6 years ago

The same issue happened to me. I installed OpenSlide 3.4.0 on Ubuntu using sudo apt-get install libopenslide-dev. How can I install 3.4.1 or update the current version? Thanks.

yil8 commented 6 years ago

@wuyi1983 can you go to https://openslide.org/download/ and find 3.4.1 ?