Hi @XinyangHan - sharing my answer publicly as well for others to see.
Regarding this point, we do not extract HIPT features with batch sizes greater than 1 (which you may have been doing). Rather, we patch and extract features from $4096^2$ px images with a batch size of 1, reshaping each $4096^2$ px image into $256 \times 256 \times 256 \times 3$ (effectively a minibatch of 256 images of size $256^2$ px). From my experimentation in extracting features from HIPT in this manner, the computing time should be slightly slower than, but similar to, 256-sized feature extraction. As a sanity check: TCGA-3C-AALK-01Z-00-DX1.4E6EB156-BB19-410F-878F-FC0EA7BD0B53 with 4K patching yields ~117 4K patches and takes around 190 seconds (see the provided segmentation below).
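The reshape described above can be sketched as follows. This is a minimal illustration in NumPy, not the code from the HIPT repository: the function name `tile_region` and the row-major tile ordering are assumptions made for the example.

```python
import numpy as np


def tile_region(region: np.ndarray, patch: int = 256) -> np.ndarray:
    """Split an (H, W, C) region into an (N, patch, patch, C) batch.

    Hypothetical helper: tiles are ordered row-major (left to right,
    then top to bottom), which is an assumption of this sketch.
    """
    h, w, c = region.shape
    if h % patch or w % patch:
        raise ValueError("region height and width must be multiples of patch")
    rows, cols = h // patch, w // patch
    # Expose the tile grid as separate axes: (rows, patch, cols, patch, C).
    tiles = region.reshape(rows, patch, cols, patch, c)
    # Bring the two grid axes to the front: (rows, cols, patch, patch, C).
    tiles = tiles.transpose(0, 2, 1, 3, 4)
    # Flatten the grid into a single batch axis.
    return tiles.reshape(rows * cols, patch, patch, c)


# One 4096 x 4096 RGB region becomes a minibatch of 256 patches of 256 x 256 px.
region = np.zeros((4096, 4096, 3), dtype=np.uint8)
batch = tile_region(region)
print(batch.shape)  # (256, 256, 256, 3)
```

With this layout the data loader still yields one 4K region at a time (batch size of 1), while the patch-level encoder sees 256 patches per forward pass.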
One limitation of this work is that 4K patching can be difficult for some slides using the `four_pt` contour function in CLAM, which was developed for $256^2$ px images. Thus, certain cohorts (as described in the README) exclude WSIs. To make HIPT comparable with splits used by other papers, I would suggest modifying the `four_pt` contour function to be less conservative, and making sure that it includes at least one 4K region per slide. For example, a biopsy slide (like the one shown below) may not be patched correctly and may not have any detected 4K patches using the current `four_pt` contour function (there are many slides in TCGA that are just biopsy fragments and have very little tissue content). Using the exact splits that I evaluate is also OK, but if you are also evaluating on your own splits or a custom dataset, I would try to relax the contour function.
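One way to picture "relaxing" a contour check is sketched below. This is not CLAM's implementation: `point_in_polygon`, `four_point_hits`, `select_regions`, the `min_hits` and `shift` parameters, and the fallback rule are all hypothetical, written in plain Python only to illustrate the idea of lowering the acceptance threshold and guaranteeing at least one 4K region per slide.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: is pt inside the polygon (a list of vertices)?"""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def four_point_hits(top_left: Point, size: int, polygon: Sequence[Point],
                    shift: float = 0.25) -> int:
    """Count how many of four probe points around the region center lie in tissue."""
    cx, cy = top_left[0] + size / 2, top_left[1] + size / 2
    d = size * shift  # probe offset from the center (assumed value)
    probes = [(cx - d, cy - d), (cx + d, cy - d), (cx - d, cy + d), (cx + d, cy + d)]
    return sum(point_in_polygon(p, polygon) for p in probes)


def select_regions(candidates: Sequence[Point], size: int,
                   polygon: Sequence[Point], min_hits: int = 1) -> List[Point]:
    """Keep regions with at least min_hits probes in tissue.

    If nothing passes, fall back to the single best-scoring candidate so that
    the slide still contributes one region (it may contain little tissue).
    """
    scored = [(four_point_hits(tl, size, polygon), tl) for tl in candidates]
    kept = [tl for hits, tl in scored if hits >= min_hits]
    if not kept and scored:
        kept = [max(scored, key=lambda s: s[0])[1]]
    return kept


# A small biopsy-like fragment (800 x 600 px) and two candidate 4K regions.
fragment = [(1000, 1000), (1800, 1000), (1800, 1600), (1000, 1600)]
candidates = [(0, 0), (4096, 0)]
print(select_regions(candidates, 4096, fragment, min_hits=4))  # strict check, fallback applies
print(select_regions(candidates, 4096, fragment, min_hits=1))  # relaxed check
```

In this toy case a strict check (all four probes in tissue) rejects every region, so only the fallback keeps the slide in the cohort, whereas the relaxed threshold accepts the region that overlaps the fragment directly.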
Thanks for your excellent work. I’m about to replicate your weakly-supervised experiment and assume I need to start with feature extraction using the hipt4k pretrained model. In my initial tests, extracting ViT features seems lengthy, possibly taking over a week. Could you share how long this step took in your work, and the setup you used?
Hi! I am also reproducing this experiment. Would you want to discuss some results/progress?
Yeah! Sure! Glad to chat!
Cool! Do you have an email address, maybe? (My email is listed on my profile.)