kennethahah closed this issue 2 months ago
Hi @kennethahah,
1 - We apologize for the inconsistency. We originally extracted 224x224 patches at 0.5um/px and rescaled them to 256x256 to mimic the way foundation models like CTransPath were trained. Following your question, however, we re-uploaded the 224x224 patches to Hugging Face (see 2).
2 - The spot coordinates (in pixels on the high-resolution image) are stored in st.adata.obsm['spatial'] of each sample, as described here.
If you are interested in the coordinates of each extracted patch, please redownload the latest patches directory from Hugging Face. We added the following .h5 assets in the latest version:
coords: the (x, y) pixel coordinates of the top-left corner of each patch in the full-resolution image (note that TENX99.tif is 0.2125um/px, not 0.5um/px)
patch_size_src: the width/height of each patch before rescaling (hence at 0.2125um/px for TENX99)
patch_size_target: the width/height of each patch after rescaling (0.5um/px in our case)
Also, please feel free to extract the patches at your preferred patch_size/resolution with dump_patches (see the documentation here):
from hest import load_hest

sts = load_hest('hest_data', id_list=['TENX95', 'TENX99'])
for st in sts:
    # Re-extract 224x224 patches at 0.5um/px into patch_dir
    st.dump_patches('patch_dir', target_patch_size=224, target_pixel_size=0.5)
Let us know if anything is missing.
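As a rough illustration of how patch_size_src and patch_size_target relate, the sketch below converts a target patch size and pixel size into the equivalent width in full-resolution pixels. The helper names source_patch_size and patch_box are hypothetical and not part of the HEST library, and the rounding used here is an assumption, so the value stored in the .h5 files may differ by a pixel.

```python
from typing import Tuple


def source_patch_size(target_patch_size: int,
                      target_pixel_size: float,
                      src_pixel_size: float) -> int:
    """Width/height, in full-resolution pixels, that covers the same
    physical area as a target patch (hypothetical helper)."""
    physical_um = target_patch_size * target_pixel_size  # patch side in um
    return round(physical_um / src_pixel_size)           # rounding is an assumption


def patch_box(x: int, y: int, patch_size_src: int) -> Tuple[int, int, int, int]:
    """Bounding box (left, top, right, bottom) of a patch in the
    full-resolution image, given its top-left corner from coords."""
    return (x, y, x + patch_size_src, y + patch_size_src)


# 224 px at 0.5um/px, read from an image scanned at 0.2125um/px (as for TENX99)
size = source_patch_size(224, 0.5, 0.2125)
print(size)
print(patch_box(1000, 2000, size))  # (1000, 2000) is a made-up corner
```

In other words, a 224x224 patch at 0.5um/px covers 112um of tissue per side, which corresponds to roughly 527 pixels per side in a 0.2125um/px image.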
Hi @pauldoucet
Thanks for uploading the 224*224 patches.
In each of the .h5 files in the patches directory, I can see three keys: coords, barcode, and img. However, I don't see the two other keys, patch_size_src and patch_size_target.
The other keys are stored as attributes in the .h5 file (on the img dataset) because they are common to all the patches:
f['img'].attrs.keys()
<KeysViewHDF5 ['downsample', 'patch_size_src', 'patch_size_target', 'pixel_size']>
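For readers who want to load both the per-patch datasets and the shared attributes in one go, here is a minimal sketch using h5py. It assumes the layout described in this thread (datasets coords, barcode, and img, with the shared values as attributes of img). The file it reads is a synthetic stand-in created on the spot, so the shapes, barcodes, and attribute values are made up, and read_patch_file is a hypothetical helper rather than a HEST function.

```python
import os
import tempfile

import h5py
import numpy as np


def read_patch_file(path):
    """Return coords, barcodes, images, and the attributes shared by all patches."""
    with h5py.File(path, "r") as f:
        coords = f["coords"][:]
        barcodes = f["barcode"][:]
        imgs = f["img"][:]
        attrs = dict(f["img"].attrs)  # e.g. patch_size_src, patch_size_target
    return coords, barcodes, imgs, attrs


# Build a tiny synthetic file with the same key names (all values are made up).
tmp_path = os.path.join(tempfile.mkdtemp(), "example_patches.h5")
with h5py.File(tmp_path, "w") as f:
    f.create_dataset("coords", data=np.array([[0, 0], [527, 0]], dtype=np.int64))
    f.create_dataset("barcode", data=np.array([b"SPOT-1", b"SPOT-2"]))
    img = f.create_dataset("img", data=np.zeros((2, 224, 224, 3), dtype=np.uint8))
    img.attrs["patch_size_src"] = 527
    img.attrs["patch_size_target"] = 224

coords, barcodes, imgs, attrs = read_patch_file(tmp_path)
print(coords.shape, imgs.shape)
print(sorted(attrs))
```

For a real sample, pass the path of the downloaded .h5 file to read_patch_file instead of the temporary file.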
Thanks @pauldoucet. These are all my questions. I'll close this issue.
Thanks for getting and aligning all the data. I have two questions about the patches in each dataset.
1 - The HEST-1k paper says that patches are 224 x 224. However, when I downloaded the TENX99 dataset, the patches in the patches/ folder are 256 x 256. Is there a particular reason for the slightly larger patches?
2 - The file TENX99.h5 contains not only the images but also the coordinates of the spots. Is the image at a resolution of 0.5um/px, with the coordinates in pixels? If so, the WSI seems too large to be true: it has x coordinates greater than 50,000 and y coordinates greater than 100,000, which translates to a slide of 25mm x 50mm. That would not fit into a Visium machine. If not, then I guess the coordinates are in pixels but for a higher-resolution WSI.
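The size estimate in this question can be checked with a few lines of arithmetic. The sketch below converts a pixel extent into millimetres for the assumed 0.5um/px and for the 0.2125um/px that the reply gives for TENX99.tif. The function name extent_mm is hypothetical, and the pixel extents are the approximate lower bounds quoted above, not measured image dimensions.

```python
def extent_mm(n_pixels: int, pixel_size_um: float) -> float:
    """Physical length in millimetres spanned by n_pixels at a given pixel size."""
    return n_pixels * pixel_size_um / 1000.0


# Approximate coordinate extents quoted in the question
x_extent_px = 50_000
y_extent_px = 100_000

for pixel_size_um in (0.5, 0.2125):
    width = extent_mm(x_extent_px, pixel_size_um)
    height = extent_mm(y_extent_px, pixel_size_um)
    print(f"{pixel_size_um} um/px: {width:.3f} mm x {height:.3f} mm")
```

At 0.5um/px the extents give the 25mm x 50mm from the question, while at 0.2125um/px the same coordinates span well under half that in each direction.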