mahmoodlab / HEST

HEST: Bringing Spatial Transcriptomics and Histopathology together - NeurIPS 2024

How large is the spot in Xenium data and ST data? #39

Closed kennethahah closed 3 months ago

kennethahah commented 3 months ago

Sorry to ask again about the .h5ad files in the st folder.

I assume the indices of the AnnData object are barcodes, and that the AnnData object records, for each barcode, the count of every gene in the corresponding spot.

For Visium data, I understand that each spot is a circle with a 55um radius.

How should I understand the Xenium and ST data? Real Xenium data has no spots. Do you just bin the gene counts into spots? If so, what is the radius of the spot you use? The metadata in "hf://datasets/MahmoodLab/hest/HEST_v1_0_2.csv" says the spot diameter is NaN.

For the ST data, the real spot has a radius of 100um. But since you have 224 x 224 patches at a resolution of 0.5um/px (so 112um x 112um), can I understand that the expression data in, for example, SPA0.h5ad covers a spot larger than the 224 x 224 patches in the patch file SPA0.h5?

pauldoucet commented 3 months ago

Hi @kennethahah,

For ST samples, feel free to adjust the patch size by generating your own patches:

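from hest import load_hest

# Load the selected samples from a local copy of the dataset
sts = load_hest('hest_data', id_list=['TENX95', 'TENX99'])
for st in sts:
    # 224 px patches at 0.5 um/px, i.e. 112 um x 112 um
    st.dump_patches('patch_dir', target_patch_size=224, target_pixel_size=0.5)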
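
As a rough sketch of how the two arguments interact: the physical side length of a patch is the patch size in pixels times the pixel size, so 224 px at 0.5um/px gives the 112um mentioned in the question. The helper names below are hypothetical and not part of HEST, and the 200um target is only an illustrative value.

```python
import math


def patch_footprint_um(patch_size_px: int, pixel_size_um: float) -> float:
    """Side length, in micrometres, of a square patch."""
    return patch_size_px * pixel_size_um


def patch_size_for_footprint(footprint_um: float, pixel_size_um: float) -> int:
    """Smallest patch size, in pixels, whose side covers footprint_um."""
    return math.ceil(footprint_um / pixel_size_um)


# Footprint of the default patches discussed in this thread
default_side_um = patch_footprint_um(224, 0.5)

# Patch size needed to cover an illustrative 200 um region at 0.5 um/px
wider_patch_px = patch_size_for_footprint(200, 0.5)

print(default_side_um, wider_patch_px)
```

The resulting pixel count could then be passed as target_patch_size, keeping target_pixel_size unchanged.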

Note that we are currently working on providing the position of each transcript for Xenium (which will allow custom pooling), as well as an additional per-cell pooling of transcripts, on HuggingFace.

kennethahah commented 3 months ago

Thanks @pauldoucet.

So in summary, Visium and ST data have real spots, while Xenium and VisiumHD have pseudo-spots.

The size of a real spot depends on the technology (55um for Visium and 100um for ST). The size of a pseudo-spot is always 100um x 100um.

Is this correct?

guillaumejaume commented 3 months ago

That's correct
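
For readers who want to picture the pseudo-spot pooling, here is a minimal sketch of binning transcript positions into 100um x 100um squares and counting genes per square. It is not HEST's implementation: the function name, the column names (x_um, y_um, gene) and the toy table are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def pool_into_pseudo_spots(transcripts: pd.DataFrame,
                           bin_size_um: float = 100.0) -> pd.DataFrame:
    """Count transcripts per gene in square bins of side bin_size_um.

    `transcripts` is a hypothetical table with one row per transcript and
    columns x_um, y_um (position in micrometres) and gene.
    """
    # Integer grid index of the square bin containing each transcript
    bin_x = np.floor(transcripts["x_um"] / bin_size_um).astype(int)
    bin_y = np.floor(transcripts["y_um"] / bin_size_um).astype(int)
    # One label per pseudo-spot, built from its grid position, e.g. "1x0"
    spot_id = (bin_x.astype(str) + "x" + bin_y.astype(str)).rename("pseudo_spot")
    # Rows: pseudo-spots, columns: genes, values: transcript counts
    return pd.crosstab(spot_id, transcripts["gene"])


# Toy data: two transcripts fall in bin (0, 0) and two in bin (1, 0)
toy = pd.DataFrame({
    "x_um": [10.0, 90.0, 150.0, 150.0],
    "y_um": [20.0, 50.0, 30.0, 30.0],
    "gene": ["A", "B", "A", "A"],
})
counts = pool_into_pseudo_spots(toy)
print(counts)
```

Changing bin_size_um gives the kind of custom pooling that per-transcript positions would make possible.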

kennethahah commented 3 months ago

I'll close the issue. Thanks for clarifying the details.