Closed huangtinglin closed 3 weeks ago
Hi @huangtinglin,
The patches/*.h5 files only contain patches under tissue in order to save storage space.
The patches/*.h5 files from the benchmark were generated using an older version of the tissue segmenter (see below).
For consistency with our paper, use the data from hest-bench when benchmarking. For improved tissue segmentation when training your model, prefer hest.
@huangtinglin, small addition: the tissue segmentation (i.e., where the tissue is) is shown in green, but patching is only done on regions where transcripts were measured, which explains why not all tissue regions have a patch.
Thanks for the clarification! That solves my problem. Do you guys plan to update the benchmark based on the updated data?
We may update it if we add samples to the benchmark. For the sake of simplicity and consistency, we will keep it this way for now. That said, you are welcome to use the updated samples in your own study.
I found that the number of spots for most samples in HEST-bench differs from the corresponding samples in HEST-1k. Take TENX141.h5, which is included in the LUNG task, as an example. Is this because the data has been updated? Which one should be used for benchmarking?
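For anyone wanting to check how the two versions of a sample differ, here is a minimal sketch that compares the spot barcodes in two patch files. It is not part of the HEST codebase: `compare_spots` and `load_barcodes` are hypothetical helpers, and the dataset name `"barcode"` is an assumption about the file layout, so list the keys of your own file first.

```python
from typing import Iterable


def compare_spots(bench: Iterable[str], full: Iterable[str]) -> dict:
    """Summarise how two collections of spot barcodes overlap."""
    bench_set, full_set = set(bench), set(full)
    return {
        "n_bench": len(bench_set),
        "n_full": len(full_set),
        "shared": len(bench_set & full_set),
        "only_bench": sorted(bench_set - full_set),
        "only_full": sorted(full_set - bench_set),
    }


def load_barcodes(h5_path: str, key: str = "barcode") -> list:
    """Read spot barcodes from a patches/*.h5 file.

    ASSUMPTION: barcodes live in a dataset named `key`; this name is a
    guess, so inspect the file's keys before relying on it.
    """
    import h5py  # third-party; imported lazily so compare_spots works without it

    with h5py.File(h5_path, "r") as f:
        raw = f[key][:].ravel()  # flatten (N, 1) or (N,) into a 1-D array
    # String datasets usually come back as bytes; decode them to str.
    return [b.decode("utf-8") if isinstance(b, bytes) else str(b) for b in raw]


# Toy barcodes (made up for illustration, not real HEST data).
summary = compare_spots(["A-1", "B-1", "C-1"], ["B-1", "C-1", "D-1", "E-1"])
print(summary["n_bench"], summary["n_full"], summary["shared"])  # 3 4 2
```

With real files, something like `compare_spots(load_barcodes(bench_path), load_barcodes(full_path))` would show which spots exist in only one version, where `bench_path` and `full_path` are placeholders for your local copies of the same sample.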