STOmics / Stereopy

A toolkit of spatial transcriptomic analysis.
MIT License

Reading in custom .h5ad file, issues with adata.obsm['spatial'] - `ValueError: Unfamiliar encoding-type: dict.` #119

Closed mckellardw closed 1 year ago

mckellardw commented 1 year ago

I am trying to load in an h5ad file, where I have saved the spatial coordinates with the following code:

import pandas as pd

# Load the spatial locations (barcode, x, y) from the tab-separated DNB map
spatial_data = pd.read_csv(
    dnb_map,
    sep="\t",
    header=None,
    names=["barcode", "x", "y"],
)

# Set the cell barcode as index
spatial_data.set_index("barcode", inplace=True)

# Add the spatial coordinates to the AnnData object
spatial_coord = spatial_data.loc[adata.obs_names, ['x', 'y']]
adata.obsm['spatial'] = spatial_coord.to_numpy()

When I read the h5ad file with stereopy, I get the error ValueError: Unfamiliar `encoding-type`: dict.

data = st.io.read_ann_h5ad(
    file_path="data/output.h5ad",
    spatial_key='spatial', 
    bin_size=100
)

Full error:

ValueError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 data = st.io.read_ann_h5ad(
      2     file_path="data/output.h5ad",
      3     spatial_key='spatial', 
      4     bin_size=100
      5 )

File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/stereo/utils/read_write_utils.py:37, in ReadWriteUtils.check_file_exists.<locals>.wrapped(*args, **kwargs)
     35     else:
     36         raise FileNotFoundError("Please ensure there is a file")
---> 37 return func(*args, **kwargs)

File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/stereo/io/reader.py:485, in read_ann_h5ad(file_path, spatial_key, bin_type, bin_size)
    483 if spatial_key is not None:
    484     if isinstance(f[k], h5py.Group):
--> 485         data.position = h5ad.read_group(f[k])[spatial_key]
    486     else:
    487         data.position = h5ad.read_dataset(f[k])[spatial_key]

File ~/miniconda3/envs/stereopy/lib/python3.8/site-packages/stereo/io/h5ad.py:357, in read_group(group)
    355     return read_neighbors(group)
    356 else:
--> 357     raise ValueError(f'Unfamiliar `encoding-type`: {encoding_type}.')
    358 d = dict()
    359 for sub_key, sub_value in group.items():

ValueError: Unfamiliar `encoding-type`: dict.

This appears to be caused by an issue in stereopy.io.h5ad where read_group does not expect the encoding type of adata.obsm to be a dict (https://github.com/BGIResearch/stereopy/blob/main/stereo/io/h5ad.py#L342). Looking at other h5ad files output from the SAW pipeline, it appears that the obsm key is expected to have an encoding type of None.
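The failure mode in the traceback can be illustrated with a toy model. This is only a sketch, not Stereopy's code: `read_group_strict` and `read_group_tolerant` are hypothetical names, and plain dicts stand in for h5py groups, with an `attrs` entry mimicking the `encoding-type` attribute.

```python
def read_group_strict(group):
    """Mimic a reader that expects mapping groups to carry no encoding-type."""
    encoding_type = group["attrs"].get("encoding-type")
    if encoding_type is not None:
        # The real reader dispatches on several known types here; this toy
        # version knows none of them, so every tagged group is rejected.
        raise ValueError(f"Unfamiliar `encoding-type`: {encoding_type}.")
    d = dict()
    for sub_key, sub_value in group["items"].items():
        d[sub_key] = sub_value
    return d


def read_group_tolerant(group):
    """Treat a group tagged 'dict' the same way as an untagged group."""
    if group["attrs"].get("encoding-type") == "dict":
        return dict(group["items"])
    return read_group_strict(group)


# Untagged obsm group versus one tagged with encoding-type 'dict'.
old_style = {"attrs": {}, "items": {"spatial": [[0, 0], [1, 2]]}}
new_style = {"attrs": {"encoding-type": "dict"}, "items": {"spatial": [[0, 0], [1, 2]]}}
```

Calling `read_group_strict(new_style)` raises the same kind of `ValueError` as in the report, while `read_group_tolerant(new_style)` returns the mapping with the `spatial` entry intact.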

How can I add the spatial coordinates to the anndata object, without changing the encoding type of adata.obsm?

TheSallyGardens commented 1 year ago

@mckellardw Could you provide the h5ad file for us to investigate?

mckellardw commented 1 year ago

Here is a downsampled dataset containing 1000 DNBs. sub1000.zip

Thanks!

LyonLen commented 1 year ago

@mckellardw Hi! Thanks for your bug report. I browsed the source code of AnnData 0.7.5 (the version specified by Stereopy and SAW) and 0.8.0 (which I guess you are using), and the two versions differ. v0.8.0 writes anndata.obsm with the encoding-type dict, while v0.7.5 does not, which is why the obsm key in SAW's h5ad files appears to have an encoding-type of None. Maybe you could try creating your h5ad with anndata==0.7.5? I hope this helps you solve the problem!
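To see which layout a given file uses, the attribute can be read directly from the file. A minimal sketch, assuming h5py is installed; `obsm_encoding_type` is a hypothetical helper name and is not part of Stereopy or AnnData.

```python
import h5py


def obsm_encoding_type(h5ad_path):
    """Return the `encoding-type` attribute of the obsm group, or None if absent."""
    with h5py.File(h5ad_path, "r") as f:
        value = f["obsm"].attrs.get("encoding-type")
    # Some h5py releases may hand back bytes rather than str.
    if isinstance(value, bytes):
        value = value.decode("utf-8")
    return value


# Example call (path taken from the report; replace it with your own file):
# obsm_encoding_type("data/output.h5ad")
```

Going by the discussion in this thread, a file written by a newer anndata release would be expected to report dict here, and a SAW output file None.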

LyonLen commented 1 year ago

I discussed this bug with my colleagues, and we concluded that we will fix it in the function read_ann_h5ad so that it handles files from both 0.7.5 and 0.8.0. You can try our v0.13.0 later. Thanks again for your bug report! @mckellardw

mckellardw commented 1 year ago

Thanks very much! I was actually using anndata==0.9.0, but reverting to anndata==0.7.5 worked.
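For anyone hitting the same error, a quick check of the installed anndata version can tell whether the files it writes are likely to be affected. This is a sketch only: the function names are hypothetical, and the 0.8.0 threshold is taken from the explanation in this thread rather than verified against every anndata release.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.9.0' into (0, 9, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def writes_dict_encoded_obsm(version_text):
    """Assumption from this thread: 0.8.0 and later tag obsm as 'dict'."""
    return parse_version(version_text) >= (0, 8, 0)


def installed_anndata_version():
    """Return the installed anndata version string, or None if not installed."""
    try:
        return metadata.version("anndata")
    except metadata.PackageNotFoundError:
        return None
```

With the versions mentioned above, `writes_dict_encoded_obsm("0.9.0")` is true and `writes_dict_encoded_obsm("0.7.5")` is false, which matches the observation that reverting to 0.7.5 produced a readable file.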