mahmoodlab / HEST

HEST: Bringing Spatial Transcriptomics and Histopathology together - NeurIPS 2024

Converting Hest data to Spatialdata #21

Closed SathiyaNManivannan closed 3 months ago

SathiyaNManivannan commented 3 months ago

Hi!

Thank you for making this huge resource and Python package. I downloaded the HEST data from Hugging Face using the following code:

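import datasets

local_dir = 'path_to_local_storage'  # placeholder: replace with your local storage path
ids_to_query = ['TENX96', 'TENX99']  # list of ids to query

# Build one glob pattern per sample id so only those samples are downloaded
list_patterns = [f"*{id}[_.]**" for id in ids_to_query]
dataset = datasets.load_dataset(
    'MahmoodLab/hest',
    cache_dir=local_dir,
    patterns=list_patterns
)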

The download was successful.

When I tried to convert the HESTData object into a SpatialData object using the following code:

hest_data = hest.load_hest(local_dir, id_list=['TENX96'])
hest_data = hest_data[0]
spdata = hest_data.to_spatial_data()

I got the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_21396\3945162401.py in ?()
----> 1 spdata = hestData.to_spatial_data()

~\hest\src\hest\HESTData.py in ?(self, lazy_img)
    527             return pyvips.Image.tiffload(path).numpy().transpose((2, 0, 1))
    528 
    529         if lazy_img and not (isinstance(self.wsi, np.ndarray)):
    530 
--> 531             with tifffile.TiffFile(self.wsi) as tif:
    532                 page = tif.pages[0]
    533                 width = page.imagewidth
    534                 height = page.imagelength

~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\tifffile\tifffile.py in ?(self, file, mode, name, offset, size, omexml, _multifile, _useframes, _parent, **is_flags)
   4251                 raise ValueError('invalid OME-XML')
   4252             self._omexml = omexml
   4253             self.is_ome = True
   4254 
-> 4255         fh = FileHandle(file, mode=mode, name=name, offset=offset, size=size)
   4256         self._fh = fh
   4257         self._multifile = True if _multifile is None else bool(_multifile)
   4258         self._files = {fh.name: self}

~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\tifffile\tifffile.py in ?(self, file, mode, name, offset, size)
  14626         self._offset = -1 if offset is None else offset
  14627         self._size = -1 if size is None else size
  14628         self._close = True
  14629         self._lock = NullContext()
> 14630         self.open()
  14631         assert self._fh is not None

~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\tifffile\tifffile.py in ?(self)
  14717             except AttributeError:
  14718                 pass
  14719 
  14720         else:
> 14721             raise ValueError(
  14722                 'the first parameter must be a file name '
  14723                 'or seekable binary file object, '
  14724                 f'not {type(self._file)!r}'

ValueError: the first parameter must be a file name or seekable binary file object, not <class 'hest.wsi.OpenSlideWSI'>
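
The last frame points at the cause: with `lazy_img` enabled, `tifffile.TiffFile` receives `self.wsi`, which here is a `hest.wsi.OpenSlideWSI` object rather than a file name or file object. The requirement stated in the error message can be sketched with the standard library alone. `FakeWSI` and `is_openable_source` are hypothetical names used only for illustration; they are not part of HEST or tifffile, and the check is a simplification of what tifffile actually does.

```python
import io
import os


class FakeWSI:
    """Hypothetical stand-in for a slide wrapper such as hest.wsi.OpenSlideWSI."""


def is_openable_source(obj) -> bool:
    """Return True if obj looks like a file name or a seekable binary file object.

    This mirrors the wording of the error message only; it is an assumption,
    not tifffile's real implementation.
    """
    if isinstance(obj, (str, os.PathLike)):
        return True
    return hasattr(obj, "seek") and hasattr(obj, "read")


print(is_openable_source("slide.tif"))      # a path string qualifies
print(is_openable_source(io.BytesIO(b"")))  # so does a seekable binary buffer
print(is_openable_source(FakeWSI()))        # a wrapper object does not
```

Under this reading, the lazy path needs either the slide's file path or a code path that reads from the wrapper object itself.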

When I tried again with `lazy_img=False`:

spdata = hest_data.to_spatial_data(lazy_img=False)

the data was converted to a SpatialData object, but with a warning:

C:\Users\manivans\hest\src\hest\HESTData.py:543: DeprecationWarning: `table` is being deprecated as an argument to `SpatialData.__init__.__init__` in spatialdata version 0.1.0, switch to `tables` instead.
  st = SpatialData({'fullres': sp_img}, table=self.adata)

When I then display the SpatialData object, I get an error:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\IPython\core\formatters.py:711, in PlainTextFormatter.__call__(self, obj)
    704 stream = StringIO()
    705 printer = pretty.RepresentationPrinter(stream, self.verbose,
    706     self.max_width, self.newline,
    707     max_seq_length=self.max_seq_length,
    708     singleton_pprinters=self.singleton_printers,
    709     type_pprinters=self.type_printers,
    710     deferred_pprinters=self.deferred_printers)
--> 711 printer.pretty(obj)
    712 printer.flush()
    713 return stream.getvalue()

File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\IPython\lib\pretty.py:419, in RepresentationPrinter.pretty(self, obj)
    408                         return meth(obj, self, cycle)
    409                 if (
    410                     cls is not object
    411                     # check if cls defines __repr__
   (...)
    417                     and callable(_safe_getattr(cls, "__repr__", None))
    418                 ):
--> 419                     return _repr_pprint(obj, self, cycle)
    421     return _default_pprint(obj, self, cycle)
    422 finally:

File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\IPython\lib\pretty.py:787, in _repr_pprint(obj, p, cycle)
    785 """A pprint that just redirects to the normal repr function."""
    786 # Find newlines and replace them with p.break_()
--> 787 output = repr(obj)
    788 lines = output.splitlines()
    789 with p.group():

File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\spatialdata\_core\spatialdata.py:1717, in SpatialData.__repr__(self)
   1716 def __repr__(self) -> str:
-> 1717     return self._gen_repr()

File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\spatialdata\_core\spatialdata.py:1822, in SpatialData._gen_repr(self)
   1819 from spatialdata.transformations.operations import get_transformation
   1821 descr += "\nwith coordinate systems:\n"
-> 1822 coordinate_systems = self.coordinate_systems.copy()
   1823 coordinate_systems.sort(key=_natural_keys)
   1824 for i, cs in enumerate(coordinate_systems):

File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\spatialdata\_core\spatialdata.py:1695, in SpatialData.coordinate_systems(self)
   1693 gen = self._gen_spatial_element_values()
   1694 for obj in gen:
-> 1695     transformations = get_transformation(obj, get_all=True)
   1696     assert isinstance(transformations, dict)
   1697     for cs in transformations:

File ~\AppData\Local\miniconda3\envs\st_py_3_10\lib\site-packages\spatialdata\transformations\operations.py:121, in get_transformation(element, to_coordinate_system, get_all)
    118 from spatialdata.models._utils import DEFAULT_COORDINATE_SYSTEM
    120 transformations = _get_transformations(element)
--> 121 assert isinstance(transformations, dict)
    123 if get_all is False:
    124     if to_coordinate_system is None:

AssertionError: 
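
The failing assertion says that the transformations returned for one element are not a `dict`. One possible explanation, which I have not confirmed, is that the image was added to the SpatialData object without transformation metadata. A minimal sketch of a check for that, assuming transformations are stored under a `transform` key in each element's `attrs` (my understanding for single-scale images, not something shown in the traceback). `elements_missing_transform` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real image elements.

```python
from types import SimpleNamespace


def elements_missing_transform(elements, key="transform"):
    """Return the names of elements whose attrs hold no dict under `key`."""
    missing = []
    for name, element in elements.items():
        attrs = getattr(element, "attrs", None) or {}
        if not isinstance(attrs.get(key), dict):
            missing.append(name)
    return missing


# Stand-ins for image elements: one with transformation metadata, one without.
parsed = SimpleNamespace(attrs={"transform": {"global": "Identity"}})
raw = SimpleNamespace(attrs={})

print(elements_missing_transform({"parsed": parsed, "fullres": raw}))  # ['fullres']
```

With a real object, passing `dict(spdata.images)` to the helper would be the equivalent call, under the same assumption about where the metadata lives.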

I am using Python 3.10 on Windows. Please help!

pauldoucet commented 3 months ago

Good morning,

Thank you very much for providing the solution to the problem. I just pushed an update; I made a few modifications to your code so that it uses the hestdata.wsi object instead of a path. Let me know if the issue persists.