polaris-hub / polaris

Foster the development of impactful AI models in drug discovery.
https://polaris-hub.github.io/polaris/
Apache License 2.0

Error when loading data #202

Closed ranaabarghout closed 1 month ago

ranaabarghout commented 1 month ago

Polaris version

0.8.4

Python Version

3.10.12

Operating System

Colab

Installation

!pip install polaris-lib

Description

Hello! I wanted to play around with the drewry2017-pkis2-subset-v2 dataset, but I got an error when trying to load it in my Colab notebook.

Got the following error:

 ⠋ Fetching artifact...
 2024-09-18 18:47:10.305 | INFO     | polaris._artifact:_validate_version:66 - The version of Polaris that was used to create the artifact (dev) is different from the currently installed version of Polaris (0.8.4).
 💥 ERROR: Failed to fetch dataset. 
 /usr/local/lib/python3.10/dist-packages/yaspin/core.py:228: UserWarning: color, on_color and attrs are not supported when running in jupyter
   self._color = self._set_color(value) if value else value
 ---------------------------------------------------------------------------
 KeyError                                  Traceback (most recent call last)
 <ipython-input-6-07eb79fba142> in <cell line: 2>()
       1 # Load the dataset from the Hub
 ----> 2 dataset = po.load_dataset("polaris/drewry2017-pkis2-subset-v2")
       3 
       4 # Get information on the dataset size
       5 dataset.size()

 4 frames
 /usr/local/lib/python3.10/dist-packages/polaris/loader/load.py in load_dataset(path, verify_checksum)
      38         # Load from the Hub
      39         client = PolarisHubClient()
 ---> 40         return client.get_dataset(*path.split("/"), verify_checksum=verify_checksum)
      41 
      42     # Load from local file

 /usr/local/lib/python3.10/dist-packages/polaris/hub/client.py in get_dataset(self, owner, name, verify_checksum)
     309             A `Dataset` instance, if it exists.
     310         """
 --> 311         return self._get_dataset(owner, name, ArtifactSubtype.STANDARD.value, verify_checksum)
     312 
     313     def _get_dataset(

 /usr/local/lib/python3.10/dist-packages/polaris/hub/client.py in _get_dataset(self, owner, name, artifact_type, verify_checksum)
     361                 md5Sum = response["maskedMd5Sum"]
     362             else:
 --> 363                 dataset = Dataset(**response)
     364                 md5Sum = response["md5Sum"]
     365 

     [... skipping hidden 1 frame]

 /usr/local/lib/python3.10/dist-packages/polaris/dataset/_column.py in _validate_content_type(cls, v, values)
      62         """Tries to convert a string to the Enum"""
      63         if isinstance(v, str):
 ---> 64             v = KnownContentType[v.upper()]
      65         return v
      66 

 /usr/lib/python3.10/enum.py in __getitem__(cls, name)
     438 
     439     def __getitem__(cls, name):
 --> 440         return cls._member_map_[name]
     441 
     442     def __iter__(cls):

 KeyError: 'CHEMICAL/X-SMILES'
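
The last two frames suggest the cause: subscripting an Enum class looks members up by name, while the string received here ('CHEMICAL/X-SMILES' after upper-casing) looks like a MIME-style value. The sketch below reproduces that failure mode with a hypothetical stand-in enum. The class `ContentType`, its members, and the helper `by_value_or_name` are assumptions for illustration only; they are not the real `KnownContentType` definition, and the tolerant lookup is not necessarily how Polaris resolved it.

```python
from enum import Enum


# Hypothetical stand-in for polaris' KnownContentType; the real members may differ.
class ContentType(Enum):
    SMILES = "chemical/x-smiles"
    PDB = "chemical/x-pdb"


def by_name(v: str) -> ContentType:
    # Same pattern as the frame in _column.py: Enum[...] matches member *names* only.
    return ContentType[v.upper()]


def by_value_or_name(v: str) -> ContentType:
    # One possible tolerant variant (an assumption, not the library's code):
    # try the member value first, then fall back to the member name.
    try:
        return ContentType(v.lower())
    except ValueError:
        return ContentType[v.upper()]


print(by_name("smiles"))                      # name lookup succeeds
print(by_value_or_name("chemical/x-smiles"))  # value lookup succeeds

try:
    by_name("chemical/x-smiles")
except KeyError as err:
    # The upper-cased value is not a member name, so the lookup fails.
    print("KeyError:", err)
```

With these assumed members, passing the value string to the name-based lookup raises the same kind of KeyError as in the traceback above.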

Steps to reproduce

import polaris as po
from polaris.hub.client import PolarisHubClient
client = PolarisHubClient()
client.login()
dataset = po.load_dataset("polaris/drewry2017-pkis2-subset-v2")

Additional output

I ran !polaris login before running the code above.

cwognum commented 1 month ago

Hey @ranaabarghout! Thanks for reporting.

This seems to be fixed on main, but the fix hadn't been released yet. I just released 0.8.5, which should be available on PyPI soon. Let me know if that fixes it!

Please note that this latest version unfortunately won't be available on Conda yet. We're blocked by https://github.com/conda-forge/staged-recipes/pull/27246
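
To confirm that an environment has picked up the release, something like the following minimal sketch can help. It assumes the distribution name is polaris-lib (as in the pip command above) and that 0.8.5 is the first release containing the fix. The helpers `parse_release` and `has_fix` are hypothetical names, and the parser only compares leading numeric components, so `packaging.version.Version` would be the more robust choice where it is available.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple[int, ...]:
    # Keep only the leading numeric components, e.g. "0.8.5" -> (0, 8, 5).
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_fix(installed: str, minimum: str = "0.8.5") -> bool:
    # True when the installed release is at least the assumed first fixed release.
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("polaris-lib")
    status = "includes the fix" if has_fix(installed) else "predates the fix"
    print(f"polaris-lib {installed} {status}")
except PackageNotFoundError:
    print("polaris-lib is not installed")
```

If the check reports an older version, running !pip install --upgrade polaris-lib in the notebook should pull the new release once it is on PyPI.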

ranaabarghout commented 1 month ago

Working now! Thanks @cwognum :)