MolecularAI / aizynthfinder

A tool for retrosynthetic planning
https://molecularai.github.io/aizynthfinder/
MIT License

Unable to open/create file hdf5 files #131

Closed MachineGUN001 closed 11 months ago

MachineGUN001 commented 12 months ago

Sorry to bother you with a naive issue.

After downloading the models via download_public_data.py, the config.yml was generated successfully.

policy:
  files:
    uspto:
      - D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_model.onnx
      - D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_templates.csv.gz
    ringbreaker:
      - D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_ringbreaker_model.onnx
      - D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_ringbreaker_templates.csv.gz     
filter:
  files:
    uspto: D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_filter_model.onnx
stock:
  files:
    zinc: D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\zinc_stock.hdf5

When I tried to run the GUI through a Jupyter notebook as instructed:

from aizynthfinder.interfaces import AiZynthApp
# configfile=r"D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\interfaces\config_local.yml"
configfile=r"D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\config.yml"
app = AiZynthApp(configfile)

However, the following error occurred:

Loading template-based expansion policy model from D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_model.onnx to uspto
Loading templates from D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_templates.csv.gz to uspto
Loading template-based expansion policy model from D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_ringbreaker_model.onnx to ringbreaker
Loading templates from D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_ringbreaker_templates.csv.gz to ringbreaker
Loading filter policy model from D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\uspto_filter_model.onnx to uspto
Loading stock from D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\zinc_stock.hdf5 to zinc
---------------------------------------------------------------------------
HDF5ExtError                              Traceback (most recent call last)
Cell In[2], line 4
      2 configfile=r"D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\interfaces\config_local.yml"
      3 configfile=r"D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\config.yml"
----> 4 app = AiZynthApp(configfile)

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\interfaces\aizynthapp.py:60, in AiZynthApp.__init__(self, configfile, setup)
     57 def __init__(self, configfile: str, setup: bool = True) -> None:
     58     # pylint: disable=used-before-assignment
     59     setup_logger(logging.INFO)
---> 60     self.finder = AiZynthFinder(configfile=configfile)
     61     self._input: StrDict = dict()
     62     self._output: StrDict = dict()

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\aizynthfinder.py:67, in AiZynthFinder.__init__(self, configfile, configdict)
     64 self._logger = logger()
     66 if configfile:
---> 67     self.config = Configuration.from_file(configfile)
     68 elif configdict:
     69     self.config = Configuration.from_dict(configdict)

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\context\config.py:129, in Configuration.from_file(cls, filename)
    127     txt = txt.replace(item, os.environ[item[2:-1]])
    128 _config = yaml.load(txt, Loader=yaml.SafeLoader)
--> 129 return Configuration.from_dict(_config)

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\context\config.py:100, in Configuration.from_dict(cls, source)
     98 config_obj.expansion_policy.load_from_config(**src_copy.get("policy", {}))
     99 config_obj.filter_policy.load_from_config(**src_copy.get("filter", {}))
--> 100 config_obj.stock.load_from_config(**src_copy.get("stock", {}))
    101 config_obj.scorers.load_from_config(**src_copy.get("scorer", {}))
    102 config_obj.molecule_cost.load_from_config(**src_copy.get("molecule_cost", {}))

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\context\stock\stock.py:170, in Stock.load_from_config(self, **config)
    167     self.set_stop_criteria(config["stop_criteria"])
    169 for key, stockfile in config.get("files", {}).items():
--> 170     self.load(stockfile, key)
    172 if "mongodb" in config:
    173     query_obj = MongoDbInchiKeyQuery(**(config["mongodb"] or {}))

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\context\stock\stock.py:150, in Stock.load(self, source, key)
    147 self._logger.info(f"Loading stock from {src_str} to {key}")
    149 if isinstance(source, str):
--> 150     source = InMemoryInchiKeyQuery(source)
    151 self._items[key] = source

File ~\anaconda3\envs\aisynth\lib\site-packages\aizynthfinder\context\stock\queries.py:98, in InMemoryInchiKeyQuery.__init__(self, filename)
     96 ext = os.path.splitext(filename)[1]
     97 if ext in [".h5", ".hdf5"]:
---> 98     stock = pd.read_hdf(filename, key="table")  # type: ignore
     99     inchis = stock.inchi_key.values  # type: ignore
    100 elif ext == ".csv":

File ~\AppData\Roaming\Python\Python39\site-packages\pandas\io\pytables.py:416, in read_hdf(path_or_buf, key, mode, errors, where, start, stop, columns, iterator, chunksize, **kwargs)
    413 if not exists:
    414     raise FileNotFoundError(f"File {path_or_buf} does not exist")
--> 416 store = HDFStore(path_or_buf, mode=mode, errors=errors, **kwargs)
    417 # can't auto open/close if we are using an iterator
    418 # so delegate to the iterator
    419 auto_close = True

File ~\AppData\Roaming\Python\Python39\site-packages\pandas\io\pytables.py:578, in HDFStore.__init__(self, path, mode, complevel, complib, fletcher32, **kwargs)
    576 self._fletcher32 = fletcher32
    577 self._filters = None
--> 578 self.open(mode=mode, **kwargs)

File ~\AppData\Roaming\Python\Python39\site-packages\pandas\io\pytables.py:737, in HDFStore.open(self, mode, **kwargs)
    731     msg = (
    732         "Cannot open HDF5 file, which is already opened, "
    733         "even in read-only mode."
    734     )
    735     raise ValueError(msg)
--> 737 self._handle = tables.open_file(self._path, self._mode, **kwargs)

File ~\anaconda3\envs\aisynth\lib\site-packages\tables\file.py:300, in open_file(filename, mode, title, root_uep, filters, **kwargs)
    295             raise ValueError(
    296                 "The file '%s' is already opened.  Please "
    297                 "close it before reopening in write mode." % filename)
    299 # Finally, create the File instance, and return it
--> 300 return File(filename, mode, title, root_uep, filters, **kwargs)

File ~\anaconda3\envs\aisynth\lib\site-packages\tables\file.py:750, in File.__init__(self, filename, mode, title, root_uep, filters, **kwargs)
    747 self.params = params
    749 # Now, it is time to initialize the File extension
--> 750 self._g_new(filename, mode, **params)
    752 # Check filters and set PyTables format version for new files.
    753 new = self._v_new

File ~\anaconda3\envs\aisynth\lib\site-packages\tables\hdf5extension.pyx:484, in tables.hdf5extension.File._g_new()

HDF5ExtError: HDF5 error back trace

  File "C:\ci\hdf5_1655191106204\work\src\H5F.c", line 620, in H5Fopen
    unable to open file
  File "C:\ci\hdf5_1655191106204\work\src\H5VLcallback.c", line 3502, in H5VL_file_open
    failed to iterate over available VOL connector plugins
  File "C:\ci\hdf5_1655191106204\work\src\H5PLpath.c", line 579, in H5PL__path_table_iterate
    can't iterate over plugins in plugin path '(null)'
  File "C:\ci\hdf5_1655191106204\work\src\H5PLpath.c", line 712, in H5PL__path_table_iterate_process_path
    can't open directory
  File "C:\ci\hdf5_1655191106204\work\src\H5VLcallback.c", line 3351, in H5VL__file_open
    open failed
  File "C:\ci\hdf5_1655191106204\work\src\H5VLnative_file.c", line 97, in H5VL__native_file_open
    unable to open file
  File "C:\ci\hdf5_1655191106204\work\src\H5Fint.c", line 1990, in H5F_open
    unable to read superblock
  File "C:\ci\hdf5_1655191106204\work\src\H5Fsuper.c", line 617, in H5F__super_read
    truncated file: eof = 122044800, sblock->base_addr = 0, stored_eof = 663232280

End of HDF5 error back trace

Unable to open/create file 'D:\Cheminfo_Workshop\7_Aisynth\aizynthfinder37\aizynthfinder\models\zinc_stock.hdf5'

Could you please help me fix this? My OS is Windows 10, with Python 3.9.

Many thanks,

Shengyang

SGenheden commented 11 months ago

Hello. From the error message, it appears that the HDF5 file containing the ZINC stock was corrupted during download. Could you try re-downloading it?
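
The last frame of the HDF5 back trace supports this reading: it reports `eof = 122044800` against `stored_eof = 663232280`, i.e. about 122 MB on disk where the file's own superblock expects about 663 MB. As a minimal sketch (the helper name `parse_truncation` is hypothetical, and the pattern is simply copied from the message in this traceback, not taken from any HDF5 API), the two numbers can be pulled out of such a message like this:

```python
import re

# Pattern copied from the "truncated file" line in the traceback above.
# This is an assumption about the message format, not an official HDF5 interface.
_TRUNCATED = re.compile(r"truncated file: eof = (\d+), .*stored_eof = (\d+)")


def parse_truncation(message: str):
    """Return (bytes_on_disk, bytes_expected) if the message reports truncation, else None."""
    match = _TRUNCATED.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "truncated file: eof = 122044800, sblock->base_addr = 0, stored_eof = 663232280"
on_disk, expected = parse_truncation(line)
print(f"on disk: {on_disk} bytes, expected: {expected} bytes")
print(f"missing: {expected - on_disk} bytes")
```

If the on-disk size of a re-downloaded file is again smaller than the expected size, the download is probably being cut off each time rather than the file being unreadable for some other reason.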

MachineGUN001 commented 11 months ago

Thanks for your kind reply. I re-downloaded the HDF5 file for the ZINC stock, but reading it still fails with the same error. However, the files in the other format (ONNX) load without problems. I therefore removed the ZINC stock HDF5 file from config.yml, and AiZynthFinder now starts without errors.

SGenheden commented 11 months ago

You need the ZINC stock (or another stock) to perform the retrosynthesis experiments. So I am afraid you will need to find a solution other than just removing the reference from the config file.

MachineGUN001 commented 11 months ago

Yes, after removing the HDF5 file from the config file, I can run the retrosynthesis without the ZINC stock. In any case, I will keep looking for another solution, or re-download the model files (HDF5 format) and see whether that works. I am closing the issue for now. Thank you again for your kind explanation and help.
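
For anyone retrying the download, one way to catch a cut-off transfer early is to compare the number of bytes written with the size the server announces. The following is only a sketch under stated assumptions: `download_and_verify` is a hypothetical helper, not part of aizynthfinder, and the URL of the stock file is not given in this thread, so it is left as a parameter.

```python
import os
import shutil
import urllib.request


def download_and_verify(url: str, dest: str) -> int:
    """Download url to dest; raise if the byte count differs from the announced Content-Length."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        # The header may be absent; in that case no size check is possible.
        announced = response.headers.get("Content-Length")
        shutil.copyfileobj(response, out)
    written = os.path.getsize(dest)
    if announced is not None and written != int(announced):
        os.remove(dest)  # do not leave a truncated file behind
        raise IOError(f"incomplete download: got {written} of {announced} bytes")
    return written
```

A call would look like `download_and_verify(stock_url, "zinc_stock.hdf5")`, where `stock_url` stands for whatever address the download script uses. A size match does not prove the file is intact, but a mismatch is a clear sign that the transfer was interrupted.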