chao1224 / MoleculeSTM

Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)
https://chao1224.github.io/MoleculeSTM

How do I get the latest "CID2SMILES.csv" data file #8

Closed: Greay83 closed this issue 6 months ago

Greay83 commented 7 months ago

Hi, this is very cool work, thanks for sharing this repository.

The code on GitHub does not generate the latest "CID2SMILES.csv" file; only an earlier version of the file is available, uploaded on Hugging Face. How do I get the latest version of this file?

Thank you in advance :)

chao1224 commented 6 months ago

Hi @Greay83,

You can obtain this by mapping each CID to its SMILES using the downloaded and preprocessed files under this folder.

That said, in case you need it, I just uploaded my local script to the main branch (in the same folder, under the name step_05). Feel free to check it out.
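
For readers who want a feel for what such a CID-to-SMILES mapping can look like, here is a minimal sketch. It is not the step_05 script from the repository. It assumes the preprocessed data is available as PubChem-style SDF text in which each record carries the CID and a SMILES string as data items. The tag names `PUBCHEM_COMPOUND_CID` and `PUBCHEM_OPENEYE_CAN_SMILES`, the output column names `CID` and `SMILES`, and the helper names `iter_sdf_records` and `write_cid2smiles` are all assumptions made for illustration, so check them against your own files.

```python
import csv
import io


def iter_sdf_records(handle):
    """Yield one {tag: value} dict per SDF record (records end with '$$$$')."""
    fields, tag = {}, None
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":            # record separator
            yield fields
            fields, tag = {}, None
        elif line.startswith(">") and "<" in line:
            # Data header such as "> <PUBCHEM_COMPOUND_CID>"
            tag = line[line.index("<") + 1 : line.rindex(">")]
            fields[tag] = ""
        elif tag is not None:
            # Keep only the first line of the value, which is enough
            # for single-line items such as a CID or a SMILES string.
            fields[tag] = line.strip()
            tag = None


def write_cid2smiles(sdf_handle, csv_handle,
                     cid_tag="PUBCHEM_COMPOUND_CID",          # assumed tag name
                     smiles_tag="PUBCHEM_OPENEYE_CAN_SMILES"):  # assumed tag name
    """Write CID,SMILES rows and return the number of rows written."""
    writer = csv.writer(csv_handle)
    writer.writerow(["CID", "SMILES"])        # assumed column names
    written = 0
    for fields in iter_sdf_records(sdf_handle):
        cid, smiles = fields.get(cid_tag), fields.get(smiles_tag)
        if cid and smiles:                    # skip records missing either item
            writer.writerow([cid, smiles])
            written += 1
    return written


# Tiny in-memory demo with a hand-written two-record SDF fragment.
# The second record is a placeholder with no SMILES item and is skipped.
demo_sdf = io.StringIO(
    "> <PUBCHEM_COMPOUND_CID>\n702\n\n"
    "> <PUBCHEM_OPENEYE_CAN_SMILES>\nCCO\n\n$$$$\n"
    "> <PUBCHEM_COMPOUND_CID>\n999\n\n$$$$\n"
)
demo_csv = io.StringIO()
count = write_cid2smiles(demo_sdf, demo_csv)
```

To run this on real data, pass open file handles instead of the in-memory strings, and open the output file with `newline=""` as the `csv` module recommends. If your SDF files use different tag names, override `cid_tag` and `smiles_tag`.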

Greay83 commented 6 months ago

Thank you for letting me know! Much appreciated.