mims-harvard / TDC

Therapeutics Commons (TDC-2): Multimodal Foundation for Therapeutic Science
https://tdcommons.ai
MIT License
984 stars, 173 forks

New Dataset for Existing Task #271

Closed: haneul-park closed this issue 4 months ago

haneul-park commented 4 months ago

New dataset description: Human/Rat Liver Microsomal Stability (HLM_RLM)

Describe the problem

We propose adding the HLM_RLM datasets under the ADME task. They support prediction of the metabolic stability of compounds in human and rat liver microsomes, which is crucial for early-stage drug development. The datasets include 6,013 compounds for human liver microsomes and 5,590 for rat liver microsomes. Compounds are classified as stable or unstable based on their half-life. We have sanitized and organized the datasets, derived from a published paper (DOI: 10.1021/acs.chemrestox.2c00207), into a format with three columns: ID, X, and Y.
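As a rough illustration of the three-column layout described above, here is a minimal sketch using only the Python standard library. It is not the contributor's processing script (that is in the attached zip). The 30-minute half-life cutoff, the label convention (1 = stable, 0 = unstable), the function names, and the example records are all assumptions made for illustration; the issue does not state them. The cleanup here is limited to trimming whitespace and dropping empty or duplicate SMILES, and does not perform chemical sanitization.

```python
import csv
import io

# Hypothetical cutoff in minutes; the issue does not state the actual threshold.
STABLE_HALF_LIFE_MIN = 30.0


def to_tdc_rows(records, threshold=STABLE_HALF_LIFE_MIN):
    """Convert (compound_id, smiles, half_life_min) tuples into ID/X/Y rows."""
    rows = []
    seen = set()
    for compound_id, smiles, half_life in records:
        smiles = smiles.strip()
        # Skip empty structures and structures already seen.
        if not smiles or smiles in seen:
            continue
        seen.add(smiles)
        # Assumed label convention: 1 = stable, 0 = unstable.
        label = 1 if half_life >= threshold else 0
        rows.append({"ID": compound_id, "X": smiles, "Y": label})
    return rows


def write_csv(rows):
    """Serialize rows to CSV text with the three columns ID, X, Y."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["ID", "X", "Y"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example records, not taken from the HLM_RLM data.
example = [
    ("cmpd-1", "CCO", 45.0),
    ("cmpd-2", " c1ccccc1 ", 12.5),
    ("cmpd-3", "CCO", 50.0),  # duplicate structure, dropped
    ("cmpd-4", "", 20.0),     # empty structure, dropped
]
rows = to_tdc_rows(example)
print(write_csv(rows))
```

In this layout, X holds the compound structure (SMILES) and Y holds the binary stability label, so the human and rat sets can each be stored as a separate file with the same columns.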

Describe the solution you’d like

from tdc.single_pred import ADME
data = ADME(name = 'hlm')
data = ADME(name = 'rlm')

amva13 commented 4 months ago

@haneul-park please make sure the tests pass. Also, it looks like you have a merge conflict. Also, can you provide the remaining information for the license(s), as per the dataset descriptions on the website?

Please fill out a template following the format shown here. Looks like you have most already so should not be difficult.

[Screenshot: 2024-05-23 at 7:00:45 PM]

Last, make sure to follow the instructions here: https://github.com/mims-harvard/TDC/blob/main/CONTRIBUTE.md#1-new-dataset-for-existing-task

Is this data in Harvard Dataverse already?

haneul-park commented 4 months ago

Thank you for your feedback. I have resolved the issues and filled in the missing information. Additionally, I have attached a zip file here containing ‘data processing scripts’ and ‘text description’.

Please let me know if anything further needs to be added or changed.

HLM_RLM.zip

amva13 commented 4 months ago

Hi @haneul-park, thanks for the contribution!