Closed Femme-js closed 1 year ago
DrugTax takes small-molecule representations in the form of SMILES as input.
The package extracts taxonomy information and key features of molecules for detailed characterization. It also supports visualization and bulk analysis of molecules for chemical space representation and molecule similarity assessment.
DrugTax provides the prior classification between the two possible kingdoms, organic and inorganic, and, respectively, their 26 and 5 superclasses.
This package could be applied to generate similarity searches, chemical space visualization, clustering, taxonomy-property relationships, among others. The results could then be combined with different easy-to-implement visualization tools.
This package comes with very few dependencies. Most of its extended dependencies emerge when using the bulk analysis and plotting options.
https://colab.research.google.com/drive/1WMTqL2YLxyY3baa-OxnwhNG0caTpOatj?usp=sharing
This is the Colab link to get started with DrugTax.
The DrugTax package has a DrugTax class that extracts the taxonomy information and 163 simple, explainable features for a single SMILES string. To get taxonomy information for bulk data, with input in the form of a CSV file, a drug list, or a SMILES list, use 'retrieve_taxonomic_class'.
@miquelduranfrigola Can you take a look please?
/approve
@Femme-js ersilia model repository has been successfully created and is available at:
ersilia-os/eos24ci
Now that your new model repository has been created, you are ready to start contributing to it!
Here are some brief starter steps for contributing to your new model repository:
Note: Many of the bullet points below will have extra links if this is your first time contributing to a GitHub repository
Update the README.md file to accurately describe your model.
If you have any questions, please feel free to open an issue and get support from the community!
Hi @GemmaTuron and @miquelduranfrigola !
The main.py code is working for small molecules.
I am attaching the sample output file from the code: output.csv
I tried inputting the SMILES from eml_canonical.csv as single inputs, but strangely the bulk analysis gives an error when all the inputs from eml_canonical.csv are passed in a single list.
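One way to narrow down a bulk failure like this is to run the inputs one at a time and record which ones raise. The sketch below is a minimal, library-agnostic illustration: `find_failing_inputs` and `stand_in_classify` are hypothetical names introduced here, and the stand-in classifier only mimics a tool that rejects lowercase aromatic atoms. In practice the `classify` argument would be a wrapper around whatever DrugTax call is being tested.

```python
from typing import Any, Callable, Iterable, List, Tuple


def find_failing_inputs(
    smiles_list: Iterable[str],
    classify: Callable[[str], Any],
) -> List[Tuple[int, str, str]]:
    """Run `classify` on each SMILES separately and collect the failures.

    Returns a list of (position, smiles, error description) tuples so the
    problematic rows of a bulk input can be inspected individually.
    """
    failures = []
    for index, smiles in enumerate(smiles_list):
        try:
            classify(smiles)
        except Exception as error:  # broad on purpose: this is a debugging aid
            failures.append((index, smiles, repr(error)))
    return failures


def stand_in_classify(smiles: str) -> str:
    """Hypothetical stand-in classifier, NOT the DrugTax API.

    It rejects any SMILES containing a lowercase aromatic carbon, only to
    give the example something to fail on.
    """
    if "c" in smiles:
        raise ValueError("aromatic atom not supported by this stand-in")
    return "ok"


if __name__ == "__main__":
    sample = ["CCO", "c1ccccc1", "C1=CC=CC=C1"]
    for index, smiles, error in find_failing_inputs(sample, stand_in_classify):
        print(index, smiles, error)
```

Collecting the failures instead of stopping at the first exception makes it easier to see whether the rejected inputs share a pattern, such as a particular notation.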
Hi @Femme-js ,
As we just discussed:
Hi @Femme-js, can you provide an update on the model status and on what you found out about the SMILES issue?
Timeline of incorporating this model:
While incorporating this model outside ersilia, I tested my code on the eml_canonical.csv file (the standard SMILES inputs provided by ersilia during the contribution period), supplied as a CSV file, a smiles_list, and a drug_list. During testing I encountered the error posted above with the inputs in standard SMILES format. After debugging along the points discussed above with @GemmaTuron, I found that the drugtax module does not parse SMILES written in the aromatic format; they need to be converted into Kekule format first. For this, I used the rdkit package to convert the aromatic SMILES inputs into Kekule inputs.
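The aromatic-to-Kekule conversion described above can be sketched with RDKit as follows. This is an illustration under stated assumptions, not the code merged into the model repository: the helper name `to_kekule_smiles` is hypothetical, and returning `None` for SMILES that RDKit cannot parse is a choice made for this example.

```python
from typing import Optional

from rdkit import Chem


def to_kekule_smiles(smiles: str) -> Optional[str]:
    """Convert a SMILES string (aromatic or not) into Kekule form.

    Hypothetical helper: returns None when RDKit cannot parse the input.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # Replace aromatic bonds with explicit alternating single/double bonds
    # and clear the aromatic flags on atoms and bonds.
    Chem.Kekulize(mol, clearAromaticFlags=True)
    # Write the SMILES without lowercase aromatic atom symbols.
    return Chem.MolToSmiles(mol, kekuleSmiles=True)


if __name__ == "__main__":
    # Benzene in aromatic notation; the printed form uses explicit double bonds.
    print(to_kekule_smiles("c1ccccc1"))
```

Applying a helper like this to every input before handing the list to the classifier keeps the conversion in one place, and the `None` results flag inputs that should be checked by hand.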
Below is the description of Kekule and the Aromatic format of SMILES :
Current Status of the Model:
The PR for this model has already been merged and is ready to test.
I tried and tested this model on CLI but it fails to fetch.
Thanks @Femme-js . This seems to be related to a conda installation error.
I've tried to fetch the model both in my local computer and in a github actions workflow. I found errors as well, but conda installation worked.
I haven't solved the model yet, but please check some edits I've done: https://github.com/ersilia-os/eos24ci/commit/359fa36bfd596c48d67ce41f7178c7fe3fceffbd
Hi @Femme-js the model was fetched successfully in my device after a few changes I've made. Please inspect them.
Before closing the issue, let's:
- [x] Try to have your model working on your device.
- [x] Complete the README.md file.
Many thanks!
Hi @Femme-js
Can I close this issue?
Yes @GemmaTuron !
Hi @miquelduranfrigola !
I have been able to successfully test the model on my CLI. I am attaching the log file here. eos24ci3.log
Model Name
DrugTax
Model Description
DrugTax is a Python package for drug taxonomy identification and explainable feature extraction
Slug
drugtax
Tags
taxonomy classification, bulk analysis
Publication
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-022-00649-w
Code
License
GNU General Public License v3.0
https://github.com/MoreiraLAB/DrugTax/blob/main/LICENSE