Closed by russelljeffrey 10 months ago
Hi @russelljeffrey
As you might have read in our documentation, the tags are predetermined, so you need to choose from them; otherwise the model cannot be approved, as the checks will fail. Please check them out and modify the issue accordingly before we can proceed.
Hello @GemmaTuron. Thank you for mentioning that. I have read the docs before, and as shown here in an example, the tag appeared to be based on the name of the model (as I understood it). I have read most of the parts of the Ersilia GitBook regarding model contribution. Could you please provide a link (if one is available) so that I can understand how the tags are predefined? Thank you.
Hello @russelljeffrey
The slug is the model's short name, not the tag. The one you suggest feels too long, so I would only write smiles-tokenizer. All the information on how to fill in these fields can be found in the Ersilia GitBook under model contribution.
Thank you @GemmaTuron for your explanations. All of the required modifications will be made very soon.
Hi again @GemmaTuron. The slug and tag have been modified according to the link you provided.
Hello @russelljeffrey. The slug is the short name and the tags are the keywords that will help identify the model; they must come from the list provided in the above link. Please have a look and suggest a few that might fit.
Hi @GemmaTuron, the tag and slug have been corrected.
I have updated the fields according to the guidelines in the documentation (i.e., the description must be at least 200 characters, and the slug should not contain capital letters). I have removed the hERG tag; why did you choose it, @russelljeffrey? Before starting the model incorporation, can you confirm that you are able to run Ersilia models on your system?
Thanks
Hi @GemmaTuron. The reason I chose hERG as a tag was that the authors used hERG in the article as one of the benchmark datasets. Thank you for updating the descriptions. Yes, Ersilia can fully function on my system, since I use a dedicated GitHub Codespace.
/approve
@russelljeffrey the Ersilia model repository has been successfully created and is available at:
Now that your new model repository has been created, you are ready to start contributing to it!
Here are some brief starter steps for contributing to your new model repository:
Note: Many of the bullet points below will have extra links if this is your first time contributing to a GitHub repository
Edit the README.md file to accurately describe your model
If you have any questions, please feel free to open an issue and get support from the community!
Hi @russelljeffrey
Are you still active on this model? Let me know the status and add any comments here.
Hello @GemmaTuron. Yes, I'm still active, and I believe the implementation will be finished by the end of this week. The tokenizer implementation is complete, but testing is still needed, so I will test the model soon to make sure it's ready to be used.
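For reference, a quick sanity check after incorporation could use the Ersilia Python API. This is only a minimal sketch: it assumes the model has already been fetched with the Ersilia CLI, uses a placeholder identifier (the real eos identifier comes from the generated repository), and the exact `run()` signature may vary slightly between Ersilia versions.

```python
# Minimal sketch of a post-incorporation sanity check with the Ersilia Python API.
# Assumptions: the model has already been fetched with the Ersilia CLI, and
# "eosXXXX" is a placeholder for the real identifier of the new repository.
from ersilia import ErsiliaModel

model = ErsiliaModel("eosXXXX")  # hypothetical identifier, replace with the real one
model.serve()                    # start a local server for the model
model.run(input="CC[N+](C)(C)Cc1ccccc1Br", output="output.csv")  # tokenize one SMILES
model.close()                    # shut the server down
```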
This model is completed!
Model Name
SmilesPE: tokenizer algorithm for SMILES, DeepSMILES, and SELFIES
Model Description
The SMILES Pair Encoding method generates SMILES substring tokens based on high-frequency token pairs learned from large chemical datasets. The method is well suited to both QSAR modelling and generative models. The tokenizer provided here has been pretrained on ChEMBL.
Slug
smiles-pe
Tag
Chemical language model, Chemical notation, ChEMBL
Publication
https://pubs.acs.org/doi/abs/10.1021/acs.jcim.0c01127
Source Code
https://github.com/XinhaoLi74/SmilesPE
License
Apache-2.0
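For anyone who wants to try the underlying tokenizer directly, below is a minimal sketch based on the usage shown in the SmilesPE repository README. The local path SPE_ChEMBL.txt is an assumption: the pretrained ChEMBL vocabulary file must first be downloaded from the SmilesPE repository.

```python
# Minimal sketch of SMILES Pair Encoding tokenization with the SmilesPE package
# (pip install SmilesPE), following the examples in the repository README.
import codecs

from SmilesPE.pretokenizer import atomwise_tokenizer
from SmilesPE.tokenizer import SPE_Tokenizer

smi = "CC[N+](C)(C)Cc1ccccc1Br"

# Atom-level pre-tokenization: every atom, bond and bracketed group becomes a token.
print(atomwise_tokenizer(smi))

# SPE tokenization: frequent token pairs learned from ChEMBL are merged into
# longer substring tokens.
with codecs.open("SPE_ChEMBL.txt") as spe_vocab:  # assumed path to the pretrained vocabulary
    spe = SPE_Tokenizer(spe_vocab)
print(spe.tokenize(smi))
```

The atom-level tokenizer splits the SMILES into single atoms and symbols, while the SPE tokenizer merges frequent pairs into longer substrings, in line with the description above.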