Closed jseager7 closed 2 months ago
Hi @jseager7, this sounds like a good plan.
I can email you the latest anti-infective list, as it is not currently available on GitHub. Once I have finished checking and approving the block of chemistry curated papers, and have updated the anti-infective list with any further information, we can upload the updated copy to GitHub (including a new column for the SMILES strings).
I was also thinking about the ChEBI IDs. At the moment they are not displayed in PHI-base 5, so they may also need to be added via an AE.
The chemistry-specific AEs would then be: alteration_in_archetype, ChEBI_id, FRAC_code, SMILES.
Also see the past comment in the older ticket: https://github.com/PHI-base/curation/issues/20
Thanks. I can wait until the latest anti-infectives list is uploaded to GitHub.
It will be easy enough to include the ChEBI term ID as an annotation extension, especially because I'm already going to be querying ChEBI to get the SMILES strings.
Hi @jseager7, has this been done now? Shall I close the ticket?
Thanks for the reminder. I had to go back and check, but it looks like I added this functionality to the pipeline a few months ago. We haven't made any PHI-base 5 releases with the new extensions yet though. I'll close this issue anyway.
It was recently requested that we include SMILES strings for chemistry annotations in PHI-base.
For example, the SMILES string for deoxynivalenol (CHEBI:10022) is:
[H][C@@]12O[C@]3([H])C=C(C)C(=O)[C@@H](O)[C@]3(CO)[C@@](C)(C[C@H]1O)[C@]21CO1
We can retrieve these strings from the ChEBI ontology (from the 'smiles' annotation property), so there should be no need to curate them manually, but we do need to decide how they should be displayed.
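As a rough illustration of the retrieval step, here is a minimal sketch that pulls SMILES strings out of OBO-formatted text. It assumes each ChEBI term stanza carries the SMILES string on a `property_value` line whose property name ends in `smiles`; the exact property IRI and line layout in the real ChEBI file should be checked before relying on this. The sample text, the second term ID, and the function name `extract_smiles` are hypothetical.

```python
import re

# Assumed stanza layout; the property IRI below is an assumption, and
# CHEBI:00000 is a made-up term used only to show a missing SMILES string.
SAMPLE_OBO = '''\
[Term]
id: CHEBI:10022
name: deoxynivalenol
property_value: http://purl.obolibrary.org/obo/chebi/smiles "[H][C@@]12O[C@]3([H])C=C(C)C(=O)[C@@H](O)[C@]3(CO)[C@@](C)(C[C@H]1O)[C@]21CO1" xsd:string

[Term]
id: CHEBI:00000
name: hypothetical term without a SMILES string
'''

# Matches a property_value line whose property name ends in "smiles"
# and captures the quoted value.
SMILES_LINE = re.compile(r'^property_value: \S*smiles "(?P<smiles>[^"]*)"')


def extract_smiles(obo_text):
    """Return a dict mapping term IDs to SMILES strings found in OBO text."""
    smiles_by_id = {}
    current_id = None
    for raw_line in obo_text.splitlines():
        line = raw_line.strip()
        if line.startswith('[') and line.endswith(']'):
            # A new stanza starts, so forget the previous term ID.
            current_id = None
        elif line.startswith('id: '):
            current_id = line[len('id: '):]
        else:
            match = SMILES_LINE.match(line)
            if match and current_id is not None:
                smiles_by_id[current_id] = match.group('smiles')
    return smiles_by_id


smiles_by_id = extract_smiles(SAMPLE_OBO)
```

Terms without a SMILES line are simply absent from the result, so the caller can decide how to handle chemicals that have no structure in ChEBI.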
The easiest solution is to add an annotation extension for the SMILES string which will be automatically generated for every chemistry annotation (resistance / sensitivity / normal) and added to the JSON export. We decided to do something similar for adding FRAC codes. The main benefit of this is that it doesn't (or shouldn't) require any changes to the logic of the PHI-base 5 website.
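To make the idea concrete, here is a sketch of how a SMILES extension could be appended to a chemistry annotation before export. The annotation layout (`chebi_id`, `extension`, `relation`, `rangeValue`) and the relation name `smiles` are assumptions for illustration only, not the actual PHI-base 5 JSON schema, and `add_smiles_extension` is a hypothetical helper.

```python
import json


def add_smiles_extension(annotation, smiles_by_chebi_id):
    """Return a copy of the annotation with a SMILES extension appended.

    The field names used here are hypothetical. If no SMILES string is
    known for the annotation's chemical, the annotation is returned as-is.
    """
    smiles = smiles_by_chebi_id.get(annotation.get('chebi_id'))
    if smiles is None:
        return annotation
    updated = dict(annotation)
    # Build a new list so the original annotation is not mutated.
    updated['extension'] = list(annotation.get('extension', [])) + [
        {'relation': 'smiles', 'rangeValue': smiles}
    ]
    return updated


# Hypothetical annotation and lookup table for demonstration.
example_annotation = {
    'phenotype': 'resistance to chemical',
    'chebi_id': 'CHEBI:10022',
    'extension': [],
}
example_smiles = {
    'CHEBI:10022': '[H][C@@]12O[C@]3([H])C=C(C)C(=O)[C@@H](O)[C@]3(CO)[C@@](C)(C[C@H]1O)[C@]21CO1',
}

print(json.dumps(add_smiles_extension(example_annotation, example_smiles), indent=2))
```

Because the extension is generated from a lookup table, nothing needs to be curated by hand, and the website only has to display one more extension.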
We may also want to add the SMILES strings to the anti-infective list on the PHI-base/data repository.
My first task is figuring out how to automatically extract the SMILES strings from the ChEBI ontology. I can probably make use of the existing anti-infectives list to get the full list of ChEBI terms that we need to map.
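Here is a sketch of how the anti-infectives list could drive the mapping, assuming the list is a CSV file with one column of ChEBI IDs. The column names `compound` and `chebi_id`, the sample rows, and the function name `add_smiles_column` are all hypothetical; the real list may be laid out differently.

```python
import csv
import io


def add_smiles_column(csv_text, smiles_by_id, id_column='chebi_id'):
    """Return the CSV text with a 'smiles' column appended.

    Rows whose ChEBI ID has no known SMILES string (or no ID at all)
    get an empty value in the new column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ['smiles']
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    for row in reader:
        row['smiles'] = smiles_by_id.get(row[id_column], '')
        writer.writerow(row)
    return output.getvalue()


# Hypothetical extract of the list; the second row has no ChEBI ID.
sample_list = (
    'compound,chebi_id\n'
    'deoxynivalenol,CHEBI:10022\n'
    'example compound,\n'
)
sample_smiles = {
    'CHEBI:10022': '[H][C@@]12O[C@]3([H])C=C(C)C(=O)[C@@H](O)[C@]3(CO)[C@@](C)(C[C@H]1O)[C@]21CO1',
}

updated_list = add_smiles_column(sample_list, sample_smiles)
```

Leaving the column empty for unmapped rows makes it easy to spot which chemicals still need a ChEBI ID.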
@CuzickA Does this plan sound okay to you?