SynBioDex / SBOL-visual

The reference implementation of the SBOL Visual standard

Ontology needs updating to track new SEPs #15

Closed: jakebeal closed this issue 4 years ago

jakebeal commented 6 years ago

The ontology files are out of date and need updating. We also need to have a README in the ontology directory that explains what all of the ontology files are for and what they mean.

jakebeal commented 6 years ago

Key information to be made machine-readable: symbol linkage to SO term(s).

dissys commented 5 years ago

I looked at the terms in these files. There are 27 SBOL terms defined: 21 of them refer to glyphs and the remaining 6 are SBOL 1.0 terms. The terms for glyphs are linked to SO terms. I suggest recapturing these relationships using the SBOL data model, if possible, and removing any SBOL-related terms from SBOL-Visual.

SBOL-Visual related terms:

id: SBOL:0000001 name: promoter symbol

id: SBOL:0000002 name: assembly scar symbol

id: SBOL:0000003 name: operator symbol

id: SBOL:0000004 name: coding sequence symbol

id: SBOL:0000005 name: ribosome entry site symbol

id: SBOL:0000006 name: terminator symbol

id: SBOL:0000007 name: insulator symbol

id: SBOL:0000008 name: ribonuclease site symbol

id: SBOL:0000009 name: rna stability element symbol

id: SBOL:0000010 name: protease site symbol

id: SBOL:0000011 name: protein stability element symbol

id: SBOL:0000012 name: origin of replication symbol

id: SBOL:0000013 name: primer binding site symbol

id: SBOL:0000014 name: restriction enzyme recognition site symbol

id: SBOL:0000015 name: blunt restriction site symbol

id: SBOL:0000016 name: 5' sticky restriction site symbol

id: SBOL:0000017 name: 3' sticky restriction site symbol

id: SBOL:0000018 name: 5' overhang symbol

id: SBOL:0000019 name: 3' overhang symbol

id: SBOL:0000020 name: signature symbol

id: SBOL:0000021 name: user defined symbol

SBOL 1.0 Terms:

id: SBOL:0000022 name: SBOL:Core Term

id: SBOL:0000023 name: DNAComponent

id: SBOL:0000024 name: Collection

id: SBOL:0000025 name: DNASequence

id: SBOL:0000026 name: SequenceAnnotation

id: SBOL:0000027 name: type

cjmyers commented 5 years ago

@jakebeal has this been done for 2.1?

jakebeal commented 5 years ago

There is a draft in @dissys's project: https://github.com/dissys/sbol-visual-ontology, but it has not yet been incorporated.

goksel commented 5 years ago

The conversion uses Markdown files. Please let me know when the files are updated with the new terms and I can run the conversion again. There are currently four issues to be fixed in order to fully convert Markdown files into the SBOL Visual Ontology:

Issues remaining:

Issue 1: Complex. Can we change the order of the glyphs in https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/FunctionalComponents/complex so that it starts with the generic glyph followed by three recommendations, similar to location (https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/location) and cleavage-site (https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/cleavage-site)? The conversion would then also incorporate the glyphs for "Complex" automatically.

Secondly, it is difficult to parse the Markdown files when there are multiple recommended terms. Files generally give the order and types of these terms using the text "in order:" followed by the different types. Examples can be found in the location and cleavage-site files. Can I change the file for the complex as below so that I can create identifiers for these terms? The change is indicated in bold.

The RECOMMENDED glyph for a complex is a composite of the glyphs for the molecules of comprising the complex. For example, a protein bound to a small molecule, a guide RNA, or another protein **(in order: protein-small molecules, protein-guide RNA, protein-protein)**:

Issue 2: Stop site. The conversion assumes that the number of rows of ontology terms represents the number of recommended terms. The Markdown content of stop-site has three rows, which I assume was unintentional. I will combine the two rows below into one. Please confirm.

SO:0000319 Stop Codon; SO:0000327 Coding End, Translation Termination Site, Translation End

Issue 3: Five-prime-overhang. Can we use a single SO term for the 5' overhang site? At the moment, this Markdown file is not converted, since it is about the 5' overhang site but at the same time describes its opposite. Alternatively, how about using only the parent term SO:0001695 (restriction_enzyme_single_strand_overhang)? In SBOL Visual, none of the other glyphs provide any information about directionality, and we all assume that the default direction is from 5' to 3': https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/five-prime-overhang

The same issue also applies to the five-prime-sticky-restriction-site glyph. We can remove the 3' SO term. Alternatively, we can replace the existing terms with the parent term SO:000169 (sticky_end_restriction_enzyme_cleavage_site): https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/five-prime-sticky-restriction-site

Issue 4: stop-site. This is a minor issue. Identifiers are created using the Markdown headers. I suggest changing the label of stop-site (https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/stop-site) from "Stop Site (Transcrition/Translation End Point)" to "Stop Site". Alternatively, the typo can be corrected and I can remove non-alphanumeric characters while creating the identifiers.
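To illustrate the second option in Issue 4, here is a minimal Python sketch of deriving an identifier from a Markdown header by dropping any parenthetical and stripping non-alphanumeric characters. The function name `header_to_identifier` and the CamelCase output convention are assumptions made for illustration; the actual conversion code may work differently.

```python
import re


def header_to_identifier(header: str) -> str:
    """Turn a Markdown header into an identifier (hypothetical helper)."""
    # Drop leading '#' markers and surrounding whitespace.
    text = header.lstrip("#").strip()
    # Remove any parenthetical, e.g. "(Transcrition/Translation End Point)".
    text = re.sub(r"\s*\([^)]*\)", "", text)
    # Split on runs of non-alphanumeric characters and join in CamelCase.
    words = [w for w in re.split(r"[^0-9A-Za-z]+", text) if w]
    return "".join(w[0].upper() + w[1:] for w in words)


print(header_to_identifier("# Stop Site (Transcrition/Translation End Point)"))
print(header_to_identifier("# Stop Site"))
```

Under these assumptions both labels yield the same identifier, so a typo inside the parenthetical would not leak into the generated term.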

goksel commented 5 years ago

Issue 5: The external ontology terms should be listed in the same order in which the recommended glyphs are listed. This is necessary for the automated creation of the ontology. E.g. https://github.com/SynBioDex/SBOL-visual/tree/master/Glyphs/cleavage-site

SO terms for this glyph should be listed as follows:

SO:0001688 (Restriction Enzyme Cleavage Junction), SO:0001687 (Restriction Enzyme Recognition Site)

SO:0001977 (Ribonuclease Site)

SO:0001956 (Protease Site)
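The positional assumption behind this request can be sketched in Python as follows: each row of ontology terms is matched, by position, to one recommended glyph variant, and a row-count mismatch is reported as an error. This is only an illustration; the function name `map_rows_to_variants` and the variant labels are hypothetical, and the real conversion script may parse the files differently.

```python
import re

# Sequence Ontology identifiers have the form "SO:" plus seven digits.
SO_ID = re.compile(r"SO:\d{7}")


def map_rows_to_variants(rows, variants):
    """Pair each row of SO terms with the glyph variant at the same position."""
    if len(rows) != len(variants):
        # A mismatch here means rows and recommended glyphs are out of step.
        raise ValueError(
            f"{len(rows)} term rows but {len(variants)} recommended glyphs"
        )
    return {variant: SO_ID.findall(row) for variant, row in zip(variants, rows)}


rows = [
    "SO:0001688 (Restriction Enzyme Cleavage Junction), "
    "SO:0001687 (Restriction Enzyme Recognition Site)",
    "SO:0001977 (Ribonuclease Site)",
    "SO:0001956 (Protease Site)",
]
# Hypothetical variant labels, listed in the same order as the glyphs.
variants = ["restriction-site", "ribonuclease-site", "protease-site"]

for variant, terms in map_rows_to_variants(rows, variants).items():
    print(variant, terms)
```

Because the pairing is purely positional, reordering either the rows or the glyphs silently changes the mapping, which is why a fixed order matters for automated conversion.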

jakebeal commented 5 years ago

Recording decisions from Jake & Goksel discussion at IWBDA:

  1. Complex will be handled as an exception case, because it doesn't really have a meaningful generic. For the secondary concern: Goksel will add an "(in order, ...)" statement.
  2. The two rows are being combined in the source, so that the source has the same number of lines as the rendering.
  3. five-prime-overhang will become overhang, with subtypes for 5' and 3', without a base glyph, just like we do for location glyphs. We will also make the specification variants for 3' as well as 5'. We will do the same thing for five-prime-sticky-restriction-site.
  4. Typo is fixed; the parentheses will not stay part of the term.
  5. The stem-top glyphs should always be ordered DNA, RNA, Protein. They need to be changed to explicitly say, "DNA: SO:xxxx", "RNA: SO:xxxx", "Protein:xxx".

jakebeal commented 5 years ago

@goksel How close are we to completion on this?

goksel commented 5 years ago

Adding another issue: Issue 6: Terms for the stability elements are not ordered correctly. The order should follow DNA, RNA, Protein.

goksel commented 5 years ago

@jakebeal I just made a pull request, fixing these issues: https://github.com/SynBioDex/SBOL-visual/pull/72

Below, I copied the text from the pull request for future reference.

I updated the readme files to fix the issues listed here: https://github.com/SynBioDex/SBOL-visual/issues/15

I believe all six issues are now fixed.

EXCEPT: The actual PNG glyphs for 3' overhang and sticky restriction sites STILL need to be created. The readme files were created correctly using the proper file names. Once the PNG files are replaced, this issue can be closed.

Four images are needed in total, for single- and double-stranded DNA: two for the overhang term and two for the restriction site term.

@jakebeal would you be able to create them?

jakebeal commented 5 years ago

@goksel I've got a crazy busy day today --- whichever of us gets to it first?

jakebeal commented 5 years ago

I believe this is all fixed and merged into branch https://github.com/SynBioDex/SBOL-visual/tree/ontology-update

Are your ontology files now ready to be merged in, replacing the old ontology material?

goksel commented 4 years ago

Ontology files are ready. We can perhaps discuss where to put them. Do you want me to copy only the ontology file? We still need to produce the four images I mentioned. The SVG tool I tried did not work well on my Mac and I could not generate those glyphs.

jakebeal commented 4 years ago

There is an Ontology directory that currently contains the very obsolete 1.0 material. That would seem a logical place to put the new material. As for what to put in, my goal is to have the SBOL-visual repository contain everything that is needed for maintenance of the standard, so not just the ontology file but whatever support is needed for generating updates of the ontology.

I'm a bit confused regarding the glyphs: I saw the 3' overhang glyphs, and have added the 3' sticky glyphs. Are these not the ones you're looking for?

goksel commented 4 years ago

I added the images as placeholders for now. They are not reflections: I created them by rotating the 5' images by 180 degrees. Shouldn't the space on the 3' end be on the top left? When I rotate them, it ends up on the top right!

The ontology-related files include both the ontology and the scripts to generate it. If we also use the SynBioDex/SBOL-visual repository for the code, it may be difficult to manage: issue management and release management will be different for code and for the spec. I suggest moving the code to a different repository, but the final decision is yours. Shall we discuss this briefly in today's SBOL 3 working group meeting?

jakebeal commented 4 years ago

Recording results of discussion: images are fixed, we'll start with just the ontology, and add the README generator script when it's available. The scripts for generating ontology from README are transient only.

jakebeal commented 4 years ago

@goksel Will we get the rest of the update soon?

goksel commented 4 years ago

@jakebeal, yes, hopefully soon. I updated the HTML files today and then started updating the HTML version to display the actual glyphs. I will submit the updates, depending on how this last task goes.

goksel commented 4 years ago

@jakebeal I just made a pull request. Please approve the merge, if you are happy. I am including the description of the pull request also here so that we have a record.

Added the ontology files for both RDF and HTML versions under the Ontology/v2 folder. We may consider changing the folder structure and perhaps creating a v1 folder for the previous ontology files. The image paths are hardcoded to point to the master branch, which is why some of the images (the recent ones that are not yet in master) are not displayed when the HTML page is viewed under the ontology-update branch. This will be fixed when the ontology-update branch is merged into master.

I also updated the simple chemical README file so that I could generate three recommended terms.

We can also consider adding GitHub Pages to the project so that we can serve the HTML file directly, although the HTML view for the full GitHub page would only be for the master branch. An example of how this would work is here: https://dissys.github.io/sbol-visual-ontology/sbol-vo.html

For future reference, just in case, the source project to convert the ontology from the markdown version is at https://github.com/dissys/sbol-visual-ontology and is publicly available.

cjmyers commented 4 years ago

This is looking good. By the way, I finally began the transfer of sbols.org (http://sbols.org/) to our control. We should be able to redirect from there soon. Note that sbolstandard is now on a Digital Ocean Droplet, so we can easily host the ontologies there directly, if you like. I believe you just need to give your public key to Zach to get access.

(In reply to Goksel Misirli's comment of Oct 7, 2019, 3:25 PM, shown above.)

goksel commented 4 years ago

Fixed an issue regarding the mapping of Complex glyphs in the ontology. The pull request also includes this update. Chris, Zach created an account for me, but I have not been able to test it yet. Hopefully, I will do it soon.

jakebeal commented 4 years ago

Thank you, @goksel; I have made a couple of small requests before finalizing the merge.

goksel commented 4 years ago

I have removed the SBOL Visual ontology v1 files. The pull request includes all these changes.

I am not sure what the best way to test the ontology is: opening a new issue or carrying on with this discussion thread. @jakebeal, do you have a suggestion? Shall we open a new issue?

The HTML view offers an easy way to browse the different properties of the ontology. After the pull request is merged, the HTML file can be found at SBOL-visual/blob/ontology-update/Ontology/v2/sbol-vo.html. The actual ontology (SBOL-visual/blob/ontology-update/Ontology/v2/sbol-vo.rdf) can be viewed using Protege (https://protege.stanford.edu/products.php).

I set up the HTML view for the GitHub repository. This means we can use HTML to view individual glyph files in the "master" branch: http://synbiodex.github.io/SBOL-visual/Glyphs/aptamer/

After the pull request is merged, we can also serve the ontology's HTML version directly using http://synbiodex.github.io/SBOL-visual/Ontology/v2/sbol-vo.html.

jakebeal commented 4 years ago

Thanks, Goksel; I will review the updated pull for merge.

Do please open an issue for ontology testing --- I do want to have some sort of automated validation at least that the ontology is valid and able to generate correctly.
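As a possible starting point for such a check, the sketch below uses only the Python standard library to confirm that an RDF/XML document is well-formed, has an rdf:RDF root, and declares at least one owl:Class. It assumes the ontology is serialized as RDF/XML using owl:Class elements; the function name `count_owl_classes` and the inline sample are hypothetical, and full validation would need an OWL-aware tool.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def count_owl_classes(rdf_xml: str) -> int:
    """Parse RDF/XML text and return the number of owl:Class elements.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(rdf_xml)
    if root.tag != f"{{{RDF_NS}}}RDF":
        raise ValueError("root element is not rdf:RDF")
    return len(root.findall(f".//{{{OWL_NS}}}Class"))


# Tiny hypothetical sample standing in for the real ontology file.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://example.org/vo#Glyph"/>
  <owl:Class rdf:about="http://example.org/vo#PromoterGlyph"/>
</rdf:RDF>"""

# Fail loudly if the document parses but contains no classes at all.
assert count_owl_classes(SAMPLE) >= 1
print(count_owl_classes(SAMPLE))
```

A check like this only catches gross breakage (malformed XML or an empty ontology); it says nothing about whether the terms and their SO mappings are correct.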

goksel commented 4 years ago

I created a new issue to test the ontology: https://github.com/SynBioDex/SBOL-visual/issues/76

Let's close issue 15 when the pull request is merged.

jakebeal commented 4 years ago

I merged the pull request yesterday, so am closing now.