SynBioDex / SBOL-visual

The reference implementation of the SBOL Visual standard

SEP V005: Ambiguities and Variants #9

Closed by jakebeal 5 years ago

jakebeal commented 6 years ago

SEP V005: Ambiguities and Variants

SEP:
Authors: Jacob Beal (jakebeal@ieee.org)
Editor:
Type: Specification
SBOL Visual Version: 1.1
Status: Draft
Created: 30-Aug-2017
Last modified: 17-Sep-2017

Abstract

A number of potential variants of existing glyphs have been proposed, and we need to put them to an up-or-down vote. We also need to clarify the position and/or interior of some of the existing glyphs.

There are ten glyphs potentially affected by these proposals:

1. Rationale

Each glyph variant detailed below in its specification has been provided with an individual rationale for that glyph. Examples are also embedded within each proposal.

2. Specification

Assembly Scar

The assembly scar glyph is an "equal sign" image, the pattern produced by the union of a 5' sticky end and 3' sticky end glyph. The scar will cover the backbone, creating a visual break suggesting the potential disruption associated with a scar:

[glyph specification] [glyph specification]
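As an illustration only (not part of the SEP), the "visual break" described above can be sketched by drawing the backbone in two pieces and placing the two strokes of the equal sign over the gap. The function name `assembly_scar_svg` and all dimensions are hypothetical choices made for this sketch; the SEP does not prescribe any of them.

```python
def assembly_scar_svg(width=60, height=40, gap=8, scar_len=20):
    """Return an SVG string: a backbone with a break covered by an assembly scar.

    All sizes are illustrative assumptions, not values from the SEP.
    """
    mid_y = height / 2                 # y of the backbone
    x0 = (width - scar_len) / 2        # left edge of the scar
    x1 = x0 + scar_len                 # right edge of the scar
    top, bottom = mid_y - gap / 2, mid_y + gap / 2
    lines = [
        # Backbone drawn in two pieces, leaving a break where the scar sits.
        (0, mid_y, x0, mid_y),
        (x1, mid_y, width, mid_y),
        # The two strokes of the "equal sign".
        (x0, top, x1, top),
        (x0, bottom, x1, bottom),
    ]
    body = "".join(
        f'<line x1="{a:g}" y1="{b:g}" x2="{c:g}" y2="{d:g}" stroke="black"/>'
        for a, b, c, d in lines
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">{body}</svg>'
    )
```

Writing the returned string to a `.svg` file and opening it in a browser should show the backbone interrupted between the two scar strokes, rather than a third line running through the middle of the glyph.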

CDS

The coding sequence glyph is a "box" with one side bent out arrow-like to show direction. Its recommended backbone alignment is to the middle:

glyph specification

A block arrow variant is already commonly used in diagrams:

glyph specification

Its recommended alignment will also be to the middle.

Restriction Enzyme Recognition Site (Cleavage Site)

Recommended backbone alignment is centered on backbone:

glyph specification

5' Overhang Site

The 5' overhang site glyph is an image of a strand of DNA extended on the 5' edge of its forward strand:

glyph specification

With a double-stranded backbone:

glyph specification

5' Sticky Restriction Site

The 5' sticky restriction site glyph is an image of the lines along which two strands of DNA will be cut into 5' sticky ends. Its vertical position is between the two strands of a double backbone, or in a break in a single backbone:

[glyph specification] [glyph specification]

Insulator

The insulator glyph is a box inside another box that isolates it from its environment. Its interior is only the inner box:

glyph specification

The backbone will be positioned at the bottom of the glyph, as insulators are often used with respect to a construct associated with a particular strand (e.g., a promoter):

glyph specification

Operator

The operator glyph will be replaced by an open "cup" as in the binding sites of the proposed protein language:

glyph specification

Origin of Replication

The origin of replication glyph is a circle suggesting the "bulge" opened in a piece of circular DNA when replication is beginning:

glyph specification

User Defined

The user defined component glyph is a plain rectangle. The backbone is RECOMMENDED to be placed at the bottom:

glyph specification
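For rendering software, a recommended backbone alignment translates into a vertical offset for the glyph's bounding box. The sketch below is a minimal illustration covering only three of the glyphs above whose alignment is stated outright (CDS, Restriction Enzyme Recognition Site, User Defined). The names `Alignment`, `RECOMMENDED`, and `glyph_top_y` are hypothetical, and the SVG-style coordinate convention (y grows downward) is an assumption.

```python
from enum import Enum


class Alignment(Enum):
    MIDDLE = "middle"  # backbone passes through the glyph's vertical center
    BOTTOM = "bottom"  # backbone runs along the glyph's bottom edge


# Recommended alignments as stated in the specification section above.
RECOMMENDED = {
    "CDS": Alignment.MIDDLE,
    "Restriction Enzyme Recognition Site": Alignment.MIDDLE,
    "User Defined": Alignment.BOTTOM,
}


def glyph_top_y(glyph, backbone_y, glyph_height):
    """Return the y of the glyph's top edge for its recommended alignment.

    Assumes SVG-style coordinates, where y increases downward.
    """
    alignment = RECOMMENDED[glyph]
    if alignment is Alignment.MIDDLE:
        return backbone_y - glyph_height / 2
    return backbone_y - glyph_height  # BOTTOM: glyph sits on top of the backbone
```

Because the alignments are only RECOMMENDED, a renderer built this way would presumably let callers override the table entry for a given diagram.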

3. Examples

See examples in individual glyph proposals.

4. Backwards Compatibility

All proposals either provide clarity on existing ambiguous glyphs or else propose new non-conflicting variants.

5. Discussion

The following proposed options have been considered, but do not have strong support and are thus being removed from consideration unless they pick up significant advocacy. They may be revisited in the future.

Assembly Scar

Assembly Scar might be on either side of, or above, the backbone:

glyph specification

glyph specification

CDS

CDS backbone alignment might be to the middle:

glyph specification

A number of variants have been proposed; their alignment will match that of CDS except when otherwise noted.

User Defined rectangle:

glyph specification

Other alternatives include a chevron and asymmetric "halved" versions of the current CDS or block arrow:

[glyph specification] [glyph specification] [glyph specification]

Restriction Enzyme Recognition Site (Cleavage Site)

Site on top of backbone:

glyph specification

5' Sticky Restriction Site

Vertical position with respect to the backbone might be above the backbone:

glyph specification

Insulator

The insulator's fill might instead cover no interior, only the outer box, or both boxes:

[glyph specification] [glyph specification] [glyph specification]

The position of the backbone might also be centered, or hovering below:

[glyph specification] [glyph specification]

Two possible alternate glyphs have also been proposed:

glyph specification

glyph specification

Operator

The operator glyph was a box marking a place:

glyph specification

The glyph is proposed to be generalized to Binding Site, which also suggests it might be an open "cup" as in the binding sites of the proposed protein language. The bottom and hover options for alignment do not currently have support:

[glyph specification] [glyph specification]

Its recommended backbone alignment might be middle, bottom, or hovering above:

[glyph specification] [glyph specification] [glyph specification]

The notion of binding site might also simply be indicated by generalizing Restriction Enzyme Recognition Site to simply be a generic Recognition Site:

[glyph specification] [glyph specification]

Origin of Replication

The origin of replication might also be above the backbone:

glyph specification

Terminator

A number of variants have been proposed. Some add asymmetry by:

[glyph specification] [glyph specification] [glyph specification]

Other variants make function more symbolic by:

glyph specification

[glyph specification] [glyph specification] [glyph specification]

User Defined

The user defined component might be aligned at the middle, or hovering under the glyphs:

[glyph specification] [glyph specification]

Copyright

CC0
To the extent possible under law, SBOL developers have waived all copyright and related or neighboring rights to SEP V005. This work is published from: United States.

jakebeal commented 6 years ago

My initial take:

cjmyers commented 6 years ago

The touching/symmetric on backbone question is complicated by the fact that the backbone can often look like a part of the glyph, since the line may be the same thickness as lines in the glyph (we should perhaps render them like this in the SEP to make this point clearer). This means that the scar actually looks like three lines, which is very strange looking. One solution might be that the area between the two lines of the scar is assumed to be filled. This would then cover up the backbone, making it look better. If we do something like this, I would be fine with scar, restriction site, overhang, and sticky end all being symmetric on the backbone.

I'm not really sure I like CDS and Operator touching the backbone rather than symmetric. For Operator in particular, if we go with the cup, then we have the bottom line blending in with the backbone, so it will really look like two ends sticking up (or down). If it is symmetric on the strand with a fill, I think it would look pretty good. CDS I could really go either way, but I think it is fine to allow it either touching or symmetric.

Agree we need CDS block arrow option.

My biggest comment is that I would like User Defined to be repurposed to Engineered Region, then have User Defined be truly user defined. Namely, if there is No Glyph Assigned, then one should come up with their own non-conflicting glyph. We suggest a new User Defined default glyph for rendering software, but it should be a glyph that would certainly never need to be repurposed to support another type of feature. Something akin to the diamond/question-mark proposed in V003 for unspecified.

swapnilb commented 6 years ago

@cjmyers regarding scars: pigeon does it "front of," but filled works too I suppose.

image

jamesamcl commented 6 years ago

Regarding the hairpin loop. I think a version of the glyph on the right that's wider and less tall would look better.

swapnilb commented 6 years ago

Regarding the Origin of Replication, I believe the symbol suggested may have been something like this:

image

It's not a circle, but an annular shape. Given these are all origins of x, it should relate to the Origin of Transfer that we just proposed, which looks like this:

image

jakebeal commented 6 years ago

@cjmyers

One solution might be that the area between the two lines of the scar is assumed to be filled. [snip]

Following this thought, I've now put in a version that has it not "filled" but "empty", specifically putting a break in the backbone, much like the proposed "broken backbone" version of sticky restriction site. This is consistent with how @swapnilb showed Pigeon illustrating overhang sites as well.

I'm not really sure I like CDS and Operator touching the backbone rather than symmetric.

I see your point on operator, and am OK with it being symmetric. That would also be consistent with its proposed usage in protein language. CDS, on the other hand, is a "large" glyph and I find it much more compact and interpretable to have it "up" on the same side as the promoter it is typically joined with.

Remember, these are only RECOMMENDED relations, so if somebody has good reason to make a different positioning choice within the bounding box, it is always allowed.

My biggest comment is that I would like User Defined to be repurposed to Engineered Region, then have User Defined be truly user defined. [snip]

This will certainly be a point of discussion for the follow-on for SEP V003, but that should be a new discussion in a separate SEP. This SEP is only considering the positioning of the box glyph (whatever it might mean) with respect to the backbone.

jakebeal commented 6 years ago

@udp

Regarding the hairpin loop. I think a version of the glyph on the right that's wider and less tall would look better.

I'm unclear: are you responding to this as a proposal for a hairpin loop, or as a proposal for "terminator"? This is not a proposal for a hairpin loop glyph, but an alternative terminator glyph proposal.

jakebeal commented 6 years ago

@swapnilb

It's not a circle, but an annular shape.

Under the styling rules of SBOLv, for purposes of these glyphs these two notions are indistinguishable.

Given these are all origins of x, it should relate to the Origin of Transfer that we just proposed

I agree that these two should have the same vertical positioning. Do you think that positioning should be symmetric with or above the backbone?

cjmyers commented 6 years ago

My biggest comment is that I would like User Defined to be repurposed to Engineered Region, then have User Defined be truly user defined. [snip]

This will certainly be a point of discussion for the follow-on for SEP V003, but that should be a new discussion in a separate SEP. This SEP is only considering the positioning of the box glyph (whatever it might mean) with respect to the backbone.

Ok, I agree it should be touching the line, and we can discuss what it is used for later.

Chris

swapnilb commented 6 years ago

@jakebeal As to positioning, I think it should be left to users, with the explanation of what position is generally intended to convey, given in the proposal.

I don't see what you mean by annular ring is same as circle. Is/Isn't fill part of the definition? I am saying that the Origin of X (Replication and Transfer as of now) are/should be both annular rings, meaning that fill is restricted to annular ring. The disk circumscribed is not filled. Of course, users may obfuscate the removed disk by coloring it the same as the ring.

jakebeal commented 6 years ago

@swapnilb Actually, per the adopted SEP V001, we need to recommend some preferred position:

Every glyph for representing a Component MUST have a grey horizontal line indicating the RECOMMENDED vertical positioning of the glyph on a nucleic acid backbone.

Users are, of course, free to ignore the recommendation if they have good reason to do so.

On the difference between circle and annular ring: I do see how fill distinguishes them. The current proposal for ORI-T is for a circle-with-arrow, which would match the current version of ORI. That also matches the language of the original proposal in the linked thread, as well as the literature version linked from the thread. This does not, of course, preclude proposing to change these glyphs from circle to ring, but I would want to hear what the argument is in favor of doing so.

swapnilb commented 6 years ago

The current proposal for ORI-T is for a circle-with-arrow, which would match the current version of ORI. That also matches the language of the original proposal in the linked thread, as well as the literature version linked from the thread.

OK, that's correct. (But annular ring and circle are not indistinguishable under styling conventions.)

Every glyph for representing a Component MUST have a grey horizontal line indicating the RECOMMENDED vertical positioning of the glyph on a nucleic acid backbone.

Does this rule out having two recommendations?

jakebeal commented 6 years ago

Yes: there is precisely one RECOMMENDED vertical position.

swapnilb commented 6 years ago

@jakebeal Thanks. Would appreciate if you'd point me to where it says that.

jakebeal commented 6 years ago

In the statement:

Every glyph for representing a Component MUST have a grey horizontal line indicating the RECOMMENDED vertical positioning of the glyph on a nucleic acid backbone.

It is not "one or more horizontal lines" but instead is "a" horizontal line, meaning one.

swapnilb commented 6 years ago

Thanks.

The article "a" does not rule out more than one. It simply says at least one. ("x*x - 2 = 0 has a solution.") If you'd like to have exactly one, the spec should say "a single" or "exactly one" or some such.

So I propose we recommend both positions.

jakebeal commented 6 years ago

The intent was always to be precisely one. As that is apparently not clear, I will change "a" to be "exactly one".

swapnilb commented 6 years ago

I don't agree with that intent. Others might not too. Are you allowed to edit the spec willy nilly?

jakebeal commented 6 years ago

That's what was discussed at HARMONY, and we had a long discussion about this question there. The issues with multiple recommendations are:

  1. If you recommend multiple positions, that doesn't actually break the ambiguity that the recommendation is intended to break.
  2. In nearly every case, there are really only two positions that people have seriously considered. If the recommendation is not for exactly one position, it doesn't help.

Furthermore, to your statement:

"The article "a" does not rule out more than one."

Actually, it does: "a" is a singular article. If more than one was allowed, then the spec would say: "at least one."

swapnilb commented 6 years ago

Actually, it does: "a" is a singular article. If more than one was allowed, then the spec would say: "at least one."

Actually, it doesn't; see my example above: "x^2 - 2 = 0 has a solution." "There MUST be a RECOMMENDED route." As is clear, ordinarily, "a" means "one or more." If you'd like "a" to mean "exactly one", you should define these things in the spec preamble.

When there is need to narrow the meaning of ordinary words, it is a good practice to use more specific, explicitly constraining language. When no constraints are intended, no additional language need be used, as the meaning is ordinarily clear. This is standard practice in writing specs.

Regarding the issue at hand; it is not always possible to break ambiguity and provide a useful recommendation grounded in any thoughtful analysis. By forcing a recommendation, it makes the spec less robust, not more. Therefore, it should be possible to NOT provide a recommendation, also a standard option in most formal documents.

Therefore, I propose that it be possible to NOT recommend a position, and that we do so for the OriT glyph.

jakebeal commented 6 years ago

@swapnilb Please feel free to open a new SEP and new discussion on the list to propose changing the cardinality of backbone position recommendation from SEP V001's specification of "exactly one" to a modified cardinality of "zero or more."

For now, however, my understanding, based on the community's prior discussions, is that the cardinality of backbone position recommendation, as specified in SEP V001, is "exactly one," and I propose to continue discussion of this SEP based on that understanding.
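To make the cardinality question concrete: under the "exactly one" reading, a glyph definition with zero or with two recommended positions would be rejected, whereas a "zero or more" reading would accept both. The following is only a hedged sketch of the "exactly one" check; the function name `check_backbone_recommendation` and the list-of-strings representation are hypothetical and do not come from SEP V001 or any SBOL tooling.

```python
def check_backbone_recommendation(glyph_name, recommended_positions):
    """Return the single recommended position, or raise ValueError.

    Implements the "exactly one" reading of the backbone-position rule.
    `recommended_positions` is assumed to be a list such as ["middle"].
    """
    count = len(recommended_positions)
    if count != 1:
        raise ValueError(
            f"{glyph_name}: expected exactly one RECOMMENDED "
            f"backbone position, got {count}"
        )
    return recommended_positions[0]
```

Switching to the "zero or more" reading would amount to dropping the check and returning the whole list, which is why the choice between the two readings matters for anyone validating glyph definitions automatically.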

rsc3 commented 6 years ago

I think we should definitely support submissions of Glyph sets (like fonts). We advertised this in the PLOS paper. If not here, please make it easy to add later.

jakebeal commented 6 years ago

@rsc3 While this SEP does not address the notion of font-like glyph sets, I do not believe that it conflicts with it in any way either. That should be readily addressed in a future SEP, which I would invite you to lead in writing.

jakebeal commented 6 years ago

I have updated with my current understanding of the state of discussion:

Resolved symbols:

Need more discussion:

Can people please weigh in on Assembly Scar, Insulator, and ORI?

cjmyers commented 6 years ago

I think scar, insulator, and ORI should be symmetric covering the backbone.

jakebeal commented 6 years ago

@cjmyers I would be happy with those for scar and ORI.

For insulator --- what is your reason for suggesting symmetric vs. bottom? I have thought of insulators as being more related to a particular strand (i.e. the things you're trying to insulate), but don't have a strong preference.

Also, regarding insulator: any thoughts on which of the four fills makes most sense?

cjmyers commented 6 years ago

Sorry no strong opinions on insulator. Perhaps it should be bottom, since it is symmetric shape and cannot determine orientation otherwise.

jakebeal commented 6 years ago

I have resolved Assembly Scar, ORI, and insulator's alignment per this discussion, and have also added an example showing how assembly scar covers a double-stranded backbone.

On the interior for insulator, I suggest that we choose the option where the inner box is the interior. My reasoning:

Please flag concerns if you disagree with any of these changes; otherwise, I think we are nearly ready for a vote.

jakebeal commented 5 years ago

Accepted and integrated, and thus closed per SBOL procedure in updated SEP 001.