SynBioDex / SBOLTestSuite

6 stars 10 forks source link

SBOL3 annotation example uses invalid displayId #22

Closed tcmitchell closed 3 years ago

tcmitchell commented 4 years ago

The annotation examples in SBOL3/entity/annotation use an invalid displayId parts.igem.org. In section 6.1 Identified, under "The displayId property":

If the displayId property is used, then its String value MUST be composed of only alphanumeric or underscore characters and MUST NOT begin with a digit.

The displayId uses a dot character (.), which is an illegal character in a displayId. Here is the offending line in SBOL3/entity/annotation/annotation.nt:

<http://parts.igem.org> <http://sbols.org/v3#displayId> "parts.igem.org" .

My suggested fix is to change the identity to http://parts.igem.org/Main_Page and change the displayId to Main_Page. This would match the value of the http://parts.igem.org/website property in the entity. If this is acceptable I'd be happy to submit a pull request.

tcmitchell commented 4 years ago

There's a second issue with displayId in these test files as well. https://sbolstandard.org/examples/BBa_J23119 has displayId BBa_J23119 part, which does not match the identity and has a space in it. Perhaps this got exchanged with the name property, which is a valid displayId for this entity?

Here's how it looks in annotation.nt:

<https://sbolstandard.org/examples/BBa_J23119> <http://sbols.org/v3#displayId> "BBa_J23119 part" .
<https://sbolstandard.org/examples/BBa_J23119> <http://sbols.org/v3#name> "BBa_J23119" .

And here is perhaps how it should look:

<https://sbolstandard.org/examples/BBa_J23119> <http://sbols.org/v3#displayId> "BBa_J23119" .
<https://sbolstandard.org/examples/BBa_J23119> <http://sbols.org/v3#name> "BBa_J23119 part" .
goksel commented 4 years ago

Thanks Tom, I implemented the following change for now. If the URL does not include a PATH (e.g. "/part1") or fragment (#part1), then displayId is not inferred. We can then apply validation rules to check if the SBOL document is valid or not.

But I am not sure how to proceed regarding application specific entities vs SBOL entities. Is the URL in the following example valid or not? It may not be valid since the URL does not include the display string. However, this entity is an application specific information. Should we still treat the following as an SBOL entity and require displayId in the URL? If we enforce the displayId in the URL then this would make it difficult to incorporate application specific annotations. http://parts.igem.org a sbol:TopLevel , igem:Repository ; igem:website "http://parts.igem.org/Main_Page" ; sbol:description "Registry of Standard Biological Parts" ; sbol:name "iGEM Registry" .

Regarding the displayId in the URL vs displayId as a property, is there any explicit requirement for them to be the same? We may need to clarify this in the spec.

tcmitchell commented 4 years ago

I'm not an expert on specifications and I had no hand in writing the SBOL 3 specification, so others may have a better answer. Here is my case for why displayId is required and must match the URL. In this comment I am referring to the most recent published SBOL 3.0 specification, dated April 1, 2020.

In Section 6.1 Identified, in the section titled "The displayId property", the third paragraph states:

Note that for objects whose URI is a URL, the requirements on URL structure in Section 5.1 imply that the displayId MUST be set.

In the cited case, the object's URI is http://parts.igem.org, which I think we can agree is a URL. Thus displayId is required, in this case, according to the specification.

Going to Section 5.1 Uniform Resource Identifiers, the first bullets starts with the following statement:

A TopLevel URL MUST use the following pattern: [namespace]/[local]/[displayId], where namespace and displayId are required fragments, and the local fragment is an optional relative path.

While the object in question is an annotation, or application specific information, in my eyes it is also a TopLevel, and thus must conform to all the rules required of a TopLevel. One of those rules is that the URL conform to the above statement. In fact, as I write this, I now think that the URI of the object is invalid according to the specification because it does not contain a displayId.

In closing, I want to be clear that this is my opinion, and others who are more knowledgeable than me may have different opinions. Thank you for raising this question. It is an interesting topic of debate.

cjmyers commented 4 years ago

I'm completely agree with Tom's assessment. DisplayId is must match the one in the URI.