sys-bio / tellurium

Python Environment for Modeling and Simulating Biological Systems
http://tellurium.analogmachine.org/
Apache License 2.0
108 stars 36 forks source link

Correct pattern of recognized KiSAO ids #488

Closed jonrkarr closed 3 years ago

jonrkarr commented 3 years ago

tellurium recognizes KiSAO terms with the id pattern KISAO:\d{7}. However, the correct pattern is KISAO_\d{7}.

This should be corrected. To retain compatibility with erroneous SED-ML files, the : pattern could also be supported.

jonrkarr commented 3 years ago

Upon further reflection, this doesn't need to be changed. While its confusing that SED-ML doesn't use the actual ids of KISAO terms, SED-ML does have a clear convention and this convention can be mapped to the correct ids. It is quite confusing though. SED-ML's use of KISAO looks a lot like a reference to an ontology term. Strictly speaking however, SED-ML captures information which can be transformed into the id of KISAO term. This same issue is repeated with SBO terms in SBML.

luciansmith commented 3 years ago

I actually already made the change to tellurium to allow either input. It doesn't matter for tellurium-the-program, since it only reads SED-ML and doesn't produce it. But phrasedml would have to change to produce the new format. (libSEDML just takes a user-defined string, and does no parsing at all to enforce any format.)

Does this mean that you think that https://github.com/SED-ML/sed-ml/issues/71 should be dropped?

jonrkarr commented 3 years ago

Yes, I think SED-ML/sed-ml#71 can be dropped. I'll write the same there.

Personally, I feel that the : in KISAO ids is a big potential source of confusion. But, since its fairly engrained in the community at this point, I think its probably difficult to change. If it only affected KISAO, I'd probably push more. Since the usage of SBO in SBML has the same problem, it think we have to stick with the : variant of KISAO ids.