sys-bio / libOmexMeta

libOmexMeta is a library aimed at providing developer-level support for reading, writing, editing and managing semantic annotations for biosimulation models.
https://sys-bio.github.io/libOmexMeta/
Apache License 2.0
8 stars 6 forks source link

Issues identified by Max Neal and John G: Problem with supplementary content #22

Closed jhgennari closed 3 years ago

jhgennari commented 3 years ago

Co-authors:

I hope that folk have had (or are having) a nice Holiday. Here in Seattle, it was great to get a quick dusting of snow (about an inch where I live) a few days before Christmas. Now, as is more typical for winter in Seattle, we've got a forecast of four straight days of rain at 45 degrees.

As you'll see below, Max has nicely gone thru the supplemental material very carefully, uncovering a fairly large number of typos and problems. Most of these are small ones I can easily take care of, however, there are two bigger problems and issues that I would like to get consensus on before submitting the paper.

(1) The supplemental material starts by proclaiming that there are two parts, "A" and "B", and that the former is a tutorial and that the latter is more "example-focused". However, the document doesn't actually have anything labeled "A" or "B".

Also, as you'll see below, Max found this B section confusing, and perhaps redundant. In my view, I would prefer to cut section B from the supplemental material for the submission. It might be very useful material to keep somewhere on the web page for SBML developers, but I think it's too detailed for the submission. I also think it's not balanced for us to have a detailed SBML example without a matching CellML example.

So I would vote to cut this from the supplemental material. Thoughts?

(2) As Max points out, section 3.1.3 doesn't do the Nerst potentials correctly. The problem isn't primarily in the libOmexMeta code -- the problem is mostly the SBML. In particular, a membrane potential shouldn't be connected to any particular reaction. In SBML it has to be a parameter. In our approach to annotation, it's a voltage with sources and sinks. In this example, these are the CA ions on either side of the membrane. (See also Max's comments below). So I think that in the supplemental material, the SBML can become much shorter, listing just the species and the parameters.

Unfortunately, I believe the implementation of "new_energy_diff()" will have to be modified. The arguments will look the same, but the resulting triples should be different. Ciaran, attached find a separate document that describes how this should behave. (Max, PLEASE also check. Also, if Max or David could could confirm if we've got the convention right about the source-sink directionality...)

I don't think there is any debate about how (or if) to make this change, but it is a question about versioning and releases. In the grand scheme of things, I think this is a minor issue, and shouldn't hold up the submission to the Journal, but I'm happy to hear debate on this point.

I think that the rest of the issues Max discusses below are relatively minor. I will work over the next day or two to incorporate these changes in a github branch and ask for a pull request when I'm done.

There is one minor Max suggestion that I'll mention here because I'd like to vote against it. Max suggests that it seems wrong to spell the library "libOmexMeta", when OMEX is an acronym. Indeed, we are careful to capitalize OMEX when used by itself, as well as in the OMEX Metadata specification document. However, I think it is fine, and even preferable to call the library "libOmexMeta" (rather than "libOMEXMeta")--I simply think it looks better, and the OMEX source and acronym are spelled out and credited in the text of the manuscript.

Thanks all for continuing to work on this paper. Stay tuned for a pull request for the updated supplemental material.

-John G.

On 12/29/2020 1:25 PM, Maxwell Neal wrote:

Hi John,

After having a close look at the supplementary notebook for the libOMEXmeta paper, here are my edits/suggestions:

  • In general, make sure that "SBML" is capitalized when used throughout the text.

Paragraph 1:

  • Shouldn't libOmexMeta be written as libOMEXmeta, given that OMEX is an acronym? If so, change in paper and supp. material, and in all online sources (Git sites, e.g.)

  • add hyphen between example and focused (example-focused)

  • "realistic", not "realistical"

Paragraph 2:

  • Change 1st sentence to "We anticipate that the users of OMEX metadata libraries will be programmers rather than biologists."

Part A1: -"creation date" not "date created"

  • cut "nothing more than"

  • the URI for the curator annotation should be formatted similar to this real-world orcid URI (mine): https://orcid.org/0000-0002-2390-6572. Currently, the URI uses the orchid.org domain (note the added "h") and the unnecessary /orcid:. This also needs to be addressed for the creator annotation.

  • the URI used in the hasTaxon annotation is not formatted correctly. It should be http://identifiers.org/taxonomy/9895

  • the URI used in the pubmed annotation is not formatted correctly. It should be https://pubmed.ncbi.nlm.nih.gov/12334/

Part A2: Last sentence: there's a "th" that's missing an "e"

Part A3:

  • Change first sentence to "Composite annotations are collections of singular [note singular is misspelled] annotations with a predefined structure and are used to construct precise physiological concepts that are not represented as singular terms in current knowledge resources."

  • Second sentence and enumerated list: "annotations" not "annotation"

Part A3.1.1:

  • Change 3rd and 4th sentences of 1st paragraph to "For instance, a cardiovascular fluid flow model might simulate the fluid volume or fluid pressure inside the left ventricle. A chemical kinetic model might simulate the chemical concentration or the amount in absolute numbers of a physical entity, such as glucose or ethanol."

  • Change "present in the code" in 1st sentence of 2nd paragraph to "explicitly represented in the code".

  • The URIs for OPB terms are formatted incorrectly in the RDF output from example 3.1.1. They should read https://identifiers.org/OPB/OPB_00340. Currently, they do not resolve when entered into a browser. Please correct this issue throughout the document, as there are multiple places where OPB URIs are formatted incorrectly.

  • Consider using nucleoplasm, not nucleus in this example. (The nucleus includes a membrane whereas the cytoplasm doesn't. Using nucleoplasm would be a bit more consistent in terms of the biology.)

  • In paragraph after code example, change "metaid" to "metadata ID (metaid)"

  • The RDF snippet below the sentence that starts "Internally, libOmexMeta creates a new RDF subject node local:EntityProperty0000..." appears to be incorrect. The local:EntityProperty0000 object does not have an or an triple in the RDF snippet derived from example 3.1.1. Shouldn't this snippet read

local:EntityProperty0000 bqbiol:isPropertyOf http://omex-library.org/GlucoseTransport3_1_1.omex/glucose_transport3_1_1.sbml#glucose_c ; bqbiol:isVersionOf https://identifiers.org/OPB:00340 . ?

Example 3.1.2.1

  • The more specific "chemical molar flow rate" term from the OPB should be used in the annotation example: https://identifiers.org/OPB/OPB_00592

  • In the sentence that starts "Like with the physical entity type composite annotations...", change "Like" to "As".

  • Change "are metaids to" to "point to metaids on"

  • Last sentence of section: change "chemical flow rate" to "chemical molar flow rate"

Section 3.1.3

  • Title: "an" not "a"

  • Membrane potentials are not reactions, so this SBML example needs to be revised. The elements should be replaced with a single SBML that represents the potential. Potentials are not processes, they are energy differentials, and SBML currently provides no constructs for explicitly representing an energy differential. So for an SBML model to include a membrane potential, it will need to be represented as a . Thus, the physical property object in the resulting RDF graph should use the metaid on the as the URI fragment, and should not be a local term. The physical property resource should be a local RDF resource representing the energy differential. This energy differential resource should then be linked to the sinks and sources with the <hasSource/SinkParticipant> relations. I think the rest of the triples in the current RDF version are OK. That is, the source and sink instantiations, and their physical entity references to the explicit species in the SBML code.

Example 3.2.1

  • The variables "main.MembraneVoltage" and "main.ReactionRate" could be removed for readability/simplicity.

  • Check that OPB URIs are formatted correctly and that they resolve

  • I would cut the five lines that occur after the paragraph that ends with "...to the information that annotates this physical entity." The RDF snippet just repeats what is immediately above, and I don't think that we should suggest that relations are only needed when distinguishing the same thing that is in different compartments because even if the model only contained blood in the lumen of the artery, the relation is needed to precisely define what the model is representing. Software tools that assess the semantics on multiple models can then distinguish this model from, say, a model of blood in the right ventricle.

Section 3.2.2

  • Suggest removing the "main.Volume" and main.MembraneVoltage" variables from the CellML code for simplicity.

  • We should make sure that the SBML and CellML examples for a physical process annotation are semantically equivalent and that there is a 1-to-1 mapping between RDF statements that are written out. For example, the statements for SBML species indicating that they are part of the cytosol are missing from the RDF in example 3.1.2.

  • For clarity, use chemical species names for the reaction participants instead of "source", "sink" and "mediator".

  • Make the reaction identical to the SBML example: use two sources and two sinks. Currently there is only one of each.

  • GO:cytoplasm is used in example 3.1.1 but FMA:cytosol is used in other examples. Should we be consistent in which knowledge resource we use for these physical entities? i.e. maybe we should use only terms from the FMA? Or only from GO?

Section 3.2.3

  • Title: "an", not "a"

  • lowercase the "E" in "Energy differential" in first sentence

  • Consider removing the CellML that aren't relevant to this example from the CellML code

  • As with the physical process example, can we make it so that there is a 1-to-1 correspondence between the RDF statements output from the SBML and CellML examples for energy differential annotations? That is, the structure of the resulting RDF graphs should be identical, and only the names of some of the nodes will differ based on variable names and differences in whether resources are local or point to elements in the code.

Part B:

  • Change first sentence to "While the preceding section provides a tutorial for creating annotations with the libOmexMeta library, this section demonstrates how to use libOmexMeta in conjuction with libcombine to create COMBINE archives."

  • "Its empty" should be "It's empty". Check comments in the code for readability and errors like this.

  • Why is the "Demonstration of OmexMeta Specification v1.1" section included? This appears to provide examples of the different types of annotations already covered in Part A, but the examples are slightly different. Should this be cut? If these examples need to be included, the annotations should probably correspond exactly to those in Part A, right?

M

CiaranWelsh commented 3 years ago

The second part of this issue will be dealt with in a separate issue #23

agarny commented 3 years ago

I don't believe I am involved in that paper, but watching this repo I got an email about this issue. So, FWIW:

nickerso commented 3 years ago

@agarny - maybe to be clear, this library supports the OMEX Metadata specification (http://co.mbine.org/standards/omex-metadata), not the COMBINE Archive (OMEX) specification directly. Which to even more confuse things is supported by the libCombine library :)

I'm happy with the name libOmexMeta, probably because I am well used to it now.

agarny commented 3 years ago

Argh, I had forgotten that the specification is called the OMEX Metadata specification. Ok, now, I understand why it's called libOmexMeta! 🙄🙂

I am not too big on the name, but I have to agree that libOMEXMeta looks a bit strange. By default, I would have suggested libOMEXMetadata (to stick with the name of the specification), but I agree it's too long. In the end, I would have probably gone with libAnnotation, which makes it clear (?) what the library is about.

CiaranWelsh commented 3 years ago

Closing this thread as most if not all points have been dealt with or discussed.