biosimulators / Biosimulators_utils

Utilities for building standardized command-line interfaces for biosimulation software packages
https://docs.biosimulators.org/Biosimulators_utils
MIT License

The validator rejects a sedml file that uses a model Sid reference in the source attribute of a model. #112

Closed danv61 closed 2 years ago

danv61 commented 2 years ago

The SED-ML validator tries to interpret the SId of a model as a file name when it appears in the source attribute of a model. We receive the following error:

    io.py", line 1539, in run
        raise ValueError(msg)
    ValueError: The SED document is invalid.

This is the relevant part of the SED-ML:

    <listOfModels>
    <model id="Application0" name="Application0" language="urn:sedml:language:sbml" source="BioModel1_Application0.xml">
    <model id="Application0_0" name="Application0 modified" language="urn:sedml:language:sbml" source="Application0">

What happens is that the 'source' attribute of the second model refers to the SId of the first model ("Application0"). According to the SED-ML specification for the source of a model: "To make a model accessible for the execution of a SED-ML file, the source must be specified through either an URI or a reference to an SId of an existing Model." I have attached the complete SED-ML file: BioModel1.txt

luciansmith commented 2 years ago

I believe that the 'source' attribute has to start with a hashtag when referencing a model id in the same SED-ML file. Otherwise, it's literally impossible to distinguish from a relative filename.

jonrkarr commented 2 years ago

My interpretation is the same as Lucian's. There is at least one example in the SED-ML specifications with a hash. I consider this a bug/missing feature in VCell.

danv61 commented 2 years ago

I do not think that this interpretation is correct; a model SId may be used as the source attribute of another model as is. See the SED-ML v1.3 specification for an example named Le Loup model (L1V3 leloup-sbml.omex) (A.5.1), in which model2 directly references model1.

Some time ago I wrote a Java version of a SED-ML library. My solution to the problem was to make a list of all the model ids, then check each source attribute against this list. If there was an exact match, bingo.
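The id-list lookup described above could look roughly like the following Python sketch. It is an illustration only, not the code of jlibsedml or Biosimulators_utils: the function name `classify_sources` and the trimmed-down sample document are hypothetical, and the sample omits the SED-ML namespace for brevity (the tag comparison strips a namespace if one is present).

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down document modeled on the fragment in this issue.
SEDML = """<sedML level="1" version="3">
  <listOfModels>
    <model id="Application0" language="urn:sedml:language:sbml" source="BioModel1_Application0.xml"/>
    <model id="Application0_0" language="urn:sedml:language:sbml" source="Application0"/>
  </listOfModels>
</sedML>"""


def classify_sources(sedml_text):
    """Map each model id to ("model", ref) or ("file", path).

    A source counts as a model reference only if it exactly matches
    the id of a model declared in the same document.
    """
    root = ET.fromstring(sedml_text)
    # Compare local tag names so namespaced documents also work.
    models = [el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "model"]
    model_ids = {m.get("id") for m in models}

    result = {}
    for m in models:
        source = m.get("source")
        kind = "model" if source in model_ids else "file"
        result[m.get("id")] = (kind, source)
    return result


print(classify_sources(SEDML))
```

Note that this heuristic cannot tell a model id apart from a relative file that happens to have the same name, which is the ambiguity raised earlier in the thread.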

luciansmith commented 2 years ago

I went back to the spec, and sadly, it just doesn't say anything about internal references; it only talks about the anyURI data type as referencing external files, and recommends URLs or relative pathnames.

However, it is definitely the case that by design and by convention, any SED-ML attribute that has the type 'anyURI' must use a hashtag to reference internal-to-SED-ML IDs. Contrast this to the attributes of type 'SIdRef', which only reference SED-ML ids, and which therefore don't use the hashtag. All the examples in the spec use this scheme (particularly in section 2.2.4).
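As a rough illustration of the convention described above, a resolver that treats only `#`-prefixed sources as internal references might be sketched as follows. The function name `resolve_source` and its return values are hypothetical and are not taken from Biosimulators_utils or any SED-ML library.

```python
def resolve_source(source, model_ids):
    """Classify a model 'source' value under the hash convention.

    "#someId" -> reference to a model in the same SED-ML document
    otherwise -> URI or relative path to an external file
    """
    if source.startswith("#"):
        ref = source[1:]
        if ref not in model_ids:
            raise ValueError(f"source {source!r} references unknown model id {ref!r}")
        return ("model", ref)
    return ("uri", source)


model_ids = {"Application0", "Application0_0"}
print(resolve_source("#Application0", model_ids))              # internal reference
print(resolve_source("BioModel1_Application0.xml", model_ids))  # external file
print(resolve_source("Application0", model_ids))                # no hash: treated as a file name
```

Under this reading, the bare `source="Application0"` from the issue is looked up as a file, which is consistent with the validator rejecting the document.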

I apologize for that information not making it into the spec; we spent a lot of effort in this last version to explicitly state all the things that had been assumed, but we missed this one. I've filed an issue for the SED-ML spec at https://github.com/SED-ML/sed-ml/issues/216

danv61 commented 2 years ago

I now understand the cause of the confusion: I was looking at the examples from the SED-ML v1.3 specification (no hash), while you were talking about the v1.4 examples (where the hash is indeed present). The version of jlibsedml used in VCell is not 1.4 compatible and does not use the hash.

jonrkarr commented 2 years ago

We corrected this confusion in the specifications this year because I also noticed this problem.