SED-ML / sed-ml

Simulation Experiment Description Markup Language (SED-ML)
http://sed-ml.org

Clarify how plots are intended to be rendered, particularly the default that styles override #211

Open jonrkarr opened 2 years ago

jonrkarr commented 2 years ago

The specifications discuss how styles should override the default plot style. However, I don't think the default style is clearly articulated. Sections 2.2.18.1, 2.2.18.2, and 2.2.18.3 suggest the default " ... can be anything". 2.2.12.5 suggests "... any style may be used", suggesting that one of the instances of Style should be chosen.

By suggesting that markers need to explicitly be turned off, 2.2.12.6 suggests that the default style for a curve of type points should be for lines and markers to both be drawn ("Similarly, to display a line only with no markers the Marker from its Style is set to have a type of 'none'"). However, the examples diverge from this by not showing markers.

It would be helpful to clarify how plots are intended to be rendered. For example, for a curve of type points with undefined style

luciansmith commented 2 years ago

My understanding has always been that there is no default in the specification itself, and instead that simulators will have their own defaults they use. This is a general statement for the entirety of SED-ML: if no KiSAO term for a particular solver is given for a steady state simulation, the simulator uses whatever it feels is best. If no algorithm parameter is defined for absolute tolerance, then that parameter is left undefined by the file, and the simulator must use its own default. No absolute tolerance is defined in the specification as being the 'default value' if left unspecified.

The same is true for styles. Every L1v3 document has no styles defined in it, meaning that the author simply provided no information about that. One simulator might use lines and another points; one might color everything red and another black. Because no decision was made in the SED-ML document, any decisions that need to be made must be made by the simulator in the moment.

There was a big push when SBML L3 came out to remove any default values from the spec, on the theory that this pushed all the information into the document itself, instead of having to look things up in the spec. This turned out to have a few drawbacks here and there, but on the whole, I feel like it's worked out well. Definitely in the case of drawing styles, I don't want people to have to look in the spec to discover that they're supposed to draw everything in black, or without markers, or anything else. All the information should be present in the document, and anything not in the document should mean that the document has nothing to say on the subject.

One thing SBML did in service of this was to eliminate optional Boolean attributes, on the theory that it is easy to assume that if a Boolean attribute is missing, that means it has a value of 'false'. I feel like the same might be true here: it's easy to assume that if there is no 'marker' child of a Style, that means it has a value of 'none'. This is exacerbated by the fact that this is true of the 'fill' object: we forgot to put in a way to define 'no fill' explicitly, meaning that the only way to define 'no fill' is to not give a Style a Fill child. That's bad and inconsistent design: we should have noticed that everywhere else, 'missing' means 'undefined', not 'false/none'.

One way to overcome this would be to make 'line', 'marker', and 'fill' required children of Style, and state that you must set them to 'none' if you don't want them present. This would avoid the "What's the default?" problem by never having any defaults.

As it stands, though, you are correct that we need more detail in the spec about what is assumed, and what 'missing' means. We should emphasize that 'missing' always means 'undefined', unless we messed up like we did with 'fill'.
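To make the 'missing' versus 'none' distinction concrete, here is a minimal Python sketch of how an interpreter might treat the three cases. The function names and the `simulator_default` parameter are hypothetical, chosen only for illustration; nothing here is prescribed by the specification.

```python
from typing import Optional


def resolve_type(document_value: Optional[str], simulator_default: str) -> str:
    """Pick the line or marker type to render with.

    None    -> the document is silent ('undefined'); the simulator decides.
    "none"  -> the document explicitly turns the element off.
    other   -> the document explicitly asks for that type.
    """
    if document_value is None:
        return simulator_default
    return document_value


def is_drawn(resolved_type: str) -> bool:
    """An element is drawn unless its resolved type is the explicit 'none'."""
    return resolved_type != "none"


# Missing marker child: the simulator's own default applies (assumed "circle" here).
print(is_drawn(resolve_type(None, "circle")))
# Explicit 'none': markers are off regardless of the simulator default.
print(is_drawn(resolve_type("none", "circle")))
```

The point of the sketch is that 'missing' never collapses into 'none': only an explicit value in the document overrides whatever the simulator would have done on its own.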

jonrkarr commented 2 years ago

I think it's fine to continue to say that simulation tools can choose default colors (and maybe line thickness). I think it is fair to say that the default line type should be solid (I think specifying a default thickness of 1 probably isn't controversial either).

The more important points are something like this:

luciansmith commented 2 years ago

Yes, that's exactly what I want to avoid. All the information should be in the document, and not hidden in the spec.

Arguably, a better default would be to print markers and not lines. This is what matplotlib does. This would avoid issues like https://github.com/sys-bio/temp-biomodels/issues/82 where plots intended to be scatter plots looked incorrect when plotted with lines instead. But again, I think that argument doesn't belong in SED-ML. SED-ML either says something, or it doesn't, and I don't think it's wise, in the long run, to imply anything through the use of defaults.

luciansmith commented 2 years ago

(I would also be fine with making line and marker required children of Style, instead of optional. Clearly the semantics are muddled here, and that would solve things. You still have to deal with the 'style' attribute being optional on curves, but at least that's a bit more obvious.)

jonrkarr commented 2 years ago

Completely agree with not hiding things in the specifications. That's even better.

But virtually all existing SED-ML files are encoded with the current specification (or, actually, earlier versions). Until the specification is changed, the point of this issue is to clarify the intention of the current specification.

luciansmith commented 2 years ago

I would be fine with some sort of 'best practices' document we could release now and perhaps incorporate into the spec later? It could say something like 'All L1v3 documents and many L1v4 documents do not define a style for the curves, leaving it to the SED-ML interpreter to decide what to do. We recommend plotting 2D output as a series of markers if not otherwise specified, as this approach works visually for both scatter plots and linear data. In L1v4 documents that define a style with a 'line' child but no 'marker' child, no markers typically need be displayed, though for sparse data they may be helpful. When creating SED-ML documents, it is strongly recommended to include both a 'line' and a 'marker' child of any defined 'style', setting their type to 'none' when they are to be omitted.'
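The last recommendation above (give every defined style both a 'line' and a 'marker' child) could be checked mechanically. The following is a rough Python sketch using only the standard library. The XML fragment is an illustrative assumption about how styles are serialized, not a copy of a real SED-ML file, and `styles_missing_children` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Children the best-practice text recommends always stating explicitly.
RECOMMENDED_CHILDREN = ("line", "marker")


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def styles_missing_children(xml_text: str) -> Dict[str, List[str]]:
    """Map each style id to the recommended children it does not define."""
    root = ET.fromstring(xml_text)
    problems: Dict[str, List[str]] = {}
    for elem in root.iter():
        if local_name(elem.tag) != "style":
            continue
        present = {local_name(child.tag) for child in elem}
        missing = [name for name in RECOMMENDED_CHILDREN if name not in present]
        if missing:
            problems[elem.get("id", "<no id>")] = missing
    return problems


# Illustrative fragment only (assumed element and attribute names).
EXAMPLE = """
<listOfStyles>
  <style id="line_only"><line type="solid"/></style>
  <style id="explicit"><line type="solid"/><marker type="none"/></style>
</listOfStyles>
"""

print(styles_missing_children(EXAMPLE))  # {'line_only': ['marker']}
```

A check like this only flags styles that leave something unstated; what an interpreter should do with such a style is still the open question in this thread.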

jonrkarr commented 2 years ago

A separate best practices document would be helpful. Alternatively, it can be incorporated into the same document with callouts (e.g., colored boxes that highlight these notes).