sys-bio / temp-biomodels

Temporary place for coordination of updating existing Biomodels
Creative Commons Zero v1.0 Universal

Encode scatter plots (markers, no line) in SED-ML #82

Closed jonrkarr closed 2 years ago

jonrkarr commented 3 years ago

From experience, the difference in style (between scatter and line) can cause some plots to appear to be erroneously generated.

Use the COPASI files to determine which plots require scatter style, and delete those plots from the SED-ML.

luciansmith commented 2 years ago

Can we use SED-ML L1v4 to add the necessary information for these plots?

jonrkarr commented 2 years ago

We don't yet support L1V4. The need for L1V4 among existing published models seems to be minimal. We aim to work on L1V4 later. We think it would be more impactful to focus on making models accessible. For now, this necessitates skipping a small number of plots.

A few months ago we spent over an hour exploring what we thought was a bug with COPASI + SED-ML. It turned out not to be a bug: the plot simply looked entirely different when drawn with a different style. To users, this will look like an error, just as it did to us. The purpose of this issue is to avoid that by skipping such plots.

luciansmith commented 2 years ago

Well, the need for l1v4 for this particular issue seems to be pressing, at least. Semantic information in the style is, in fact, why the plotting information was added to L1v4 in the first place.

At any rate, I can tackle this next. It would help to know where or how the 'scatter' information is stored in COPASI.

jonrkarr commented 2 years ago

COPASI files could probably be grepped for dash, scatter, style or similar to figure this out.

All of L1V4 would take a bunch of work. A few key things, such as plot style, are more doable.

luciansmith commented 2 years ago

Yeah, just encoding the plot style is all I was considering here, though while I'm at it, I could include a few more things for the future (line thickness, etc.) but all that can be safely ignored.

I managed to track down the encoding in the Copasi save file to:

  <ListOfPlots>
    <PlotSpecification name="nuc/cyt SMAD2 ratio" type="Plot2D" active="1" taskTypes="">
      <Parameter name="log X" type="bool" value="0"/>
      <Parameter name="log Y" type="bool" value="0"/>
      <ListOfPlotItems>
        <PlotItem name="Values[NUC/CYT SMAD2]|Time" type="Curve2D">
          <Parameter name="Line type" type="unsignedInteger" value="0"/>
          <Parameter name="Line subtype" type="unsignedInteger" value="0"/>
          <Parameter name="Line width" type="unsignedFloat" value="2"/>
          <Parameter name="Symbol subtype" type="unsignedInteger" value="0"/>
          <Parameter name="Color" type="string" value="auto"/>
          <Parameter name="Recording Activity" type="string" value="during"/>

where a "Line type" of 0 == line, 1 == points, 2 == symbols, and 3==line+symbols. But I can't figure out how to get that information out of the Copasi python interface; I've sent an email to @fbergmann asking for help.

jonrkarr commented 2 years ago

I was thinking of using lxml to extract this from the XML.

It could be retrieved with an xpath query. Something like this:

import lxml.etree

doc = lxml.etree.parse('/path/to/model.copasiml')
# One element per plot. If the file declares a default XML namespace,
# these paths need a prefix and a namespaces= map.
plots = doc.xpath('//ListOfPlots/PlotSpecification')
for plot in plots:
    if plot.xpath('ListOfPlotItems/PlotItem/Parameter[@name="Line type" and @value="0"]'):
        pass  # delete corresponding plot from SED-ML L1V3 or modify style in L1V4

jonrkarr commented 2 years ago

For me, I find parsing the XML more straightforward than navigating the COPASI documentation.

luciansmith commented 2 years ago

Well, 'ask Frank' is more straightforward even than that, but yes. It's not for the faint of heart.

fbergmann commented 2 years ago

It could be retrieved with an xpath query. Something like this:

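import lxml.etree

doc = lxml.etree.parse('/path/to/model.copasiml')
# One element per plot. If the file declares a default XML namespace,
# these paths need a prefix and a namespaces= map.
plots = doc.xpath('//ListOfPlots/PlotSpecification')
for plot in plots:
    if plot.xpath('ListOfPlotItems/PlotItem/Parameter[@name="Line type" and @value="0"]'):
        pass  # delete corresponding plot from SED-ML L1V3 or modify style in L1V4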

This will work fine, but if you want to delete only scatter plots, I'd delete only when the value is neither '0' nor '3' (that is, when it is '1' or '2').

I've sent Lucian a snippet showing how to do this through the API.
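
For reference, the rule above (flag a plot only when a curve's "Line type" is 1 or 2) can be sketched as follows. This is a minimal illustration, not the snippet Frank sent and not the COPASI API. It uses the standard-library xml.etree.ElementTree instead of lxml and ignores XML namespaces by comparing local tag names. The names local, scatter_plots and SAMPLE_COPASI are hypothetical, and the sample document (including its namespace URI) is an assumed, cut-down stand-in for a real COPASI file.

```python
import xml.etree.ElementTree as ET

# "Line type" codes as described earlier in the thread:
# 0 = line, 1 = points, 2 = symbols, 3 = line + symbols.
NO_LINE = {"1", "2"}  # codes that draw no connecting line

# Assumed, heavily reduced stand-in for a COPASI file (namespace URI is an assumption).
SAMPLE_COPASI = """<COPASI xmlns="http://www.copasi.org/static/schema">
  <ListOfPlots>
    <PlotSpecification name="line plot" type="Plot2D">
      <ListOfPlotItems>
        <PlotItem name="A|Time" type="Curve2D">
          <Parameter name="Line type" type="unsignedInteger" value="0"/>
        </PlotItem>
      </ListOfPlotItems>
    </PlotSpecification>
    <PlotSpecification name="scatter plot" type="Plot2D">
      <ListOfPlotItems>
        <PlotItem name="B|Time" type="Curve2D">
          <Parameter name="Line type" type="unsignedInteger" value="2"/>
        </PlotItem>
      </ListOfPlotItems>
    </PlotSpecification>
  </ListOfPlots>
</COPASI>"""


def local(tag):
    """Return an element's tag without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def scatter_plots(copasi_xml):
    """Return names of plots with at least one curve drawn without a line."""
    root = ET.fromstring(copasi_xml)
    names = []
    for plot in root.iter():
        if local(plot.tag) != "PlotSpecification":
            continue
        for param in plot.iter():
            if (local(param.tag) == "Parameter"
                    and param.get("name") == "Line type"
                    and param.get("value") in NO_LINE):
                names.append(plot.get("name"))
                break  # one point-only curve is enough to flag the plot
    return names
```

Calling scatter_plots(SAMPLE_COPASI) flags only the second plot. Matching on local names avoids hard-coding a namespace map, at the cost of being less strict than a namespaced XPath query.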

luciansmith commented 2 years ago

OK, the latest commit fixes this by encoding all changes using the new 'style' attribute. Thanks, @fbergmann !

The vast majority of styles are 'a solid line with thickness 1', but some of them have dotted or dashed lines, and a few have markers (as per the original request of this issue). If all you care about is 'Do I plot a line for this?', implementation consists of:

If you like, you can also check the line type for dotted or dashed, and some even have color. The same is true of the markers (most are 'none', some are circles).
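
One possible way for a consumer to answer the 'Do I plot a line for this?' question is sketched below. It is an illustration only, not the implementation referred to above. It assumes that each curve carries a style attribute naming a style element, and that a style with a child line element of type "none" means no line is drawn. Only the `<line type="none"/>` element itself is confirmed by this thread. The rest of the layout, the namespace URI, the sample document and the names local, curves_without_line and SAMPLE_SEDML are assumptions. Style inheritance is not handled.

```python
import xml.etree.ElementTree as ET

# Assumed, reduced stand-in for a SED-ML L1V4 file (layout and namespace are assumptions).
SAMPLE_SEDML = """<sedML xmlns="http://sed-ml.org/sed-ml/level1/version4" level="1" version="4">
  <listOfOutputs>
    <plot2D id="plot1">
      <listOfCurves>
        <curve id="c_line" style="s_line" xDataReference="t" yDataReference="a"/>
        <curve id="c_pts" style="s_pts" xDataReference="t" yDataReference="b"/>
      </listOfCurves>
    </plot2D>
  </listOfOutputs>
  <listOfStyles>
    <style id="s_line"><line type="solid" thickness="1"/></style>
    <style id="s_pts"><line type="none"/><marker type="circle"/></style>
  </listOfStyles>
</sedML>"""


def local(tag):
    """Return an element's tag without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def curves_without_line(sedml_text):
    """Return ids of curves whose referenced style contains <line type="none"/>."""
    root = ET.fromstring(sedml_text)
    # First pass: collect ids of styles that switch the line off.
    no_line_styles = set()
    for style in root.iter():
        if local(style.tag) != "style" or style.get("id") is None:
            continue
        for child in style:
            if local(child.tag) == "line" and child.get("type") == "none":
                no_line_styles.add(style.get("id"))
    # Second pass: find curves that reference one of those styles.
    return [
        el.get("id")
        for el in root.iter()
        if local(el.tag) == "curve" and el.get("style") in no_line_styles
    ]
```

With the sample above, curves_without_line(SAMPLE_SEDML) reports only the marker-only curve. A renderer that ignores styles entirely would draw both curves as lines, which is the mismatch this issue set out to avoid.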

The 'no line' sedml files are:


grep "line type" final/*/*sedml | grep none
final/BIOMD0000000643/ARPP-16_Layer1_mutualInhibitions.sedml:      <line type="none"/>
final/BIOMD0000000644/ARPP-16_Layer1and2_mutualInhibitions_PKAinhibitsMAST3.sedml:      <line type="none"/>
final/BIOMD0000000645/ARPP-16_Layer1and2and3_mutualInhibitions_PKAinhibitsMAST3_dominantNegative.sedml:      <line type="none"/>
final/BIOMD0000000673/MODEL1006230054_edited.sedml:      <line type="none"/>
final/BIOMD0000000759/denBreems2015.sedml:      <line type="none"/>
final/BIOMD0000001008/Scaramellini1997.sedml:      <line type="none"/>
final/BIOMD0000001026/Independent.sedml:      <line type="none"/>
final/BIOMD0000001026/Independent.sedml:      <line type="none"/>
final/BIOMD0000001026/Kurlovics2021.sedml:      <line type="none"/>
final/BIOMD0000001026/Kurlovics2021.sedml:      <line type="none"/>
final/BIOMD0000001040/Kurlovics2021_single.sedml:      <line type="none"/>
final/BIOMD0000001040/Kurlovics2021_single.sedml:      <line type="none"/>
final/BIOMD0000001040/Single.sedml:      <line type="none"/>
final/BIOMD0000001040/Single.sedml:      <line type="none"/>
final/BIOMD0000001044/Csikasz-Nagy2006.sedml:      <line type="none"/>

luciansmith commented 2 years ago

With https://github.com/sys-bio/temp-biomodels/commit/a3bc1e3e41d0aa6a38ba5dabac5f783659f7de50 this should be good. Waiting to see some round-trip simulations with the final versions of the sedml before closing this issue.

jonrkarr commented 2 years ago

I think this is aligned with how BioSimulations will now render styles (PRs biosimulations/biosimulations/4234, biosimulations/biosimulations/4235). This should be deployed in a few minutes at http://run.biosimulations.dev/. There are links to examples in the first PR. Note that this isn't deployed to the production version yet.

luciansmith commented 2 years ago

With https://github.com/sys-bio/temp-biomodels/commit/fb53dd1dc0345a96e2766242919523edadb77ea5 this should now all work.