PySCeS / pysces

The official PySCeS project source code repository.
https://pysces.github.io
Other
34 stars 10 forks source link

Simulation rate not calculated correctly when compartments are present #79

Open jmrohwer opened 2 years ago

jmrohwer commented 2 years ago

I picked up an issue when species are in conc, and output is in conc as well, with a model that has compartments (even only a single compartment) with a volume other than 1.

Illustrated with a simple mass-action model of a closed system. The following notebook shows the issue. In a nutshell, the starting concentrations and amounts seem to be correct, but the rate constant is evaluated wrong when the volume is not 1.

As far as I could ascertain this has to do with PySCeS converts species to amounts in Simulate() even when there is only a single compartment: https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L4639-L4640 For EvalReq() the species are converted back to concentrations: https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L3603-L3607 (Note that _EvalODE() calls EvalReq()).

However, CVODE() and LSODA() both assume in this case that the ODEs are actually in amounts, with the conversion to concentrations occurring at the end after the simulation. For CVODE: https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L3909-L3915 For LSODA: https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L4052-L4054 This means that when an ODE is evaluated in such a case, the LHS is assumed to be in amounts, while the RHS is in concentrations. Obviously this can't work if the compartment volume is not 1, and will give an error for all rate parameters (kcat, k, Vmax, etc.)

Finally, maybe this has to do with the following warning (not shown by default): https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L3443-L3446 This is, however, not very practical.

I stumbled on this bug through the PySCeS Thinlayer for the PyEnzyme library used for analysing, processing and fitting enzyme kinetics data: https://github.com/EnzymeML/PyEnzyme specifically https://github.com/EnzymeML/PyEnzyme/tree/main/pyenzyme/thinlayers. The data are captured in an EnzymeML document (extension of SBML) which outputs the compartment (reaction vessel) by default with its volume (generally not equal to 1). However, for fitting purposes, all species are generally reported in concentrations, leading to an issue when fitting time courses with PySCeS.

bgoli commented 2 years ago

Interesting, thanks for the deep dive diagnoses. The problem is that if the species symbols are concentrations in the rate equations ( 'Species_In_Conc' ) then they have to be internally evaluated as such and the problem is then how to get back to amounts. The warning was meant to show this is a problem - it doesn't occur if the species symbols are amounts. It's been a while since I thought about this stuff so my apologies in advance for the following!

If we knew the location of each reaction we could do something on the right hand side (dSx/dt = Vn([S])Vcomp_vol, but I'm not sure this is generally possible. What about something like dSx/dt = Vn([S])Scomp_vol?

This should, at least, work for single compartment systems (for now I'm purposely ignoring cross-compartment reactions which have to be defined in amounts or defined with a reaction location)?

Note that this is independent of the ('Output_In_Conc') which does a post conversion of the results that are calculated as amounts. Alternatively, you can rewrite your rate equations in terms of amount and the problem goes away ;-)

Seriously though, thanks for the notebook, I'll get back to you if this is indeed the problem.

On Fri, Jun 10, 2022 at 11:30 AM Johann Rohwer @.***> wrote:

I picked up an issue when species are in conc, and output is in conc as well, with a module that has compartments (even only a single compartment) with a volume other than 1.

Illustrated with a simple mass-action model of a closed system. The following notebook https://github.com/PySCeS/pyscesdev/blob/main/johann/volumes/test_volumes.ipynb shows the issue. In a nutshell, the starting concentrations and amounts seem to be correct, but the rate constant is evaluated wrong when the volume is not 1.

As far as I could ascertain this has to do with PySCeS converts species to amounts in Simulate() even when there is only a single compartment:

https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L4639-L4640 For EvalReq() the species are converted back to concentrations:

https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L3603-L3607 (Note that _EvalODE() calls EvalReq()).

However, CVODE() and LSODA() both assume in this case that the ODEs are actually in amounts, with the conversion to concentrations occurring at the end after the simulation. For CVODE:

https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L3909-L3915 For LSODA:

https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L4052-L4054 This means that when an ODE is evaluated in such a case, the LHS is assumed to be in amounts, while the RHS is in concentrations. Obviously this can't work if the compartment volume is not 1, and will give an error for all rate parameters (kcat, k, Vmax, etc.)

Finally, maybe this has to do with the following warning (not shown by default):

https://github.com/PySCeS/pysces/blob/1831cb281817d62d3a7d225fe1acd0ba6c39cbc3/pysces/PyscesModel.py#L3443-L3446 This is, however, not very practical.

I stumbled on this bug through the PySCeS Thinlayer for the PyEnzyme library used for analysing, processing and fitting enzyme kinetics data: https://github.com/EnzymeML/PyEnzyme specifically https://github.com/EnzymeML/PyEnzyme/tree/main/pyenzyme/thinlayers. The data are captured in an EnzymeML document (extension of SBML) which outputs the compartment (reaction vessel) by default with its volume (generally not equal to 1). However, for fitting purposes, all species are generally reported in concentrations, leading to an issue when fitting time courses with PySCeS.

— Reply to this email directly, view it on GitHub https://github.com/PySCeS/pysces/issues/79, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABGHUEICTCYMOYZEJ4UBLWLVOMDMBANCNFSM5YNBTTRQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

jmrohwer commented 2 years ago

I have discussed this with @jhsh and am taking the liberty to post his reply here:

PySCeS evalueer dS/dt = v1 waar S in mol is en v1 dus in mol/s

Goue reël van kompartementele modellering: om mol/s snelheidsvgl te produseer, skaleer die kanoniese snelheidsvgl k1*S met die grootte van die kompartement waar die reaction events plaasvind (Cell)

vir Species_In_Conc: True

v_1 = Cell*k1*S

vir Species_In_Conc: False

v_1 = Cell*k1*(S/Cell)

Traai dit op die aangehegde notebook. Alles werk soos geadverteer.

jmrohwer commented 2 years ago

The above is implemented in a second notebook. Everything now works as it should, with one caveat:

As explained in the notebook, if both species_in_conc and output_in_conc are set to True and the rate equation is specified as it should be (v_1 = Cell*k1*S), the species plot looks fine but the rates are still outputted in amounts per time (as they should be if the kinetic law is defined as it is). This may, however, be counter-intuitive and we need to decide on how to deal with this. I see three options here:

  1. Leave the code as is, but better document this, maybe also issue a warning in PySCeS.
  2. Change the code, convert the rates to conc per time post-hoc. This again is problematic if the reaction occurs across two compartments or even in a third compartment, so which volume should the rate be scaled to? But it may be beneficial to simplify the simulation of single-compartment models in concentrations (all SBML models specify a compartment with a volume, so loading any of these into PySCeS will trigger HAS_COMPARTMENTS=True, even if it is only a single compartment). This would however still require the user to adapt the rate equation.
  3. Change the code to only convert the concentrations to amounts if there is more than one compartment. This would require an additional model keyword (HAS_MULTIPLE_COMPARTMENTS). If there is only one compartment, the "classical" rate equation (v_1 = k1*S) could continue to be used, no pre- and post-scaling is performed for species_in_conc, the output remains in conc, and the rates will be in conc per time, and the numerical value for the compartment volume is ignored in the simulation (as it has no effect on the model output in this case).

Your ideas in this regard will be appreciated.

bgoli commented 2 years ago

Thanks, I've been looking at the code and I can implement something that writes the ODE's as Cell*R1() in the code, however, the problem is that SBML derived KineticLaws should already be in that form so that complicates things (and I should have noticed that your model was written using Rate Equations and had no compartment term, sorry about that).

Thanks for noticing the rate output issue. I think this may have been taken care of in the old CVODE using the one-step algorithm? With the new CVODE, would it be possible to use dummy variables (non-fixed parameters) to track "rates in concentration" during a simulation if needed?

Some more things to consider:

  1. Better warnings, maybe specific to 1 and 2+ compartments with links to better documentation
  2. Consider a keyword that differentiates (SBML) KineticLaws, Cell*Rx(), from biochemical reactions, Rx(). This would be useful to know when to "fix" KineticLaws and, amongst other things, make exporting SBML easier for single compartment models, i.e., we could fix the exported KineticLaw (probably not done at the moment).
  3. For PySCeS models we could consider using both the species and reaction locations to semi-automate writing the ODE's for single and multiple compartments and exporting to SBML (we would need (2) to be able to differentiate SBML derived models.
jhsh commented 2 years ago

With regards to Johann's comment in the notebook: "Convert the rates to conc per time post-hoc. This again is problematic if the reaction occurs across two compartments or even in a third compartment, so which volume should the rate be scaled to?" the problem has a simple solution. In the following the equation numbers refer to Hofmeyr, J-HS (2020) Kinetic modelling of compartmentalised reaction networks. BioSystems 197, 104302

  1. With regard to the conversion of the amount-rates calculated by PySCeS for a compartmental model to concentration-rates, recall that (i) rate is fundamentally defined as the number of reaction events per time (eq.4) and not in terms of the change in the amount of a reactant or product (to which it is related by reaction stoichiometry - eq.6), and (ii) that reaction events are associated with a particular compartment, which can be different from the compartments in which the reactants and products occur. The correct procedure is therefore to divide the amount-rate by the size of the compartment where the reaction takes place, so that the rate is expressed as the change in the concentration of reaction events per time.

There are now two options:

a. Specifying "Output_In_Conc: True" implies that BOTH species and rates should be output as concentrations of species and concentrations of reaction events per time respectively.

b. Since in principle the units of a correctly-defined rate have nothing to do with whether reactants and products are in amounts or concentrations in its rate equation, one may rather want to distinguish between the units in which output-concentrations and output-rates are expressed. This can be done by adding a flag that specifies the form in which rates are output. Something like

Rate_Output_In_Conc_Per_Time (with True as default?)

Here "Output_In_Conc" could however be construed as referring to both species and rates). Therefore "Species_Output_In_Conc" would be better than "Output_In_Conc". Too late for that now? PySCeS could easily be made back-compatible with models that still use "Output_In_Conc".

  1. ODEs in compartmental models are in amount/time and it may be that the modeller requires access to the ODE values (dn_x/dt). I may be mistaken, but I don't see a data_sim method that provides access to this, but I suppose it would be easy to implement. If so, then one could output the value of a dx/dt ODE by dividing dn_x/dt by the compartment size in which X occurs (not necessarily a volume!).

  2. What if one wants to output the rate at which a particular species changes with time due to a particular reaction? For the amount of a species X the relevant equation is eq.6, which, when divided by the compartment size in which X occurs, will give the rate in concentration/time.

For the moment, 2 and 3 can be implemented by the modeller with assignment rules.

We still have to consider Brett's suggestions, but this is all for the moment.

jmrohwer commented 2 years ago

Hi @bgoli

Coming back to this:

Thanks, I've been looking at the code and I can implement something that writes the ODE's as Cell*R1() in the code, however, the problem is that SBML derived KineticLaws should already be in that form so that complicates things (and I should have noticed that your model was written using Rate Equations and had no compartment term, sorry about that).

You mention that SBML derived KineticLaws should already be in the Cell*R1() form. Can you clarify what exactly this means. Should the modeller specify the rate law in this form himself when he is writing SBML? Or does libSBML change the rate law automatically? Or is it just that the rate law is specified like this in the SBML test suite, but for other models it is up to the modeller to do this correctly?

The reason I'm asking this is that what prompted all of this actually started out from SBML. It is an EnzymeML document (which is basically an SBML document with custom annotations for storing the kinetic details, i.e. EnzymeML specific stuff). But the rate law is entered by hand when creating this document. And most enzyme kineticists of course work in rate equations in concentrations.

Thanks for noticing the rate output issue. I think this may have been taken care of in the old CVODE using the one-step algorithm? With the new CVODE, would it be possible to use dummy variables (non-fixed parameters) to track "rates in concentration" during a simulation if needed?

Yes, as also mentioned in Jannie's comments above, I've re-implemented the ability to track arbitrary model attributes with assignment rules (see release 1.0.2).

bgoli commented 2 years ago

You mention that SBML derived KineticLaws should already be in the Cell*R1() form. Can you clarify what exactly this means. Should the modeller specify the rate law in this form himself when he is writing SBML? Or does libSBML change the rate law automatically? Or is it just that the rate law is specified like this in the SBML test suite, but for other models it is up to the modeller to do this correctly?

Yes to the first question - this is the modellers responsibility. As I understand it SBML 3 defines it's KineticLaw as something that returns amounts per time (Sizerates_in_concentration) so one can't use a Biochemical Rate Equation directly in SBML. It is well hidden but this is a bit from section 4.11.7 on Page 76 of the L3V2 core spec, Mathematical interpretation of SBML reactions and kinetic laws*.

Constructing rate-of-change equations for the species

A consequence of the approach to "kinetic laws" discussed in the previous section is this: when constructing equations describing the time-rates of change of different species de ned by an SBML model, the equations are assumed to be in terms of time-rates of changes to amounts, not concentrations (or more generally densities, i.e., amount per size of compartment). A kinetic law does not describe how often a reaction would take place in a compartment of unit size, but rather how often it takes place (per time unit) given the actual size of the compartment. The dimension of the kinetic law is therefore number of reaction events per time.

More on this topic on page 128 of the spec.

You may want to check this for EnzymeML so that it is using valid SBML, if it is a case of a 90% of the time it is a single compartment then the tool can easily do the modification on SBML export (multiply rate law by compartment size). In my other comment the question I raised, was, whether this couldn't be further generalised into a (semi-?)automated solution where one uses the Reaction location to determine the compartment size that multiplies the rate law?

jmrohwer commented 2 years ago

Interesting, just as this reply came through I had started to have another detailed look at the SBML spec and came across exactly the passage you are referring to.

Yes to the first question - this is the modellers responsibility. As I understand it SBML 3 defines it's KineticLaw as something that returns amounts per time (Size*rates_in_concentration) so one can't use a Biochemical Rate Equation directly in SBML. It is well hidden but this is a bit from section 4.11.7 on Page 76 of the L3V2 core spec, Mathematical interpretation of SBML reactions and kinetic laws.

Indeed, to quote the preceding paragraph from the spec, as it states this unambiguously:

In multicompartmental models, to be able to specify a rate law only once and then use it unmodified in equations for different species, the rate law needs to be expressed in terms of an intensive property, that is, species quantity/time, rather than the extensive property typically used, species quantity/size/time. As a result, modelers and software tools in general cannot insert traditional textbook rate laws unmodified into the math element of a KineticLaw. The unusual term “kinetic law” was chosen to alert users to this difference.

Quoting @bgoli:

You may want to check this for EnzymeML so that it is using valid SBML, if it is a case of a 90% of the time it is a single compartment then the tool can easily do the modification on SBML export (multiply rate law by compartment size).

Indeed, 99% of the time I can guarantee you that EnzymeML deals with single compartment models. Their use case is for experimental enzyme kinetics or biocatalysis experiments, which are done in a single reaction "vessel" (Vessel is actually an EnzymeML attribute :smile:). And I am now convinced that the current implementation in PyEnzyme does not produce valid SBML in this regard.

In my other comment the question I raised, was, whether this couldn't be further generalised into a (semi-?)automated solution where one uses the Reaction location to determine the compartment size that multiplies the rate law?

I guess one could do that but automated solutions are always dangerous because they may have unintended consequences in the background and/or the modeller doesn't really know what they're doing. But I am convinced now that there is actually a need for a single compartment automated solution, where there is no ambiguity. Refer to my comment above re. HAS_MULTIPLE_COMPARTMENTS. If we want to take SBML seriously as a tool for model standardisation and exchange, then we should adhere to it, so even single-compartment models defined in SBML should have their rate laws in amounts per time. Some automated conversion would thus be very useful since most if us are used to thinking about biochemical reaction rates in concentration changes per time.

As soon as we are dealing with multiple compartments, I'd stay away from automated conversions. This has the potential to become a black box too quickly, and modellers should know what they are doing if they venture here!

jmrohwer commented 2 years ago

Quoting @jhsh:

There are now two options:

a. Specifying "Output_In_Conc: True" implies that BOTH species and rates should be output as concentrations of species and concentrations of reaction events per time respectively.

b. Since in principle the units of a correctly-defined rate have nothing to do with whether reactants and products are in amounts or concentrations in its rate equation, one may rather want to distinguish between the units in which output-concentrations and output-rates are expressed. This can be done by adding a flag that specifies the form in which rates are output. Something like

Rate_Output_In_Conc_Per_Time (with True as default?)

Here "Output_In_Conc" could however be construed as referring to both species and rates). Therefore "Species_Output_In_Conc" would be better than "Output_In_Conc". Too late for that now? PySCeS could easily be made back-compatible with models that still use "Output_In_Conc".

In principle this is all true, but refer to my reply to @bgoli above. I am not sure if its possible to implement a solid, foolproof, consistent, automated Rate_Output_in_Conc_Per_Time in the case of multiple compartments. Consider the example 7.8 on p. 130 of the SBML spec. Here the transport reaction is not even explicitly assigned to a "compartment" such as a membrane, but is just defined to transport a substance between two compartments (cytosol and nucleus):

<listOfSpecies>
    <species id="Y1n" compartment="nucleus" initialAmount="1" constant="false"
                      boundaryCondition="false" hasOnlySubstanceUnits="false"/>
    <species id="Y1c" compartment="cytoplasm"  initialAmount="0" constant="false"
                      boundaryCondition="false" hasOnlySubstanceUnits="false"/>
</listOfSpecies>

<reaction id="transport" reversible="true">
    <listOfReactants>
        <speciesReference species="Y1n" stoichiometry="1" constant="true"/>
    </listOfReactants>
    <listOfProducts>
        <speciesReference species="Y1c" stoichiometry="1" constant="true"/>
    </listOfProducts>
    <kineticLaw>
        <math xmlns="http://www.w3.org/1998/Math/MathML">
            <apply>
            <times/>
            <ci>cytoplasm</ci>
            <ci>KT</ci>
            <apply>
            <minus/>
            <ci>Y1n</ci>
            <ci>Y1c</ci>
            </apply>
            </apply>
        </math>
    </kineticLaw>
</reaction>

I would rather have an automated conversion of the reaction rates (see above) for single compartment models with Output_In_Conc=True, and for multi-compartment models issue a warning that reaction rates are in amounts per time (similar to what is currently done by the rather obscure warning on flow constants (see above); this currently however is only shown when warnings are enabled).

We should also keep in mind that the current Species_In_Conc PySCeS keyword is directly modelled on the SBML attribute hasOnlySubstanceUnits="false" and is used in the translation from SBML<->PSC. It determines if the species terms in the kinetic laws should be interpreted as amounts or concs.

2. ODEs in compartmental models are in amount/time and it may be that the modeller requires access to the ODE values (dn_x/dt). I may be mistaken, but I don't see a data_sim method that provides access to this, but I suppose it would be easy to implement. If so, then one could output the value of a dx/dt ODE by dividing dn_x/dt by the compartment size in which X occurs (not necessarily a volume!).

PySCeS has a hidden method mod._EvalODE which does this. The method calls mod._EvalREq, which does the conversion from concentrations to amounts if necessary (compartmental models). I don't think there would be a use case for this since we are also not outputting these values and keeping them after a simulation. I would think access to the reaction rates is much more important (although the ODEs are of course easier to scale as each species is uniquely assigned to a compartment).

3. What if one wants to output the rate at which a particular species changes with time due to a particular reaction? For the amount of a species X the relevant equation is eq.6, which, when divided by the compartment size in which X occurs, will give the rate in concentration/time.

This is very similar to your point 2, in fact it is just one term of the ODE for X.

For the moment, 2 and 3 can be implemented by the modeller with assignment rules.

And indeed it is now possible to track assignment rules again during simulations (release 1.0.2), so I would not build additional functionality here.

bgoli commented 2 years ago

@jmrohwer being SBML compatible (export/import) does not mean limiting ourselves to it or the way it has evolved :-) Let me put it this way:

If we know the location (compartment) of the Reaction and the individual Species how far can we get with automatically generating amount/time ODE's with concentration based rates.

From @jhsh Point 1 above (hopefully :-)) a reaction must be located in a compartment, and, then that compartment can then be used to calculate the amount of enzyme per unit volume or area etc. If this is true then the only "problem" is if you have badly designed models where enzymes float on the interface between compartments (likely SBML example models that make no sense). On the other hand, for PySCeS models it should be easier to use biochemical Rate Equations in multicompartment models - assuming we require all Reactions/Species to have a location in a compartment.

SBML models generally do not have the reaction compartment defined, although I think the attribute exists in L3, so we need to treat them differently ... as always.

(Edit. Note, one thing we should keep in mind when using compartment sizes for single compartment models is scaling, we will have to detect very small compartment sizes and autoscale the compartment sizes pre/post ODE evaluation simulation - no matter which direction we go.)

jhsh commented 2 years ago

Quoting @jmrohwer:

"Indeed, to quote the preceding paragraph from the spec, as it states this unambiguously:

In multicompartmental models, to be able to specify a rate law only once and then use it unmodified in
equations for different species, the rate law needs to be expressed in terms of an intensive property, that is,
species quantity/time, rather than the extensive property typically used, species quantity/size/time. As a
result, modelers and software tools in general cannot insert traditional textbook rate laws unmodified into
the math element of a KineticLaw. The unusual term “kinetic law” was chosen to alert users to this difference."

This is, as I have often mentioned before, the wrong way round: species quantity/time is an extensive property that is proportional to the size of the compartment in which the reaction events occur, while species quantity/size/time is an intensive property that does not depend on the size of the compartment. I make this clear in my paper. I do wish someone would change this in the SBML documentation.

Quoting @bgoli:

"SBML models generally do not have the reaction compartment defined, although I think the attribute exists in L3"

Such an attribute is crucial in compartmental models and should be there. What happens when "v1@Cell:" in a PySCeS input file is translated to SBML?

jmrohwer commented 2 years ago

Quoting @jhsh

Quoting @bgoli:

"SBML models generally do not have the reaction compartment defined, although I think the attribute exists in L3"

Such an attribute is crucial in compartmental models and should be there. What happens when "v1@Cell:" in a PySCeS input file is translated to SBML?

The attribute exists in L3 but it is optional (i.e. not required). So when a PySCeS input file as above is translated to SBML, then the attribute is included. In the example I cited from the SBML spec above, the attribute is missing though.

I see that in the example above (this post) the kinetic law is multiplied by cytoplasm to convert it to amount. While technically correct in terms of the units, this is rather arbitrary and could just as well have been scaled by nucleus, which would have given a different answer. Because the compartment of the reaction is not defined, it is impossible in this case to uniquely specify a kinetic law if species are given in concentrations.

bgoli commented 2 years ago

@jmrohwer we currently only export L2 so I don't think that attribute exists. What I would do with L3 is set the Reaction compartment attribute to the correct compartment and then multiply the RateEquation with the PySCeS defined compartment size to create the associated KineticLaw for export.

SBML export is not the problem, import is. If the Reaction compartment is not defined, I can't easily extract the correct compartment term from the KineticLaw so I need to treat the SBML KineticLaw as "something" that returns rates in amount/time (as we do now). PySCeS could, if we can get it to work, deal with multi-compartment RateEquation models in a slightly more sophisticated way than SBML, whilst still being SBML3 compatible

@jhsh has pointed out the issues with the SBML example which, as you see, is dodgy and the intrinsic/extrinsic issue which is just wrong and is registered as an SBML specification issue.

jmrohwer commented 2 years ago

@jmrohwer we currently only export L2 so I don't think that attribute exists. What I would do with L3 is set the Reaction compartment attribute to the correct compartment and then multiply the RateEquation with the PySCeS defined compartment size to create the associated KineticLaw for export.

Well, if the compartment is defined for the reaction in SBML, this gets correctly translated to PSC and I get v1@Cell or whatever. That's what triggered the whole issue in the first place. So SBML->PSC works (even though PySCeS spits out a warning about only supporting L2). I must say I haven't tested export (PSC->SBML or mod->SBML).

SBML export is not the problem, import is. If the Reaction compartment is not defined, I can't easily extract the correct compartment term from the KineticLaw so I need to treat the SBML KineticLaw as "something" that returns rates in amount/time (as we do now). PySCeS could, if we can get it to work, deal with multi-compartment RateEquation models in a slightly more sophisticated way than SBML, whilst still being SBML3 compatible

Okay I get what you're saying here.

jmrohwer commented 2 years ago

FWIW I've just checked that during the SBML->PSC conversion PySCeS assigns the reaction to a compartment (i.e. v1@Cell) even if the compartment is not listed as an attribute of the reaction in the SBML. When exporting the SBML again, the compartment attribute is also not included in the reaction. Round-tripping (PSC->SBML->PSC) gives identical PSC files (other than the generated date in the header).

This is all only tested with single compartment models. Not sure what would happen in the case of multi-compartment models.

bgoli commented 2 years ago

Ja so it looks like if the model was from SBML, or, a PySCeS model that had rate laws in amount/time it would be correct SBML. Whereas if a PySCeS model was defined with no compartments the SBML KineticLaw would technically be incorrect, although practically the model would be exported with a compartment of size 1.0 and still work for that value!

This is a good time to discuss this as SBML, Python and libSBML have evolved a lot since this code was written and now that people are actually using SBML we can't just say "just use PSC" anymore. Also the person that wrote this stuff now knows a bit more about libSBML and Python :-)

So some things to consider (assume we are eventually moving to L3).

Maybe we should do something like: 1) deciding how we would like to work with compartments in PySCeS (and PSC) 2) deal with single compartments: -- make single compartment SBML import/export rock solid -- ensuring PSC export works consistently and is documented as such. 3) deal with multicompartment SBML import/export

We can do this in multiple releases as we fix essential stuff first etc.

jmrohwer commented 2 years ago

Sounds like a plan :grinning:

On the topic of moving to L3: as an aside, I've put in an Honours project for a Bioinformatics student to work on updating the events framework in PySCeS to be fully L3 compliant (handling of simultaneous events with priority). However, I don't know yet whether there will be any takers.

bgoli commented 2 years ago

Sounds good. First issue (a lot of which has been discussed here so we can continue here or split). First attempt at some use cases, please feel free to add/extend/modify:

Defining and interpreting no-, single- and multi-compartment PySCeS models (in PSC).

Case 1: A vanilla PSC model with no compartments defined

Case 2: A PSC model defined with a single compartment

Case 3: A PSC model defined with a multiple compartments

For each use case we should think about things like

1) What form can (should?) the rate law be written as (REq or KL or both) and which model properties are required in each case (if different)? 2) What model properties or keywords will be required (in general)? 3) What needs to be done, automagically, to translate this model into amount/time ODE's? 4) ...