pik-copan / pycopancore

reference implementation of the copan:CORE World-Earth modelling framework
http://www.pik-potsdam.de/copan

Some units for Variable don't work #182

Open leander-j opened 1 year ago

leander-j commented 1 year ago

For creating rate variables in the interface, only the unit D.years**(-1) works; other time units such as days, weeks, or months do not. No error message is produced, and the rate is treated as 1 / D.years when used to produce the next Step or Event time, regardless of the specifications in the default or run file.

mensch72 commented 1 year ago

Can you provide an example? If I try

import pycopancore.data_model.master_data_model as D
from pycopancore import Variable
import pycopancore.models.example1 as M

model = M.Model()

c = M.Culture(environmental_friendliness_learning_rate=1, awareness_update_rate=1/D.weeks)

print(c.environmental_friendliness_learning_rate, c.awareness_update_rate)

then I correctly get 1 52.177499999999995 (both Variables have unit=D.years**-1 in the interface declaration)

leander-j commented 1 year ago

I mean the interface declaration. If you choose e.g. unit=D.months**(-1) it does not work. More specifically: the step timing does not work.

Code snippets:

meeting_rate = 1 / D.months
print(1*D.years + 1/meeting_rate)

The above correctly produces 1.0833333333333333 yr, however

        Step("do meeting",
              [B.Culture.worlds.individuals.engagement_level],
              [next_meeting_time,
               do_meeting
               ]
              ),

where

    def next_meeting_time(self, t):
        return t + 1/self.meeting_rate

produces meetings at 1.0, 2.0, and so on

Note that when the function is defined in the run script instead,

def next_meeting_time(t):
    return t + 1/meeting_rate
print(next_meeting_time(1*D.years))

it returns the correct time.

mensch72 commented 1 year ago

OK I see. Here you have spotted a nasty one indeed.

What happens is this: For performance reasons, the values of Variables are stored as simple numbers in the individual entity attributes, to be interpreted in the unit that was listed in the interface. Also the current model time, t, is stored as a simple number, to be interpreted in the default time unit, years.

When a user provides a DimensionalQuantity object such as 1*D.years or 1/D.weeks as the initial value of a Variable, either as the default value in the interface or upon entity instantiation, then this dimensional quantity is converted into the Variable's unit and stored in the attribute. The same happens when setting a Variable's value later on via the Variable object's set_value method and using a DimensionalQuantity as its argument.

However, when reading the Variable's value inside a method of the entity, or when using the time argument t that the runner passed in its call to a method, the unit information is not used automatically, because Python just sees a number at that point and does not know it is actually meant as a dimensional quantity having a certain unit. So when adding to that purely numerical value some other DimensionalQuantity as in your example, the numerical value gets interpreted in the same unit as the DimensionalQuantity, in your case as months rather than years, causing the bug.
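In bare numbers, the failing example above boils down to something like this (purely illustrative Python with made-up names, not the actual runner or entity code):

# Illustration only -- not pycopancore internals.
# With unit=D.months**(-1) in the interface, the rate "1 per month"
# is stored as a bare float; the model time t is a bare float in years.
stored_meeting_rate = 1.0   # meant as "1 per month" (the Variable's unit)
t = 1.0                     # meant as "1 year" (the runner's time unit)

# what next_meeting_time effectively computes inside the runner:
next_meeting = t + 1 / stored_meeting_rate
print(next_meeting)         # 2.0 -- read back as "2 years" instead of 1 year + 1 month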

When you call the method from outside the runner with a DimensionalQuantity like 1*D.years, as in your example, then the addition is between two DimensionalQuantity objects of the same dimension (time) but different units, which works since the DimensionalQuantity class then automatically does the correct conversion.

The most convenient solution would be to store all values as DimensionalQuantities rather than as simple numbers, because this would not only prevent unit mix-ups of the kind you saw, but would also prevent invalid operations between quantities of different dimensions (like adding a time to a length). But it would probably slow down simulations considerably.

An alternative would be to disable the choice of a unit when declaring a Variable and only allow the specification of a Dimension, and then store all supplied values in that Dimension's default unit. This would be a breaking change in that some interface code would become invalid and would have to be fixed (by replacing the unit spec with a dimension spec and adjusting any provided numerical arguments such as default, lower/upper bound, or quantum).
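A rough sketch of the storage rule this would imply (stand-alone toy code, not pycopancore; it assumes the default unit for the time dimension is years, with hard-coded conversion factors):

# Toy sketch: with only a dimension declared, every supplied value comes with
# its unit and is stored as a bare number in the dimension's default unit.
TIME_IN_YEARS = {"years": 1.0, "months": 1.0 / 12, "weeks": 7.0 / 365.2425, "days": 1.0 / 365.2425}

def store(value, unit):
    """Convert a supplied time quantity into the dimension's default unit (years)."""
    return value * TIME_IN_YEARS[unit]

meeting_interval = store(1, "months")   # stored as ~0.0833 (years)
t = 1.0                                 # model time, also in years
print(t + meeting_interval)             # 1.0833... -- the intended next meeting time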

An additional safety measure would be to automatically convert a DimensionalQuantity into its dimension's standard unit when it is added to a simple number.
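A minimal sketch of what that safety measure could look like (a toy class, not the actual DimensionalQuantity implementation; the unit factors and names are made up for the example):

# Toy sketch of the proposed behaviour: adding a plain number to a dimensional
# quantity interprets the number in the dimension's standard unit and returns
# a plain number in that standard unit.
STANDARD_FACTORS = {"years": 1.0, "months": 1.0 / 12, "weeks": 7.0 / 365.2425}

class ToyQuantity:
    """Stand-in for a dimensional quantity of dimension time."""

    def __init__(self, value, unit):
        self.value, self.unit = value, unit

    def in_standard_unit(self):
        """Value expressed in the dimension's standard unit (years)."""
        return self.value * STANDARD_FACTORS[self.unit]

    def __add__(self, other):
        if isinstance(other, (int, float)):   # plain number: interpret it in the standard unit
            return self.in_standard_unit() + other
        return self.in_standard_unit() + other.in_standard_unit()

    __radd__ = __add__                        # so "t + quantity" also triggers the conversion

t = 1.0                                       # bare model time in years
print(t + ToyQuantity(1, "months"))           # 1.0833... instead of the erroneous 2.0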

These two fixes would prevent unit mix-ups of the kind you saw, but would not prevent dimension mix-ups like adding a time to a length.

What do you think?

leander-j commented 1 year ago

I think it would make sense to always store each variable under the hood in a standard unit for its dimension, and to do the calculations with pure numbers rather than DimensionalQuantities. Being able to declare a unit in the interface in which input is given still makes sense, e.g. when you have some variables working on the scale of seconds and others on the scale of years and you don't want to type units with every input, or when exchanging data with other formats such as JSON or YAML. With this solution, interfaces can stay the way they are and computation can stay as fast as it currently is.
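A sketch of that idea (toy code, not pycopancore; it assumes the declared interface unit is months and the internal standard unit for time is years):

# Toy sketch: the interface declares a unit for convenient input/output, but
# internally everything is stored and computed in the standard unit (years) as bare floats.
MONTHS_TO_YEARS = 1.0 / 12   # interface unit (months) -> standard unit (years)

def set_meeting_interval(value_in_months):
    """Input given in the declared interface unit (months), stored in years."""
    return value_in_months * MONTHS_TO_YEARS

def get_meeting_interval(stored_value):
    """Output converted back to the interface unit (months), e.g. for JSON/YAML export."""
    return stored_value / MONTHS_TO_YEARS

interval = set_meeting_interval(1)     # the user types "1", meaning 1 month
t = 1.0                                # model time in years
print(t + interval)                    # 1.0833... -- correct step timing with bare floats
print(get_meeting_interval(interval))  # 1.0 -- what the user-facing side would show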

To prevent dimension and unit mix-ups, the code could be checked just once at initialization, rather than at every time step.
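For example, such a one-time check could compare each Variable's declared dimension with the dimension of the value supplied for it at model setup and fail before the first step (a toy sketch with made-up bookkeeping, not pycopancore code):

# Toy sketch of a one-time dimension check at model initialization.
# Dimensions are represented here by simple strings.
declared = {"meeting_rate": "1/time", "meeting_length": "time"}

def check_dimensions(supplied):
    """supplied: variable name -> dimension of the value the user provided."""
    for name, dim in supplied.items():
        if declared[name] != dim:
            raise TypeError(f"{name}: expected dimension {declared[name]}, got {dim}")

check_dimensions({"meeting_rate": "1/time"})      # passes silently
# check_dimensions({"meeting_length": "1/time"})  # would raise TypeError before the run starts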