modelica / ModelicaSpecification

Specification of the Modelica Language
https://specification.modelica.org
Creative Commons Attribution Share Alike 4.0 International

Type checking versus unit checking #3382

Open gwr69 opened 1 year ago

gwr69 commented 1 year ago

A cursory (don't beat me) reading of some recent discussions (e.g., #3259, #3381) leaves me wondering whether the unit attribute is to become the "one attribute to rule them all". As the author of the Business Simulation library, a system dynamics library that makes use of acausal connectors and (notoriously) unconventional types like Amount and AmountRate, I am coming more from the "cybernetic" side of "cyber-physical" modeling. Nonetheless, or perhaps exactly because of this more abstract nature, verification (e.g., will the model result in a solvable set of equations?) and validation (e.g., is it meaningful with regard to the intended purpose?) of models are important.

In programming, we typically use types to catch errors (statically or dynamically inferred by compilers). And indeed I adopted this principle by having modellers make type choices for causal (mostly output) connectors, which are then used within a component for variables and parameters (the nature of abstract libraries makes it much harder to make these definite choices when designing components).

But setting an explicit quantity attribute for causal connectors has just been called "inconvenient" and "unusual" (#3381) by Hans Olsson. I would argue the contrary: it could be very convenient to set the type using a replaceable type, and the beauty of it is that, next to the quantity and unit attributes, one can also set displayUnit or min/max in a single choice. This nudges users not to change the unit when what they need is a different displayUnit (e.g., by offering a derived type Time_months).
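As a minimal sketch of what such a single choice could look like: the names TypeChoiceSketch, Time_days, Sensor and OutputType are hypothetical and not taken from any library, and Time_days stands in for a derived type like Time_months only because "d" is a display unit tools commonly know.

```modelica
package TypeChoiceSketch
  // Base type: quantity and unit are fixed, displayUnit is left open.
  type Time = Real(final quantity = "Time", final unit = "s");

  // Derived type: same quantity and unit, only the displayUnit differs.
  type Time_days = Time(displayUnit = "d");

  block Sensor "Hypothetical component with a typed causal output"
    replaceable type OutputType = Real
      annotation(choices(
        choice(redeclare type OutputType = Time "Time [s]"),
        choice(redeclare type OutputType = Time_days "Time [s], displayed in d")));
    output OutputType y "Quantity, unit and displayUnit all follow from OutputType";
  equation
    y = time;
  end Sensor;

  model Example
    // One redeclaration sets quantity, unit and displayUnit together.
    Sensor sensor(redeclare type OutputType = Time_days);
  end Example;
end TypeChoiceSketch;
```

The point of the sketch is that the user picks from a short list of types and never has to touch the unit string itself.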

I note that the predefined types Boolean, Integer, String, and even Enumeration all carry the quantity attribute. Maybe it should help us to distinguish fundamental differences even when a unit-system is not applicable:

  String str1(quantity = "Message") = "This is likely a unit error.";
  String str2(quantity = "Documentation") = "This is a basic converter class.";
  String str3(quantity = "Label") = "SuperClass";

  type Color = enumeration(red, yellow, green);

  parameter Color condition(quantity = "Reliability") = Color.red;
  Color signal(quantity = "TrafficSignal"); 

Make no mistake: It is absolutely essential that unit errors are caught (e.g., so as not to mix mm and m as in Quentin's example in #3381). But to me quantity is more likely to catch fundamental equation errors (e.g., there is no meaningful way to propagate magnitudes from one side of an equation to the other), while a unit mismatch is more likely to be an error in the order of magnitude (i.e., a numerical error only).

I had believed that this fundamental quality of quantity was the reason to make it—not the unit attribute—a restriction for connections in the section 9.3 of the specs:

  • In a connection set all variables having non-empty quantity attribute must have the same quantity attribute.

Hans Olsson's post suggests that setting the quantity attribute for causal connectors is "inconvenient" to do and that it "cannot even be inferred". But that is exactly what makes quantity so valuable in catching fundamental errors, and this applies even to causal connections: if you want to measure Time, you should not connect to a sensor that measures Length or Speed.
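To make the connection rule concrete, here is a small sketch under my own assumptions: all names (QuantityRuleSketch, TimeOutput, LengthInput, Clock, LengthDisplay) are hypothetical. A tool that applies the rule quoted above would be expected to reject the connect statement, because both connectors carry a non-empty quantity and the two differ.

```modelica
package QuantityRuleSketch
  connector TimeOutput = output Real(final quantity = "Time", final unit = "s");
  connector LengthInput = input Real(final quantity = "Length", final unit = "m");

  block Clock "Source that emits a Time signal"
    TimeOutput y;
  equation
    y = time;
  end Clock;

  block LengthDisplay "Sink that expects a Length signal"
    LengthInput u;
  end LengthDisplay;

  model Mismatch
    Clock clock;
    LengthDisplay display;
  equation
    // Connection set {clock.y, display.u}: "Time" versus "Length".
    connect(clock.y, display.u);
  end Mismatch;
end QuantityRuleSketch;
```

Here the units disagree as well, so a unit check would also complain; the quantity rule matters most where the units happen to coincide.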

I would like to stimulate the discussion here with the following:

  1. While unit checking is important, make setting quantity and especially type more convenient for modelers in order to catch fundamental errors in equation formulation and to make it convenient to set multiple attributes at once.

  2. A model may have numerical errors but still give a solvable set of equations. If quantities match up, "mere" unit errors should give a warning but not necessarily prevent models from being simulated. (Note that many of the unit error examples given in #3381 are in fact (dimensional) quantity errors, mixing, say, m and s.)

henrikt-ma commented 1 year ago

I would like to stimulate the discussion here with the following:

  1. While unit checking is important, make setting quantity and especially type more convenient for modelers in order to catch fundamental errors in equation formulation and to make it convenient to set multiple attributes at once.

To me, this sounds like either a tooling issue, or a problem to be solved by proper library design using existing language features. Did you see something also missing on the language side?

  2. A model may have numerical errors but still give a solvable set of equations. If quantities match up, "mere" unit errors should give a warning but not necessarily prevent models from being simulated. (Note that many of the unit error examples given in #3381 are in fact (dimensional) quantity errors, mixing, say, m and s.)

To me, numerical errors in a successful simulation are the worst kind of errors. It ruins trust in the tools we use and the models we develop. For this reason, I strongly believe that unit errors should not be allowed by the specification. As @qlambert-pro pointed out, tools may give an option to bypass the specification and ignore unit errors, but in my opinion models with these errors should have no place in the world of valid Modelica models.

HansOlsson commented 1 year ago

I would like to stimulate the discussion here with the following:

  1. While unit checking is important, make setting quantity and especially type more convenient for modelers in order to catch fundamental errors in equation formulation and to make it convenient to set multiple attributes at once.

To me, this sounds like either a tooling issue, or a problem to be solved by proper library design using existing language features. Did you see something also missing on the language side?

I can see that the library design is somewhat missing in that we have Modelica.Units.SI for normal variables, but nothing similar for causal connectors, and we could imagine some way of extending the language to make declaring e.g., a Length-connector easier.

However, obviously tools can help without any design changes, e.g., Dymola 2020 introduced "Set Unit" for connectors and connections.

  2. A model may have numerical errors but still give a solvable set of equations. If quantities match up, "mere" unit errors should give a warning but not necessarily prevent models from being simulated. (Note that many of the unit error examples given in #3381 are in fact (dimensional) quantity errors, mixing, say, m and s.)

To me, numerical errors in a successful simulation are the worst kind of errors. It ruins trust in the tools we use and the models we develop. For this reason, I strongly believe that unit errors should not be allowed by the specification. As @qlambert-pro pointed out, tools may give an option to bypass the specification and ignore unit errors, but in my opinion models with these errors should have no place in the world of valid Modelica models.

I don't think anyone wants to ignore actual unit errors. However, in several cases it may be good to have the possibility to disable the detailed unit-checking for an equation (or entire model) - as discussed in https://github.com/modelica/ModelicaStandardLibrary/issues/4097

Note in particular:

So, I see some reasons for giving users the possibility to disable unit-checking:

One might even see an additional benefit of not unit-casting correlations, since a major risk is instead that the numbers are just a bit incorrect as in https://github.com/modelica/ModelicaStandardLibrary/issues/4097 and thus easily finding the use of such models has a benefit.

gwr69 commented 1 year ago

I would like to stimulate the discussion here with the following:

  1. While unit checking is important, make setting quantity and especially type more convenient for modelers in order to catch fundamental errors in equation formulation and to make it convenient to set multiple attributes at once.

To me, this sounds like either a tooling issue, or a problem to be solved by proper library design using existing language features. Did you see something also missing on the language side?

Yes, a lot of this may be a "tooling" issue. System Modeler, for example, allows the user to address type for a causal connector class:

[Image: screenshot of the System Modeler dialog, not preserved]

I may simply have missed how to fit "my list of preferred types" into the Non-SI unit list in that menu. Note that the Other field completely lacks the convenience of the SI unit and Non-SI unit fields, which are supported by drop-down menus. Since annotations are not inherited, I found it hard to work with, for example, the choices annotation for selecting a type.

  2. A model may have numerical errors but still give a solvable set of equations. If quantities match up, "mere" unit errors should give a warning but not necessarily prevent models from being simulated. (Note that many of the unit error examples given in #3381 are in fact (dimensional) quantity errors, mixing, say, m and s.)

To me, numerical errors in a successful simulation are the worst kind of errors. It ruins trust in the tools we use and the models we develop. For this reason, I strongly believe that unit errors should not be allowed by the specification. As @qlambert-pro pointed out, tools may give an option to bypass the specification and ignore unit errors, but in my opinion models with these errors should have no place in the world of valid Modelica models.

I am quite thankful for the examples given by HansOlsson above, and it irritates me to read your reply as if I had demanded to abandon unit checks. I am all in for unit checks, and in fact I made sure users are nudged to select appropriate types even on a component level so as to make finding unit errors easier, at a time when unit checking had not yet been implemented in my tool of choice!

What I was trying to argue for was to come up with a framework that is convenient and easy to use. To me, working with something like replaceable type achieved just that in a legal way, i.e., according to the Modelica specs. Assigning a type in a convenient way makes it easy to assign many attributes at once, and I felt that quantity is at least as important as unit. As a library designer, I wanted to make it easy for a user to think in dimensions for a variable, which in many cases will have canonical units, so that displayUnit should be changed, not unit.

Dimensional analysis extends the concept of unit checking—it does not bypass it...

gwr69 commented 1 year ago

Just as a loose idea: Could there be sections within a model that are encapsulated with regard to unit checking? Ultimately, at the boundary of a model we already deal with information sources and we will "paint numbers", that is, we will enter or read in, say, a Real-valued magnitude and assign a quantity and matching unit. So, on the cybernetic side of models there may be whole model parts where unit and quantity are simply attributes for some output that "we" need to (or at least have reasons to) trust. Will a neural network based controller, which is fed physical data to come up with, say, a vector y that is used as some control input u elsewhere, ever pass unit checking? (This may simply be another way of describing the good old practice of making such input dimensionless as a first step to make it exempt from unit checking.)
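A rough sketch of that practice, with hypothetical names (NormalizedInput, T_ref) and an arbitrarily chosen reference value: the physical signal is divided by a reference of the same unit, so everything downstream only sees unit "1".

```modelica
block NormalizedInput "Hypothetical boundary between physical and cybernetic part"
  input Real T(quantity = "ThermodynamicTemperature", unit = "K")
    "Measured physical value";
  parameter Real T_ref(quantity = "ThermodynamicTemperature", unit = "K") = 300
    "Reference value, assumed here for illustration";
  // K divided by K: the result is dimensionless and can be fed to a
  // controller whose internals carry no units at all.
  output Real u(unit = "1") = T / T_ref
    "Dimensionless value handed to the controller";
end NormalizedInput;
```

The output side of such a controller would need the reverse step, where a dimensionless number is multiplied by a reference value to regain a quantity and unit.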

henrikt-ma commented 1 year ago

Make no mistake: It is absolutely essential that unit errors are caught (e.g., so as not to mix mm and m as in Quentin's example in #3381). But to me quantity is more likely to catch fundamental equation errors (e.g., there is no meaningful way to propagate magnitudes from one side of an equation to the other), while a unit mismatch is more likely to be an error in the order of magnitude (i.e., a numerical error only).

I'm of the opposite opinion; I find unit errors to be fundamental, whereas quantity errors (given consistent units) may be of a softer kind. For example, this quantity inconsistency doesn't necessarily look like a fundamental modeling error to me:

Real x(unit = "J", quantity = "Work") = 1;
Real y(unit = "J", quantity = "Energy") = x;

Note that the two are defined side by side in Modelica.SIunits, so it's not that the quantities have been defined independently in different libraries:

  type Work = Real(final quantity = "Work", final unit = "J");
  type Energy = Real(final quantity = "Energy", final unit = "J");

I had believed that this fundamental quality of quantity was the reason to make it—not the unit attribute—a restriction for connections in the section 9.3 of the specs:

  • In a connection set all variables having non-empty quantity attribute must have the same quantity attribute.

My guess is that this was simply the easy part to define for a connection set. I believe it is generally expected that tools will also reject connection sets where units do not agree, even though it isn't stated in the specification (probably because defining unit checking is complicated).

gwr69 commented 1 year ago

Make no mistake: It is absolutely essential that unit errors are caught (e.g., so as not to mix mm and m as in Quentin's example in #3381). But to me quantity is more likely to catch fundamental equation errors (e.g., there is no meaningful way to propagate magnitudes from one side of an equation to the other), while a unit mismatch is more likely to be an error in the order of magnitude (i.e., a numerical error only).

I'm of the opposite opinion; I find unit errors to be fundamental, whereas quantity errors (given consistent units) may be of a softer kind. For example, this quantity inconsistency doesn't necessarily look like a fundamental modeling error to me:

Real x(unit = "J", quantity = "Work") = 1;
Real y(unit = "J", quantity = "Energy") = x;

Note that the two are defined side by side in Modelica.SIunits, so it's not that the quantities have been defined independently in different libraries:

  type Work = Real(final quantity = "Work", final unit = "J");
  type Energy = Real(final quantity = "Energy", final unit = "J");

I had believed that this fundamental quality of quantity was the reason to make it—not the unit attribute—a restriction for connections in the section 9.3 of the specs:

  • In a connection set all variables having non-empty quantity attribute must have the same quantity attribute.

My guess is that this was simply the easy part to define for a connection set. I believe it is generally expected that tools will also reject connection sets where units do not agree, even though it isn't stated in the specification (probably because defining unit checking is complicated).

I agree as ultimately units can be broken down to elementary base units in SI (dimensional analysis). I had something like torque as opposed to work/energy in mind, which have compatible or even identical units (N.m) but should be treated differently in equations.

EDIT: The examples torque, work, energy are instructive as the base unit would be different with N.m being displayUnit for work and energy and base unit for torque. Interestingly, the current specs would prevent connecting work and energy connectors as incompatible. So, would unit compatibility ultimately override quantity mismatch?
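For illustration, a sketch of the torque case, assuming a Modelica Standard Library version in which the SI types live in Modelica.Units.SI (the model name TorqueVersusEnergy is hypothetical):

```modelica
model TorqueVersusEnergy
  import Modelica.Units.SI;
  SI.Torque tau = 1 "quantity = Torque, unit = N.m";
  // N.m and J both reduce to kg.m2/s2, so a check on base units alone
  // sees nothing wrong here; only the quantity strings differ.
  SI.Energy E = tau "quantity = Energy, unit = J";
end TorqueVersusEnergy;
```

Whether a tool should flag the second declaration is exactly the open question; the sketch only shows that unit information by itself cannot tell the two apart.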

henrikt-ma commented 1 year ago

I agree as ultimately units can be broken down to elementary base units in SI (dimensional analysis). I had something like torque as opposed to work/energy in mind, which have compatible or even identical units (N.m) but should be treated differently in equations.

Sure, it would be cool if such errors could also be detected. It's just not on the agenda, probably mostly because we don't have a system that allows us to determine whether a force multiplied by a length is a torque, a work, or something else (possibly library-defined quantity).

EDIT: The examples torque, work, energy are instructive as the base unit would be different with N.m being displayUnit for work and energy and base unit for torque. Interestingly, the current specs would prevent connecting work and energy connectors as incompatible. So, would unit compatibility ultimately override quantity mismatch?

No, or at least I haven't seen any indication of anyone wanting to tear up the quantity rule for connection sets. I'd expect a conversion from "Work" to "Energy" to be needed before allowing them in the same connection set. The conversion could have the form of a declaration equation (as in the example I gave above).
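Such a conversion could be sketched as a small block; all names here (WorkEnergySketch, WorkInput, EnergyOutput, WorkToEnergy) are hypothetical, and the sketch assumes that a declaration equation between equal units but different quantities is accepted, as in the example above.

```modelica
package WorkEnergySketch
  connector WorkInput = input Real(final quantity = "Work", final unit = "J");
  connector EnergyOutput = output Real(final quantity = "Energy", final unit = "J");

  block WorkToEnergy "Hypothetical explicit relabeling of the quantity"
    WorkInput u;
    // Declaration equation: units agree, only the quantity changes.
    EnergyOutput y = u;
  end WorkToEnergy;
end WorkEnergySketch;
```

With the block in between, u sits in a connection set that only contains "Work" and y in one that only contains "Energy", so the quantity rule for connection sets is never violated.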

gwr69 commented 1 year ago

Thinking about this a bit more, I believe I should find myself corrected with regard to quantity for (at least) causal connectors. From a philosophical point of view the quantity for all causal connectors may be seen to be

quantity = "Information"

A causal connector by its very nature is not a physical quantity but a cybernetic one, e.g., a "painted" number. Is it worth thinking about having different connection set rules for causal as opposed to acausal connectors?

gwr69 commented 1 year ago

Excellent comment to be linked from here: https://github.com/modelica/ModelicaSpecification/issues/2127#issuecomment-1271723893