<true> and <false> in MathML is permitted or not?

kerimoyle commented 4 years ago

@MichaelClerx said:

Jan 28, 2019 Not sure why we have these! Since a variable can never be true or false, it seems all you can do with these is write things like "if((x == 1) == true) then..."

@nickerso said:

9:08 AM Dec 17 we need to think more about how to handle logical values, e.g., a = x && y

@hsorby said:

1:21 PM Dec 17 Do we have to process this further and say it is 1.0 i.e. that it is not a logical value?

... and the final verdict is ... ? Do the <true> and <false> tags stay as allowable values in MathML or not?

MichaelClerx commented 4 years ago

This whole thing seems unresolved, no?

We probably need to introduce at least a number type and a boolean type (whatever CellML 1 says, boolean is most definitely not a unit), and then say that variables' values must be number types, while certain operators only take boolean types.
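
A minimal sketch of the number/boolean split described above, written in Python rather than CellML. The tuple-based expression format and the names `infer_type` and `check_assignment` are hypothetical, chosen only for illustration. Nothing here comes from the CellML specification or any existing tool.

```python
# Hypothetical mini type checker. Expressions are nested tuples such as
# ("lt", "x", 1.0) or ("and", ("lt", "x", "y"), "true").
NUMBER, BOOLEAN = "number", "boolean"

# operator -> (required operand type, result type)
OPERATORS = {
    "plus": (NUMBER, NUMBER),
    "times": (NUMBER, NUMBER),
    "lt": (NUMBER, BOOLEAN),
    "eq": (NUMBER, BOOLEAN),
    "and": (BOOLEAN, BOOLEAN),
    "or": (BOOLEAN, BOOLEAN),
}


def infer_type(expr):
    """Return NUMBER or BOOLEAN for expr, or raise TypeError."""
    if expr in ("true", "false"):       # the only members of the boolean type
        return BOOLEAN
    if isinstance(expr, str):           # any other name is a variable: always a number
        return NUMBER
    if isinstance(expr, (int, float)):  # numeric literal
        return NUMBER
    op, *operands = expr
    operand_type, result_type = OPERATORS[op]
    for operand in operands:
        found = infer_type(operand)
        if found != operand_type:
            raise TypeError(f"{op} expects {operand_type} operands, got {found}")
    return result_type


def check_assignment(variable, expr):
    """Variables hold numbers, so the right-hand side must be a number."""
    if infer_type(expr) != NUMBER:
        raise TypeError(f"cannot assign a boolean to variable {variable!r}")
```

Under these assumptions `("and", ("lt", "x", "y"), "true")` type-checks as a boolean, while `("plus", "true", 1.0)` and an assignment of `("lt", "x", "y")` to a variable are both rejected without looking at any values.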

I could see true and false having some use in debugging e.g. temporarily replacing some condition with <true>. But other than that...

kerimoyle commented 4 years ago

I reckon it comes under the heading of "you can if you want, but we don't fully understand why you'd want to". I don't see that the spec needs to exclude things on that basis - people should be able to make dodgy maths if they so desire :)

MichaelClerx commented 4 years ago

Well it's not just that, it means that any program dealing with CellML needs to implement this.

And it kinda means that we have to have a boolean type. Without true and false you could use the number hack and say if(0,..) doesn't happen while if(anything-else,...) does. Once there's a <true> and a <false> that becomes trickier because what do <true> and <false> mean then? You can't have x = <true> mV, for example. So it makes it more urgent for CellML to define that there's a number type and a boolean type, of which <true> and <false> are the only members...

kerimoyle commented 4 years ago

I take your point, but it's always going to be possible to misuse units and variables. You can have an equation which has one variable to the power of another, which could have a dimension ... just as you could have x = -100 Kelvin or other things which make no physical sense. What was the objection to having a boolean units item instead of a separate variable type? I wasn't in on that discussion?

MichaelClerx commented 4 years ago

I think those are three very different cases :D

x = -100 Kelvin is fine syntactically, and not an issue to implement. But rubbish physically
x = log (1 Volt) results in a model that can't be unit checked, or at least will fail unit checking
x = 7 booleans is ... nonsense?

CellML 1.0/1.1 tries to make boolean into a unit, but then they immediately had to make lots of extra rules to deal with the mess that creates (for example, no variable can have unit booleans, and booleans can't appear inside units definitions). But it also just doesn't make sense.

"x = 1 meter" means something. "x = 4 booleans" does not. Or "x = true booleans"? Then "true" has become a number again, so it doesn't even have the boolean unit and we still don't know what "true" means! booleans per second? 4 trues?

MichaelClerx commented 4 years ago

The issue is booleans are fundamentally a type. That's what they are. Trying to squeeze that into the concept of a unit is a recipe for disaster. We might as well decide that numbers are units. Or letters are functions. It makes no sense

MichaelClerx commented 4 years ago

Fwiw, here's what the 1.0 spec says about it

The <true> and <false> elements have units of cellml:boolean, where cellml:boolean is a set of base units defined purely for use in this specification. (Note that users may not define their own cellml:boolean units, as this is not a valid CellML identifier.) cellml:boolean units are not associated with variables or numbers, but can be produced as the result of the application of relational or logical operators, as discussed in Appendix C.3.3.

MichaelClerx commented 4 years ago

I really like CellML, but that bit is awful :D

"defined purely for use in this specification" What does that mean? Do they have that unit only when I'm reading the spec? What about when I'm validating? When I write CellML, does that count? Are we saying the spec is consistent but anything using the spec is not?
The "cellml:boolean" thing is a hack to fix the booleans-are-units hack, and I guess it helps because you're not allowed to fill it in as the value of any of the relevant attributes. But it still exists, so does that mean that conceptually I can have a variable called "x:y:z", it's just not allowed in the XML representation? (So e.g. libcellml could have "x:y" as a units name, it just has to complain if someone sticks it in an XML attribute value?)
"cellml:boolean units are not associated with variables or numbers". Which is exactly what units are. The main thing units are for is being associated with variables or numbers, but that's not allowed for this unit.

And see above, it still doesn't solve the problem of what <true> and <false> actually are. It just tells you that they have the world's weirdest units

MichaelClerx commented 4 years ago

Maybe a mathematical way of stating the difference is that they have a different algebra? Operators like "plus" are not defined for booleans, nor are functions like "log", while operators like "and" are defined for booleans but not for real numbers. Bools and numbers are different things, from disjoint sets (so not even like integers and reals, which overlap).

MichaelClerx commented 4 years ago

^^^ That was more than I intended to write. But I hope it shows that, if we don't write the spec carefully on this point, it can end up implying lots of nonsensical things. So while I don't suggest we write an awful lot about it into the spec, we do need to think it through so that you can build on the spec without getting into trouble.

kerimoyle commented 4 years ago

Ok, that all makes sense (thanks!) but from the other side, what are the consequences to removing true and false from the allowable tags? What's going to break?

MichaelClerx commented 4 years ago

Nothing I can think of!

I'm happy for them to stay if we prefer that, but either way we need to sort this type thing out :-)

kerimoyle commented 4 years ago

So @agarny @hsorby @nickerso @jonc125 ... can I remove <true> and <false> from the spec?

hsorby commented 4 years ago

Are we able to disallow the assignment of logical operators to variables? That is we only allow logical operations inside conditional statements. We would allow: if (x < y) but disallow: z = x < y.

agarny commented 4 years ago

> So @agarny @hsorby @nickerso @jonc125 ... can I remove <true> and <false> from the spec?

I would... after making sure that we are not breaking anything in doing so.

> Are we able to disallow the assignment of logical operators to variables? That is we only allow logical operations inside conditional statements. We would allow: if (x < y) but disallow: z = x < y.

I would be ok with that. If someone still wanted something like z = x < y, s/he could write something like z = (x < y) ? 1 : 0 instead (well, the CellML equivalent that is!).
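
For illustration, the "CellML equivalent" of `z = (x < y) ? 1 : 0` could be written as a MathML `<piecewise>`. The sketch below embeds that MathML in a Python string and evaluates it with a tiny, hypothetical evaluator (`evaluate` and `z_value` are made-up names covering only the elements used here). CellML would also require units on each `<cn>`; they are left out to keep the fragment short.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# z = (x < y) ? 1 : 0 written as a MathML piecewise (units on <cn> omitted).
Z_EQUATION = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <apply><eq/>
    <ci>z</ci>
    <piecewise>
      <piece>
        <cn>1</cn>
        <apply><lt/><ci>x</ci><ci>y</ci></apply>
      </piece>
      <otherwise>
        <cn>0</cn>
      </otherwise>
    </piecewise>
  </apply>
</math>
"""


def tag(element):
    """Return the element's tag without its namespace prefix."""
    return element.tag.split("}")[-1]


def evaluate(element, env):
    """Evaluate the small MathML subset used above (hypothetical helper)."""
    name = tag(element)
    if name == "cn":
        return float(element.text)
    if name == "ci":
        return env[element.text.strip()]
    if name == "apply":
        operator, *args = list(element)
        if tag(operator) == "lt":
            left, right = (evaluate(arg, env) for arg in args)
            return left < right
        raise ValueError(f"unsupported operator: {tag(operator)}")
    if name == "piecewise":
        for child in element:
            if tag(child) == "piece":
                value, condition = list(child)
                if evaluate(condition, env):
                    return evaluate(value, env)
            else:  # <otherwise>
                return evaluate(child[0], env)
        raise ValueError("no piece matched and no <otherwise> given")
    raise ValueError(f"unsupported element: {name}")


def z_value(x, y):
    """Evaluate the right-hand side of the z equation for given x and y."""
    root = ET.fromstring(Z_EQUATION)
    piecewise = root.find(f".//{{{MATHML_NS}}}piecewise")
    return evaluate(piecewise, {"x": x, "y": y})
```

Here the comparison only ever appears as the condition of a `<piece>`, so the variable `z` itself always receives a number.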

jonc125 commented 4 years ago

Well, we support them, and it's easy to do so. Only real use case (apart from implicitly as intermediates) is to hardcode a branch as always/never happening e.g. for testing.

MichaelClerx commented 4 years ago

I guess one reason to keep them in would be that evaluating e.g. 1 == 1 would result in <true>, and it'd be a bit weird if expressions in CellML could return a result that's not in CellML ?

MichaelClerx commented 4 years ago

I'm more concerned about writing up the type stuff than the literals, to be honest :D

kerimoyle commented 4 years ago

> I guess one reason to keep them in would be that evaluating e.g. 1 == 1 would result in <true>, and it'd be a bit weird if expressions in CellML could return a result that's not in CellML ?

... and it becomes odd to have <and> and <or> etc expressions without being able to store their parameters anywhere ...

kerimoyle commented 4 years ago

I'm going around in circles on this one now. I'm back at thinking that units are the simplest way. What about:

boolean becomes a built-in units item (like volts) being an overload of dimensionless in the same way as radian and steradian are. This means we can use them in normal equations, and that they can be assigned to variables for storage, manipulation, passing around the place.
Variables with value of zero and units of booleans are interpreted as false (following every other computer language ... )
Variables with non-zero value and units of booleans are interpreted as true (again, following every other language)

We state in the specification that:

the boolean units element cannot be used as the units attribute of a unit element (i.e. the boolean unit cannot be used in combination with others)

We state in the interpretation section that:

the MathML tag <true> is interpreted as a constant with value 1 and units boolean
the MathML tag <false> is interpreted as a constant with value 0 and units boolean

I think this means that:

straightforward algebraic stuff will still make sense: a = myBool1*3 + 10 gives a=13 when myBool1 is true and a=10 otherwise
unit comparisons are simple (as it's an overloaded dimensionless unit)
we can store the results of the comparison operators in an intuitive way: myBool = t == 5
we never have to worry about units like boolean/second as it can't be combined
we never have to worry about people calling a = log ( 1 boolean ) as it's dimensionless and we currently allow a=log(3 volts) or b=3^(4 seconds) etc so there's no difference here (they're all nonsense!)
operators for boolean variables like plus and times become defined as we're using a value now ... so plus becomes or, and times becomes and ... again mirroring other language implementations.

I think this gets through most of the points above? The biggest change is interpreting the value of a boolean variable as true or false depending on equality to zero or not. This isn't such a great leap in my mind as it follows the pattern of most programming languages out there already?
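
A small Python sketch of the numeric interpretation proposed above, assuming `<true>` is stored as 1 and `<false>` as 0. The names `is_true`, `store_comparison` and `a_from_bool` are hypothetical and exist only for this illustration.

```python
TRUE, FALSE = 1.0, 0.0  # proposed values of <true> and <false>


def is_true(value):
    """Proposed rule: zero is false, any non-zero value is true."""
    return value != 0


def store_comparison(t):
    """myBool = (t == 5), stored as a number with 'boolean' units."""
    return TRUE if t == 5 else FALSE


def a_from_bool(my_bool1):
    """a = myBool1*3 + 10 from the proposal."""
    return my_bool1 * 3 + 10
```

With these definitions `a_from_bool(TRUE)` is 13 and `a_from_bool(FALSE)` is 10, `is_true(TRUE + FALSE)` behaves like "or", and `is_true(TRUE * FALSE)` behaves like "and". The sketch only exercises the values 0 and 1; how other non-zero values should behave in sums and products is not settled by the proposal text.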

MichaelClerx commented 4 years ago

Boolean is not a dimension/unit. It just isn't. It's not a countable thing. 4 booleans? True booleans? It's complete nonsense, sorry. We might as well decide that "letter" is a unit and "a" really means "a letters".

MichaelClerx commented 4 years ago

> Variables with non-zero value and units of booleans are interpreted as true (again, following every other language)

I think it's C-based languages mostly, and even there it's a legacy thing

https://en.wikipedia.org/wiki/Boolean_data_type

jonc125 commented 4 years ago

Indeed, it's generally only in languages that don't (or didn't) have a proper boolean concept and so hacked it in as a number.

MichaelClerx commented 4 years ago

Something like log(x) takes a number type as input, so whatever x is, it has to be a number. Booleans are not numbers so can't go in. Same for letters. The fact that x can't be a negative number, or a number that isn't dimensionless, is a restriction on the range of the numbers that can go in. It's a different type of restriction, that depends on the semantics of what log means and can't be caught in a syntax check. To see that char x; log(x) or bool x; log(x) isn't allowed you don't have to look at the value of x. From double x; log(x) you can't tell whether it's fine or not until run-time. Different things!
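
The two kinds of restriction can be sketched in Python as two separate checks. Both function names are hypothetical, and the declared types are plain strings borrowed from the `char`/`bool`/`double` example above.

```python
import math


def check_log_argument_type(declared_type):
    """Static check: needs only the declared type, never the value."""
    if declared_type != "double":
        raise TypeError(f"log() expects a number, got {declared_type}")


def checked_log(value):
    """Run-time check: the value itself decides whether log is defined."""
    if value <= 0:
        raise ValueError("log() is only defined for positive numbers")
    return math.log(value)
```

`check_log_argument_type("bool")` fails before any value exists, whereas `checked_log(-1.0)` can only fail once the value is known.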

MichaelClerx commented 4 years ago

If we're not doing https://github.com/cellml/cellml-specification/issues/46 then I suppose we can ignore bools as well, so (1 == 1) * 7 and log(true) will stay valid cellml :D
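
For comparison, Python already behaves like the "ignore bools" option: its `bool` is a subclass of `int`, so both expressions evaluate without complaint. This is only an analogy in another language, not a statement about how any CellML tool evaluates them.

```python
import math

product = (1 == 1) * 7        # True is treated as the integer 1
log_of_true = math.log(True)  # accepted, because bool is a subclass of int
```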

kerimoyle commented 4 years ago

Closing this following discussion this morning, true and false remain in.

cellml / cellml-specification

<true> and <false> in MathML is permitted or not? #29