What can be referenced?

lars-hellstrom commented 7 years ago

The standard says

OpenMath integers, floating point numbers, character strings, bytearrays, applications, binding, attributions can also be encoded as an empty OMR element with an href attribute

Missing from this list are symbols, variables, and errors, but all of those can have id attributes. What's the point of that, if one cannot reference them? Were they simply forgotten in this enumeration, or were they deemed irrelevant?

One may argue that at least symbols and variables are better off not referenced, but at least in the binary encoding it can save quite a lot of bytes to only state a symbol name once and then reference it, so it makes sense to allow the same in OpenMath-XML.

Another curious discovery discovery from staring at this part of the schema: You can't set cdbase on an OME element, whereas you can on the other composite object elements OMA, OMBIND, and OMATTR.

kohlhase commented 7 years ago

I completely agree with Lars, this should be fixed. Assigning myself and Lars to this task.

davidcarlisle commented 7 years ago

for cdbase on OME I think that's just a bug in the schema which should be using compound.attributes not common.attributes, and wouldn't be a change in the intended grammar in the specification

davidcarlisle commented 7 years ago

I can't see any reason not to allow references to symbols and errors

variables might need a bit more care to specify what happens if you reference a bound variable

kohlhase commented 7 years ago

for cdbase on OME I think that's just a bug in the schema which should be using compound.attributes not common.attributes

I agree. Why don't you go ahead and fix this on master; I do not think we need a PR for that.

kohlhase commented 7 years ago

variables might need a bit more care to specify what happens if you reference a bound variable

indeed. You should not be able to reference a bound variable outside of its scope (given alpha-conversion). Actually, I have always seen bound variables as "references" to the "bvar" declaration in the binding object. We should maybe clarify this.

lars-hellstrom commented 7 years ago

@davidcarlisle wrote:

variables might need a bit more care to specify what happens if you reference a bound variable

The cat's always been out of the bag on that matter. You can already reference a compound object containing the variable.

<OMA>
  <OMBIND> <OMS cd="fns1 name="lambda"/>
    <OMBVAR> <OMV name="x"/> </OMBVAR>
    <OMA id="foo"> <OMS cd="arith1" name="plus"/> <OMV name="x"/> <OMI>1</OMI> </OMA>
  </OMBIND>
  <OMR href="#foo"/>
</OMA>

is a convoluted way of expressing x+2, where x is free.

In view of @kohlhase's comment, this may however need to be spelt out.

kohlhase commented 7 years ago

you are right of course. I guess I have to recant my statement from above. I guess we have two ways of interpreting OMR:

a semantic operation (i.e. we reference objects that are meaningful in the current context)
a syntactic operation (i.e. this just syntactically copies material in the XML source over to save space).

I had been thinking of 1. when I made my comment (and I guess David as well), and you proved to me that we can only make sense of 1. if we make an additional constraint that we do not "free variables". That can be specified I guess, but might be complicated and not so useful.

If we think of 2. We do not have a problem, your example is just a more clever way of writing

<OMA>
  <OMBIND> <OMS cd="fns1 name="lambda"/>
    <OMBVAR> <OMV name="x"/> </OMBVAR>
    <OMA id="foo"> <OMS cd="arith1" name="plus"/> <OMV name="x"/> <OMI>1</OMI> </OMA>
  </OMBIND>
  <OMA> <OMS cd="arith1" name="plus"/> <OMV name="x"/> <OMI>1</OMI> </OMA>
</OMA>

which is perfectly reasonable OpenMath, which just happens to have two variables named x, one free and one bound.

Moreover with 2. we will not have any semantic problems with referencing variables (the operation is syntactic after all and well-defined).

But interpretation 2. may be unexpected to readers at first. So we may want to explain that in the standard or in the FAQ.

Needless to say, we still need the acyclicity constraint in interpretation 2. otherwise the OM trees become infinite.

lars-hellstrom commented 7 years ago

If you're looking for a guiding principle here, I think this will serve: alpha-conversion is a property of the abstract object model (as described in Chapter 2). References only appear in specific encodings, so those do not exist at the abstract level; anyone wishing to perform alpha-conversion should first substitute all references, after which there is no longer a problem.

More efficiently, one could also imagine an intermediate model where each object is a DAG such that referenced subobjects simply are shared. Modifying a subobject, as part of alpha-conversion, requires first making an unshared (shallow) copy of it. But those would be implementation details, not material for the standard.

Finally, I notice that Section 3.1.3.2 "Sharing and Bound Variables" of the standard already addresses the issue of referencing a bound variable.

kohlhase commented 7 years ago

Finally, I notice that Section 3.1.3.2 "Sharing and Bound Variables" of the standard already addresses the issue of referencing a bound variable.

hmmm, I wonder who wrote that :-). I had totally forgotten. "reading gives you invaluable advantages" as they say.

kohlhase commented 7 years ago

I made the necessary changes to the omstd2.xml andopenmath2.rnc and pushed.

OpenMath / OMSTD

What can be referenced? #6