w3c / mathml

MathML4 editors draft
https://w3c.github.io/mathml/

Intent for "by" (as in 3×4 matrix) #486

Closed NSoiffer closed 6 months ago

NSoiffer commented 10 months ago

At the 18 Jan meeting, we discussed @davidfarmer's wish list. One of the entries is "by" using the × operator. This is used to describe the dimensions of matrices and other tabular layout. Maybe other things?

We decided this needed some thought and discussion; hence this issue.

dginev commented 10 months ago

To start the discussion let me list out the obvious ideas and their design tension:

To invoke a related precedent: the set membership operation is very often (though not exclusively) spoken as the preposition "in". Still, it has the Unicode name "element of" and currently two Core entries, element-of and member-of. That is awaiting a choice as well, I assume.

That's all I can offer for now, hoping for a great idea...

polx commented 10 months ago

Could this be used to describe a rectangle of "two kilometers by three"?

dginev commented 10 months ago

To Paul's rectangle question, well, row-by-column and side-by-side are specific instances of dimension-by-dimension.

And all of those are specific to 2D objects. In 3D we get, for example

More generally, for a "rank n" tensor we could see an even larger sequence of

$$ d_1 \times d_2 \times d_3 \times d_4 \times \cdots \times d_n $$
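As a rough illustration of the n-ary case, here is a minimal Python sketch that joins any number of dimensions with a single spoken connective. The function name `speak_shape` and the default word "by" are assumptions made for this example; nothing here is defined by MathML or by any speech engine.

```python
def speak_shape(dims, word="by"):
    """Join any number of dimensions with one spoken connective.

    Hypothetical helper: it only shows that the same connective scales
    from two dimensions to a rank-n sequence.
    """
    if not dims:
        raise ValueError("a shape needs at least one dimension")
    return f" {word} ".join(str(d) for d in dims)


print(speak_shape([2, 3]))      # 2 by 3
print(speak_shape([2, 3, 4]))   # 2 by 3 by 4
```

The same joining rule covers the 2D, 3D and rank-n cases, which is one reason an infix reading scales more naturally than a fixed two-argument form.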

davidcarlisle commented 10 months ago

A literal _by is brief but not informative beyond the language word.

I think I would use intent="by", as it doesn't have to be just a literal: it could be considered a concept, like times, usually denoting some kind of dimensional rank. It's fine for it to be in open; I don't think it needs to be in core.

NSoiffer commented 10 months ago

"dimension" is not an awful name for speech ("dimension of 2 and 3" or maybe "dimension of 2 comma 3") but definitely not great. Maybe it is a little better as a prefix concept name ("dimension 2 3"). It does fit with the use of a conceptual name. "by:infix" works for speech for two and higher dimensions, but isn't good as a conceptual name.

It's a good example for our discussion on Feb 1 about naming.

davidcarlisle commented 10 months ago

given that

<mrow><mn>2</mn><mo intent="by">×</mo><mn>3</mn></mrow>

presumably produces the intended outcome, I think we would have to show some major advantages and interoperable behaviour to persuade people to generate more complicated forms such as

<mrow intent="dimension($a,$b)"><mn arg="a">2</mn><mo>×</mo><mn arg="b">3</mn></mrow>

dginev commented 10 months ago

show some major advantages and interoperable behaviour

We should also do that when advocating to promote non-concept English writing from a literal annotation (_by) to a fake concept annotation (by).

You'll certainly find it used as a keyword, e.g. in C# and Kotlin. But currently intent doesn't have a "keyword" category. Is anyone suggesting that we introduce one? If not, _by can only be a literal by design, and use of by is just bad markup.

Once we cast the discussion into advantages between using concepts and literals, it's easier to motivate. There is added value, e.g. matrix-dimension(2,3) can generate both the speech strings "2 by 3" as well as "2 rows by 3 columns". A more abstract dimension wouldn't be able to infer "rows" and "columns", and neither would _by.
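To make the added value concrete, here is a minimal Python sketch of how a speech generator might treat the two annotations. The function names, the `verbose` flag and the pluralization rule are all assumptions for illustration; they are not taken from any existing MathML speech engine or from the Core list.

```python
def speak_matrix_dimension(rows, cols, verbose=False):
    """Speech for a hypothetical matrix-dimension(rows, cols) concept.

    Because the concept says which argument counts rows and which counts
    columns, a generator can choose between a terse and a verbose reading.
    """
    if verbose:
        row_word = "row" if rows == 1 else "rows"
        col_word = "column" if cols == 1 else "columns"
        return f"{rows} {row_word} by {cols} {col_word}"
    return f"{rows} by {cols}"


def speak_literal_by(left, right):
    """Speech for the literal _by: only the word itself is available."""
    return f"{left} by {right}"


print(speak_matrix_dimension(2, 3))                # 2 by 3
print(speak_matrix_dimension(2, 3, verbose=True))  # 2 rows by 3 columns
print(speak_literal_by(2, 3))                      # 2 by 3
```

The literal path has no way to reach the verbose reading, since nothing in _by says that the operands are row and column counts.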

The harder question is whether that kind of capability is desirable to build into Core.

davidcarlisle commented 10 months ago

If not, _by can only be a literal by design, and use of by is just bad markup.

No, I would disagree, and I certainly prefer by to _by here. _by is, as you say, just a literal string to be read as-is, and it implies no external definition (or concept). Conversely, by can perfectly well be in a system's concept dictionary or a public open concept dictionary, so it has an external definition saying that this times symbol is being used for matrix dimension or similar, or whatever other information you want to attach to that concept entry.
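The distinction can be sketched in a few lines of Python. Everything below is hypothetical: the `CONCEPTS` table stands in for a system-specific concept dictionary, and the rule for reading a literal (drop the leading underscore, turn remaining underscores and hyphens into spaces) is a simplification assumed for this example.

```python
# Hypothetical system-specific concept dictionary; the entry is illustrative.
CONCEPTS = {
    "by": {
        "speech": "by",
        "meaning": "dimensional use of the times sign, e.g. matrix shape",
    },
}


def resolve_intent(intent, concepts=CONCEPTS):
    """Classify a simple intent value as a literal or a concept name."""
    if intent.startswith("_"):
        # Literal: read as-is, never looked up, carries no meaning.
        text = intent[1:].replace("_", " ").replace("-", " ")
        return {"kind": "literal", "speech": text, "meaning": None}
    entry = concepts.get(intent)
    if entry is None:
        # Unknown concept: fall back to reading the name itself.
        speech = intent.replace("-", " ")
        return {"kind": "unknown-concept", "speech": speech, "meaning": None}
    return {"kind": "concept", "speech": entry["speech"],
            "meaning": entry["meaning"]}


print(resolve_intent("_by"))
print(resolve_intent("by"))
```

Both values are spoken as "by" here, but only the concept form gives a system somewhere to attach a definition.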

davidcarlisle commented 9 months ago

To expand on the point in the above comment: use of literals can help if you want to force a particular speech, especially in conjunction with a silent head such as _( or matrix-dimension:silent(. Then you can force some literal words, such as

<mrow intent="matrix-dimension:silent(1,_row,_by,3,_columns)">...

but if you are using intent on the operator and want to imply that it is a matrix dimension operation, which you just happen to want to speak as by in English then

<mrow><mn>1</mn><mo intent="by">×</mo><mn>3</mn></mrow>

is I think a perfectly good markup.
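For illustration, the effect of a silent head can be sketched with a small Python reader for flat (non-nested) intent expressions. The regular expression, the helper names and the choice to speak a non-silent head as a prefix are assumptions made for this sketch; it covers only the shapes quoted in this thread and is not a conforming intent parser.

```python
import re

# Matches flat forms such as head(args) or head:prop(args); no nesting.
CALL = re.compile(
    r"^(?P<head>[^:()]*)(?::(?P<props>[^()]*))?\((?P<args>[^()]*)\)$"
)


def read_token(token):
    """Read a literal (leading underscore) or a plain name or number."""
    if token.startswith("_"):
        return token[1:].replace("_", " ").replace("-", " ").strip()
    return token.replace("-", " ")


def speak_flat_intent(intent):
    """Speak a flat intent; a silent or empty head contributes no words."""
    match = CALL.match(intent.strip())
    if match is None:
        raise ValueError(f"unsupported intent: {intent!r}")
    head = read_token(match.group("head"))
    props = (match.group("props") or "").split(":")
    words = [read_token(arg.strip()) for arg in match.group("args").split(",")]
    if head and "silent" not in props:
        words.insert(0, head)  # assumed prefix reading for a spoken head
    return " ".join(word for word in words if word)


print(speak_flat_intent("matrix-dimension:silent(1,_row,_by,3,_columns)"))
# 1 row by 3 columns
print(speak_flat_intent("_(1,_by,3)"))
# 1 by 3
```

In this sketch the head matrix-dimension still travels with the markup for a concept dictionary to use, even though the spoken words come entirely from the arguments.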

dginev commented 9 months ago

The two Intent uses of the "by" preposition I can accept on my end are both literals:

<mrow intent="_(1,_by,3)">
  <mn>1</mn><mo>×</mo><mn>3</mn>
</mrow>
<!-- vs -->
<mrow>
  <mn>1</mn><mo intent="_by">×</mo><mn>3</mn>
</mrow>

By the by, it was fun to learn that "6 by 3" can be either 18 or 2, depending on whether the implied operation was "multiplied by" or "divided by". In Bulgarian they never overlap, we have 2 different preposition words - "6 по 3" and "6 на 3". Neither of those are concepts of course.

davidcarlisle commented 9 months ago

@dginev the two you show obviously work but essentially say there is no implied or known meaning and just read the word as-is, which is fine if you are generating this stuff and that's all you know, but must surely be a last resort.

<mn>1</mn><mo intent="by">×</mo><mn>3</mn>

on the contrary allows reference to a system specific concept dictionary that says this construct is being used for matrix size, so it's shorter and more informative.

All three versions are valid, so it doesn't really matter which one either of us prefers. The versions you show have no concept, so the core concept dictionary isn't involved, and I wouldn't push for by being added to the core dictionary, so I don't think there is really any action for the group to take or decision to make.

I don't think "6 by 3" can mean divided by in isolation, but it certainly can given a suitable prefix, such as "divide 6 by 3". It can also of course mean not-quite-division-but-using-the-division-syntax-of-Leibniz, as in d y by d x.

brucemiller commented 9 months ago

@davidcarlisle you're certainly correct that in the current state of things, there's no difference between by and _by. But I think the concern is the pollution of the namespace if we encourage use of such inherently ambiguous concepts as by, particularly if the author is thinking in terms of the word "by" rather than some concept "by". If sometime in the future we wish to promote various concepts from the open dictionary, we'll likely find several incompatible uses of by in different "system specific" concept dictionaries.

Of course, given the nature of an open dictionary list, there's not much way of forbidding such eventual conflicts, but I think we should be circumspect in how we encourage it.

davidcarlisle commented 9 months ago

@brucemiller if we are worried about conflicting system-specific uses of by, we should add it to core so all systems use the same name for that concept. That's always a tradeoff: if you make the core dictionary smaller, you increase the risk of conflicting systems. Or of course take the halfway house of putting it in "our" published open list, so you are not saying it should be implemented but strongly hinting that if it is, you use that name.

I think explicitly marking up that it has no known meaning and using a literal should be the last resort here, but is always available and is better than "guessing" a meaning when you are remediating a document and you don't have any other knowledge other than the provided presentation.

Basically, if you know what the symbol means and you are adding intent, the end result should never consist purely of literals, as doing that is stopping a system from guessing the meaning (as you added intent) while saying you don't know what the meaning is (as you just provided a literal string and no concepts).

I would rather have by as a concept with a woolly concept description such as "various related dimensional uses such as the shape of a matrix or description of a rectangle" than have no concept at all and have _by which means, I don't know what this is, but say "by".

davidcarlisle commented 6 months ago

dimensional-product will be added to core list

NSoiffer commented 6 months ago

Added to the core list