qudt / qudt-public-repo

QUDT - Quantities, Units, Dimensions and dataTypes - public repository

Change request to avoid ambiguous use of hyphen introduced in 2.1.30 #815

Open fkleedorfer opened 7 months ago

fkleedorfer commented 7 months ago

I am sorry to be extremely late to this party, however:

In Release 2.1.30, a new naming rule was introduced that creates ambiguity and makes it harder for downstream projects to glean the massive amount of useful information encoded in unit names.

Before the release, a hyphen in the unit localname invariably separated units, or, in the case of -PER-, numerator and denominator - i.e. it was the highest-level separator in the name.

Now, the hyphen can mean either that or, if what immediately precedes it is a qudt:Prefix localname, that the prefix applies to the unit following it, as in the case of Kilo-FT3 (which means a thousand cubic feet, vs. KiloFT3, which is cubic kilofeet).

This change requires any parsing of unit names to take this special case into account, which is a rather high cost. If we could use a different separator instead, parsing would be much easier. It is also the introduction of an unprecedented kind of ambiguity in the naming system - before it, each naming system requirement (exponents, qualifiers, etc.) had a 1:1 mapping to a lexical form (number at the end, underscore, etc.). The hyphen having two meanings is a new thing, and I think it was a mistake.
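
To illustrate the special case a parser now has to carry, here is a minimal Python sketch. It is not QUDT or QUDTLib code: the `PREFIXES` set is a hypothetical, incomplete stand-in for the qudt:Prefix localnames, the function names are made up, and `PER` is simply passed through as an ordinary token.

```python
# Hypothetical, incomplete stand-in for the set of qudt:Prefix localnames.
PREFIXES = {"Kilo", "Mega", "Milli", "Micro"}


def parse_before_2_1_30(localname):
    """Old rule: every hyphen is a top-level separator."""
    return localname.split("-")


def parse_since_2_1_30(localname, prefixes=PREFIXES):
    """New rule: a hyphen after a bare prefix binds the prefix to the next token.

    Returns a list of (prefix_or_None, token) pairs.
    """
    factors = []
    pending_prefix = None
    for token in localname.split("-"):
        if pending_prefix is None and token in prefixes:
            # A bare prefix: remember it and apply it to the next token.
            pending_prefix = token
            continue
        factors.append((pending_prefix, token))
        pending_prefix = None
    if pending_prefix is not None:
        # A trailing bare prefix with nothing to bind to; keep it as a token.
        factors.append((None, pending_prefix))
    return factors


print(parse_before_2_1_30("Kilo-FT3"))  # two top-level tokens
print(parse_since_2_1_30("Kilo-FT3"))   # one factor: Kilo applied to FT3
print(parse_since_2_1_30("KiloFT3"))    # one opaque token, no hyphen involved
```

The point of the sketch is that the second function needs the prefix list just to find out what a hyphen means, whereas the first one needs no vocabulary at all.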

For example, how about a dot, i.e. Kilo.FT3, or a tilde, i.e. Kilo~FT3? Both are safe characters. Maybe not beautiful, but then again it's for a very niche requirement.

steveraysteveray commented 1 month ago

Options include:

Do Qnames allow { or (? (It seems not)

Agreement on Kilo~FT3

steveraysteveray commented 1 month ago

We may have been too hasty with the ~, which is rejected by some tools.

steveraysteveray commented 1 month ago

Other options that seem to pass with Visual Studio and TopBraid Composer:

-- (i.e. two hyphens)

ralphtq commented 1 month ago

Just as we have -PER- we could also have a “break” designation: -OF-?

So this would be Kilo-OF-FT3

steveraysteveray commented 1 month ago

That seems workable to me. @fkleedorfer, thoughts?

fkleedorfer commented 1 month ago

As I say elsewhere in more detail, I think '-OF-' has the same problem as '-': it reuses the high-level delimiter for units.

But: '_' would work nicely, I think, e.g. Kilo_FT3. What do you think? We use the underscore for differentiating similar units, so it seems this would not even be a new use of the delimiter.

steveraysteveray commented 1 month ago

Well, it does at least get accepted by the tools. Strictly speaking, the _ delimiter applies a qualifier to the item before the _, with the qualification appearing after the _. To be consistent it would be FT3_Kilo, but I am definitely not suggesting that!

I could live with Kilo_FT3.

jhodgesatmb commented 3 weeks ago

As Steve mentioned, QUDT is using '_' for qualifiers and I see that as a clash with the '_' you suggested for Kilo_FT3, which is not a qualifier but has an entirely different interpretation. It is unfortunately the case that we have a limited number of symbols at our disposal. Maybe we should not be trying to use a symbol at all, but rather flesh out the name. If the normal interpretation of KiloFT3 is (KiloFT)**3, and we want to represent Kilo(FT3), then maybe we should consider KiloCubicFT instead? Just brainstorming here.
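
For a sense of how far apart the two readings are, here is a small worked calculation in Python. The variable names are made up for illustration; the only outside fact used is that the international foot is defined as exactly 0.3048 m.

```python
import math

KILO = 1_000
FT_IN_M = 0.3048  # international foot in metres, exact by definition

# Kilo(FT3): a thousand cubic feet, expressed in cubic metres.
kilo_of_cubic_ft_m3 = KILO * FT_IN_M**3

# (KiloFT)**3: the cube of a thousand feet, expressed in cubic metres.
cubic_kilo_ft_m3 = (KILO * FT_IN_M) ** 3

# The two readings differ by KILO**3 / KILO = KILO**2.
ratio = cubic_kilo_ft_m3 / kilo_of_cubic_ft_m3

print(kilo_of_cubic_ft_m3, cubic_kilo_ft_m3, ratio)
assert math.isclose(ratio, KILO**2)
```

So picking the wrong reading is an error of six orders of magnitude, not a rounding issue.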


fkleedorfer commented 3 weeks ago

Very good point @jhodgesatmb! This way we can keep all the prefix magic in place, and also conceptually, I think you nailed it: cubic feet in this case is more like an entirely new unit, defined to be equal to FT^3, similar to liter. On the QUDTLib side of things, this would be one of the special cases handled in the (not so accurately named) si-base-units.ttl

Let's make it unit:KiloCubicFT

steveraysteveray commented 2 weeks ago

I'm a little uncomfortable with this solution because it opens the door to any ^2 or ^3 unit to have a URI of "SquareXYZ" or "CubicXYZ", etc. in addition to the current XYZ2 or XYZ3 for the URI. I think less violence is done to our naming grammar if we use the underscore. We can discuss on Monday...

fkleedorfer commented 2 weeks ago

I see the point.

We could state that this kind of naming is only allowed if the unit in question is to be used as a "base unit" (for lack of a better term), that can be scaled using prefixes and raised to powers if needed - in any other case, you have to use the existing convention.

steveraysteveray commented 2 weeks ago

It's not clear to me how you would decide "if the unit in question is to be used as a 'base unit' (for lack of a better term), that can be scaled using prefixes".

How hard would it be for your Java code to handle the underscore?

jhodgesatmb commented 2 weeks ago

The suggestion was made because of two things: (1) we were trying too hard to fit a round peg into a square hole in terms of using symbols unsuccessfully to build this URI, and (2) everyone insisted that this was a one-off case. It would not lend itself to other unit models unless they, too, had multiple interpretations. If that were to happen then the second point wasn't really true to begin with. There really doesn't appear to be a clean solution to (1) anymore.

ralphtq commented 2 weeks ago

The issue raised by Steve can be obviated by a rule that disallows the use of prefixes such as Cubic and Square in the leading position.
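
One possible reading of such a rule, sketched in Python. This is an assumption about what "leading position" means (the very start of the localname), and both `POWER_WORDS` and the function name are hypothetical, not part of QUDT.

```python
# Hypothetical list of power words the rule would cover.
POWER_WORDS = ("Square", "Cubic")


def has_leading_power_word(localname: str) -> bool:
    """True if the localname starts with a power word, which the rule would forbid."""
    return localname.startswith(POWER_WORDS)


for name in ("CubicFT", "KiloCubicFT", "FT3"):
    print(name, "rejected" if has_leading_power_word(name) else "allowed")
```

Under this reading CubicFT would be rejected in favour of the existing FT3, while KiloCubicFT would be allowed because the power word follows a prefix.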

steveraysteveray commented 2 weeks ago

But that is the suggestion on the table - to allow KiloCubicFT.

steveraysteveray commented 2 weeks ago

Ah, I see. "leading position" being the key phrase.

Still, I'm uncomfortable with two alternative representations of cubic anything, depending on whether there is a prefix before the "cubic". And think about KiloCubicKiloGM. Ouch.

steveraysteveray commented 2 weeks ago

Leading contenders:

a. KiloCubicFT
b. Kilo_FT3

Ralph: a-3, b-2
Jack: a-5, b-0
Steve: a-0, b-5
Florian: a-5, b-0

a. wins.