Numbers

willismonroe commented 8 months ago

We need to parse numbers written in sexagesimal notation.

I think we should avoid parsing to decimal and back, as that will run into major problems with fractions (one third, for example, is a single sexagesimal place, 20, but a non-terminating decimal). So all of our operators need to work directly on sexagesimal values. That should be easy enough, but here's a repository for reference: https://github.com/HrushikeshPawar/Sexagesimal-Calculator.
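As a rough illustration of operating directly on sexagesimal digits, the sketch below adds two numbers stored as lists of base-60 digits (most significant first), carrying at 60 rather than converting through decimal. The list representation and the name add_sexagesimal are assumptions made for this example; they are not taken from the linked repository.

```python
BASE = 60

def add_sexagesimal(a, b):
    """Add two numbers given as lists of base-60 digits, most significant first."""
    ra, rb = a[::-1], b[::-1]  # work from the least significant place upward
    result = []
    carry = 0
    for i in range(max(len(ra), len(rb))):
        da = ra[i] if i < len(ra) else 0
        db = rb[i] if i < len(rb) else 0
        # divmod yields the carry into the next place and the digit kept here
        carry, digit = divmod(da + db + carry, BASE)
        result.append(digit)
    if carry:
        result.append(carry)
    return result[::-1]

print(add_sexagesimal([1, 10], [50]))  # 1,10 + 50 -> [2, 0]
```

Fractional places could be handled the same way if both operands are first padded to the same number of places after the sexagesimal point.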

A short test suite would be useful:

𒐕 = 1
𒐕 = 60 (NB: how do we deal with unknown place values, or do we default to "1"?)
𒐕𒐕 = 2 (or 𒐖 = 2, *one glyph)
𒐕𒑱𒐕 = 61
𒌋𒐕 = 11
𒐕𒌋 = 70
𒐕𒌋𒐕 = 71
𒐕𒐕𒌋 = 130

MrLogarithm commented 8 months ago

730d32dba59b486fb071e62ca45c4234bdae0b32 implements a parser that can handle numerals. It also adds a test/ directory with all of the above examples as test cases (run with make test).

Notes:

Does not currently handle 𒐖 (TWO GESH2) or other compound digit characters, only sequences of 𒐕 or 𒌋
Assumes 𒐕 is always 1. To input 60, use 𒐕𒑱
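A minimal sketch of the behaviour described in these notes, not the code from the commit. It assumes 𒐕 is always 1 and 𒌋 is always 10, and that a 𒌋 written after a 𒐕 starts a new sexagesimal place, which is what the cases 𒐕𒌋 = 70 and 𒐕𒐕𒌋 = 130 imply. The names parse_digits and to_int are hypothetical, and the sketch does not check that a digit stays below 60.

```python
ONE = "𒐕"  # assumed to be worth 1
TEN = "𒌋"  # assumed to be worth 10

def parse_digits(text):
    """Split a run of ONE/TEN signs into base-60 digits, most significant first."""
    digits = []
    current = 0
    seen_one = False  # True once the digit being built contains a ONE sign
    for sign in text:
        if sign == TEN:
            if seen_one:
                # tens written after ones cannot belong to the same digit
                digits.append(current)
                current, seen_one = 0, False
            current += 10
        elif sign == ONE:
            current += 1
            seen_one = True
        else:
            raise ValueError(f"unsupported sign: {sign!r}")
    digits.append(current)
    return digits

def to_int(digits):
    """Evaluate base-60 digits as an integer, for checking against the test cases."""
    value = 0
    for d in digits:
        value = value * 60 + d
    return value

print(parse_digits("𒐕𒐕𒌋"), to_int(parse_digits("𒐕𒐕𒌋")))  # [2, 10] 130
```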

Remaining issue:

The 𒑱 separator is probably not handled correctly: both 𒐕𒑱𒌋𒐕 and 𒐕𒌋𒐕 are recognized as 71, whereas the former should probably be 3611, i.e. 1,0,11 (?)

MrLogarithm commented 8 months ago

Should 𒌋𒑱𒌋𒐕 be read as 10,0,11 or 10,11 or both?

MrLogarithm commented 8 months ago

Assuming 𒌋𒑱𒌋𒐕 is 10,0,11, this should be(?) resolved by 58ffcb1.
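For reference, here is one way the 10,0,11 reading could be expressed, as a sketch rather than the code in 58ffcb1. It assumes every 𒑱 contributes an explicit zero place and that a digit is up to five 𒌋 followed by up to nine 𒐕; the pattern and the name parse_places are made up for this example.

```python
import re

# One sexagesimal digit (tens then ones), or a standalone zero marker.
DIGIT_OR_ZERO = re.compile(r"(?P<digit>𒌋{1,5}𒐕{0,9}|𒐕{1,9})|(?P<zero>𒑱)")

def parse_places(text):
    """Return the base-60 places of text, treating each 𒑱 as an empty place."""
    places = []
    pos = 0
    while pos < len(text):
        match = DIGIT_OR_ZERO.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected sign at offset {pos}")
        if match.group("zero"):
            places.append(0)
        else:
            group = match.group("digit")
            places.append(10 * group.count("𒌋") + group.count("𒐕"))
        pos = match.end()
    return places

print(parse_places("𒌋𒑱𒌋𒐕"))  # [10, 0, 11]
```

Under the same assumption 𒐕𒑱𒌋𒐕 comes out as 1,0,11 and 𒐕𒑱 as 1,0.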

MrLogarithm commented 8 months ago

Should we support negative numbers, given that there was no conception of these historically? I feel a language without negatives is more in line with the spirit of SIGBOVIK.

MrLogarithm commented 8 months ago

I'm realizing there is no way to directly encode some numbers with the grammar as it exists now: e.g. 50,1 (which I think should be 𒐐𒑱𒐕 where the 𒑱 is a delimiter, and not a true zero??) can't be directly input since 𒐐𒑱𒐕 gets parsed as 50,0,1. Do we need to find a way to overload 𒑱, or is there another character that can be used as a delimiter? Or do we say this is fine, and let the user construct 50,1 via addition 𒐐𒑱 𒐕 + if they need such a number?

(I guess this is where whitespace would have been used as a delimiter originally? but that also introduces ambiguity between consecutive numbers separated by whitespace vs one number with whitespace between digit groups...)
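To make the stakes concrete, this small sketch evaluates the readings in question as integers. The helper place_value is hypothetical and only used for comparison here; it is not a suggestion that the operators should go through decimal.

```python
def place_value(places, base=60):
    """Evaluate a list of base-60 places, most significant first, as an integer."""
    value = 0
    for p in places:
        value = value * base + p
    return value

print(place_value([50, 0, 1]))  # 𒑱 read as a true zero: 180001
print(place_value([50, 1]))     # 𒑱 read as a delimiter: 3001
print(place_value([51]))        # no delimiter at all: 51

# The addition workaround gives the delimiter reading: 50,0 plus 1 is 50,1.
print(place_value([50, 0]) + place_value([1]) == place_value([50, 1]))  # True
```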

willismonroe commented 8 months ago

I'm looking for good calculations from texts that we can borrow, I think they might make a good test set.

Anyway, as for ambiguous numbers, yes, it's certainly going to be an issue. This text seems to imply (at least from the copy) that a space is used to mark a unit difference: 20 a-na 1.07.30 i-ši 22.30
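The quoted line also works as a test case if it is read as "multiply 20 by 1.07.30, result 22.30" (that reading of the Akkadian is an assumption here). The sketch below multiplies a list of base-60 digits by a small factor, carrying at 60, and then drops trailing empty places on the further assumption that they are insignificant in this notation. The names scale and strip_trailing_zeros are hypothetical.

```python
def scale(digits, factor, base=60):
    """Multiply base-60 digits (most significant first) by an integer factor."""
    out = []
    carry = 0
    for d in reversed(digits):
        carry, digit = divmod(d * factor + carry, base)
        out.append(digit)
    while carry:  # spill any remaining carry into new leading places
        carry, digit = divmod(carry, base)
        out.append(digit)
    return out[::-1]

def strip_trailing_zeros(digits):
    """Drop empty places at the end, keeping at least one digit."""
    end = len(digits)
    while end > 1 and digits[end - 1] == 0:
        end -= 1
    return digits[:end]

product = scale([1, 7, 30], 20)
print(product)                        # [22, 30, 0]
print(strip_trailing_zeros(product))  # [22, 30]
```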

So if we want to include spaces as a delimiter and 𒑱 as an empty unit, what if a double space marks a boundary between tokens?
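A sketch of how that proposal might tokenize a line, assuming two or more spaces separate tokens and a single space separates the places within one number. The function name tokenize is hypothetical and this is not the project's grammar.

```python
import re

def tokenize(line):
    """Split a line into tokens; each token is a list of space-separated groups."""
    tokens = re.split(r" {2,}", line.strip())  # two or more spaces end a token
    return [token.split(" ") for token in tokens if token]

# 50,1 followed by 10 and an operator
print(tokenize("𒐐 𒐕  𒌋  +"))  # [['𒐐', '𒐕'], ['𒌋'], ['+']]
```

With this split, 𒐐 𒐕 stays one number with two places, so 50,1 can be written without overloading 𒑱.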

MrLogarithm commented 8 months ago

Sounds good - the parser currently assumes no whitespace between tokens so we will need to add an explicit " " or WS token to the grammar rules where relevant.

MrLogarithm / emeszida

Numbers #1