MikeMcl / bignumber.js

A JavaScript library for arbitrary-precision decimal and non-decimal arithmetic
http://mikemcl.github.io/bignumber.js
MIT License
6.68k stars 742 forks source link

Mixed-case numbers not parsed #309

Closed jbrower2 closed 2 years ago

jbrower2 commented 2 years ago

If a string input has mixed-case letters, the value will fail to parse. If this is expected, it should be clarified in the documentation. I dug in to the code and it appears that it's doing a pseudo-case-insensitive match, but will only work if the input is entirely uppercase or lowercase.

Input:

console.log(new BigNumber("0xab"));
console.log(new BigNumber("0xAB"));
console.log(new BigNumber("0xaB"));

Output:

171
171
NaN

Fixing this may prove difficult, considering case-sensitive alphabets exist and should remain supported.

It may be overkill, but I would recommend a solution where alphabets are stored as an object rather than a string, that way we can do analysis of the alphabet when it is constructed and determine metadata like case-sensitivity, etc.

(Side note; not really part of this issue) This above solution may help solve #250. I'm imagining there would exist some DEFAULT_ALPHABET constant. The concern with the solution mentioned in #250 is that the check ALPHABET.slice(0, 10) === '0123456789' would be somewhat slow to add in to a hot path. But if we are able to do something like ALPHABET === DEFAULT_ALPHABET, a reference equality check should be very fast and mitigate this concern.

MikeMcl commented 2 years ago

The behaviour is intended, and personally I do not think it is necessary to make that explicit in the docs.

Surely, ALPHABET === DEFAULT_ALPHABET could only be true if DEFAULT_ALPHABET was exposed to the user (and therefore vulnerable to unwanted change).

I am reluctant to do any more work on the user-defined alphabet functionality anyway. It's a bit niche and left-field, and I don't want it to add much code weight to the library as I don't think the vast majority of users ever use it. I think just adding a note to the docs will suffice for #250.

Of course, I would be happy to give feedback on and potentially accept other ideas/PR's on the matter - and don't get me wrong, I think you are on the right lines in regard to storing alphabets as objects rather than strings.

jbrower2 commented 2 years ago

I think it's hard to say this behavior is intended when it's undocumented and is not in line with how most languages parse hexadecimal literals.

Surely, ALPHABET === DEFAULT_ALPHABET could only be true if DEFAULT_ALPHABET was exposed to the user (and therefore vulnerable to unwanted change).

I think I may have misconstrued what I was intending with this. I was thinking DEFAULT_ALPHABET would be internal to the library, and that ALPHABET === DEFAULT_ALPHABET check would be where the 'base 10' special case already exists (link).

I believe a solution would be fairly trivial to implement. I will give it a try and report back.

MikeMcl commented 2 years ago

I think it's hard to say this behavior is intended when it's undocumented and is not in line with how most languages parse hexadecimal literals.

Other languages do not allow user-defined 'alphabets' that include lower and upper-case versions of letters. In such alphabets, the numerical value of a letter necessarily differs depending on its case, so it seems inappropiate to ever allow mixed-case letters in the same string to have the same numerical value. The use of mixed case values for single-case-alphabet bases is more likely to be a mistake on the users part.

MikeMcl commented 2 years ago

See #310.

Documented the current behaviour in v9.0.2.