Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate

Coercing a string with a Roman numeral throws an error #228

Open landyacht opened 4 years ago

landyacht commented 4 years ago

I would expect 'Ⅼ'.Int to return 50, but instead the following error is thrown:

Cannot convert string to number: base-10 number must begin with valid digits or '.' in '⏏Ⅼ' (indicated by ⏏)

This seems to be the case with all other Roman numerals as well. Since Unicode Roman numerals are valid as integer literals, I would expect coercing a string that contains only a Roman numeral to Int to behave the same way. Furthermore, other non-ASCII digits such as ٨ do convert, so this seems inconsistent.

Altai-man commented 4 years ago

What about MMXVI.Int, should Int spend time calculating it? I am not sure about Arabic numerals, but I assume they don't need additional calculations and can be e.g. table-based (fast)? If really necessary, there is a slang (https://github.com/drforr/perl6-slang-roman) which does what you want at compile time.
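To make the "calculation" concern concrete, here is a minimal sketch of what converting a multi-letter Roman numeral involves. The sub `roman-to-int` is a hypothetical helper written for illustration only; it is not part of Raku core and is not how the slang linked above works. It assumes well-formed input in ASCII letters and does no validation beyond rejecting unknown letters.

```raku
# Hypothetical helper, for illustration only (not a core routine).
sub roman-to-int(Str $s --> Int) {
    my %value = I => 1, V => 5, X => 10, L => 50, C => 100, D => 500, M => 1000;

    # Table lookup per letter; unknown letters abort the conversion.
    my @n = $s.uc.comb.map(-> $c {
        %value{$c} // die "not a Roman numeral letter: $c"
    });

    my $total = 0;
    for @n.kv -> $i, $v {
        # Subtractive notation: a smaller value before a larger one is subtracted.
        $total += ($i < @n.end && $v < @n[$i + 1]) ?? -$v !! $v;
    }
    $total
}

say roman-to-int('MMXVI');   # 2016
```

Unlike a single-digit table lookup, this needs a pass over the whole string plus a look-ahead, which is the extra work being weighed here.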

landyacht commented 4 years ago

@Altai-man full-on parsing of Roman numerals might be too much of a rabbit hole, I agree. However, I would at least expect single-grapheme strings to convert properly. Maybe this belongs on a different peg than Str -> Int coercion, though.
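For the single-codepoint case, the built-in `unival` method already exposes the Unicode numeric value, so a workaround can be sketched without touching Str -> Int coercion. The sub name `single-numeral` below is hypothetical, and returning Nil for anything that is not exactly one numeric codepoint is an assumption made for this sketch.

```raku
# Hypothetical helper: coerce a one-codepoint string via its Unicode numeric value.
sub single-numeral(Str $s) {
    # Only handle strings made of exactly one codepoint.
    return Nil unless $s.codes == 1;

    # unival yields NaN for codepoints that have no numeric value.
    my $v = $s.unival;
    $v === NaN ?? Nil !! $v
}

say single-numeral('Ⅼ');   # 50
say single-numeral('٨');   # 8
```

Note that `unival` can also return a Rat for codepoints such as ½, so a real Int coercion would still have to decide what to do with non-integer values.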

samcv commented 4 years ago

I believe this is caused by the Rakudo parser. The regex it uses to parse numbers does not accept Roman numerals, so it never gets to the point of looking up the Unicode numeric values associated with the codepoints. Whether we want to fix it is another question. This occurs for every codepoint in the Unicode general category 'Nl', which includes the Roman numerals. It also occurs for 'No', aka Number, Other (things such as ² don't work either). Only general category 'Nd' seems to work with .Int (maybe this is a good thing?).

You can check this with

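for 0..0x10FFFF -> $i {
    # Keep only codepoints whose general category starts with 'N' (Nd, Nl, No)
    if $i.uniprop.starts-with('N') {
        try {
            $i.chr.Int;
            # Report each codepoint whose coercion to Int fails
            CATCH { say "caught $i GenCat: $i.uniprop()"; .resume }
        }
    }
}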
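As a quicker spot check than scanning every codepoint, the sketch below looks at one representative character from each of the three categories mentioned above. It only uses the built-in `uniprop`, `unival` and `Int` methods; the choice of sample characters and the 'fails' placeholder are assumptions for illustration. Per the behavior described in this thread, only the 'Nd' sample is expected to coerce.

```raku
# One sample each from general categories Nl, No and Nd.
for 'Ⅼ', '²', '٨' -> $c {
    # A failed coercion is replaced by a placeholder string.
    my $result = (try $c.Int) // 'fails';
    say "U+{$c.ord.base(16)}  category={$c.uniprop}  unival={$c.unival}  Int=$result";
}
```

All three samples have a numeric value available through `unival`; the difference is only in whether `.Int` accepts them.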