ghost commented 4 years ago

Hexadecimal floats are a notation for floating-point values that C has supported since C99. The mantissa is written in hexadecimal and the exponent as a power of two. This is useful because it shows the exact stored number with no rounding or decimal approximation.

E.g. 0.486224 is approximately 0x1.f1e4bp-2.
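For reference, a minimal C sketch of the round trip this notation allows, assuming a C99 library where `printf` supports `%a` and `strtod` accepts hex floats. The helper names `parse_hex_float` and `hex_round_trip_is_exact` are made up for illustration and are not part of calc.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a C99 hexadecimal floating-point string such as "0x1.8p+1".
   strtod() has accepted this syntax since C99. */
double parse_hex_float(const char *text)
{
    return strtod(text, NULL);
}

/* Format value with "%a", parse the text back, and report whether
   the round trip reproduced the value exactly (1) or not (0). */
int hex_round_trip_is_exact(double value)
{
    char buffer[64];
    snprintf(buffer, sizeof buffer, "%a", value);
    return parse_hex_float(buffer) == value;
}
```

The exact text produced by `%a` can differ between C libraries (for example in how the leading digit is normalized), so the sketch checks the round trip rather than a specific string.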

lcn2 commented 3 years ago

I like this idea. Anyone want to take a crack at modifying the parser to permit such hexadecimal floats?

kcrossen commented 2 years ago

`
    // char *hex_float_c_str;
    // PCRE validator of input:
    // "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
    // Test examples
    // char *example_hex_float = "+0x1.921fb54442d18p+0001";
    // char *example_hex_float = "+0x0.0000000000000p+0000";
    // char *example_hex_float = "+0x0.0000000000001p+0000";
    // char *example_hex_float = "+0x1.0p+0000";
    // char *example_hex_float = "+0x1.0";
    // char *example_hex_float = "+0x1";

    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    long long numerator = 0;
    int mantissa_power_of_2 = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    // Keeping it simpler later in code
    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if (strncmp(&hex_float_c_str[ch_idx], "0x", 2) == 0) {
        ch_idx += 2;
        while (ch_idx < (int) strlen(hex_float_c_str)) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < (int) strlen(hex_float_c_str)) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < (int) strlen(hex_float_c_str)) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                mantissa_power_of_2 -= 4;
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                mantissa_power_of_2 -= 4;
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < (int) strlen(hex_float_c_str)) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < (int) strlen(hex_float_c_str)) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);

        long long power_multiplier = 1;
        power_multiplier = power_multiplier << abs(power_of_2);
        if (power_of_2 > 0) {
            numerator = numerator * power_multiplier;
        }
        else if (power_of_2 < 0) {
            denominator = power_multiplier;
        }
    }

    numerator = sign * numerator;

`
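The two digit branches in the loops above differ only in how they turn a character into a value. Here is a small sketch of how that could be factored out. `hex_digit_value` and `read_hex_digits` are hypothetical helper names, not anything in calc, and the code assumes an ASCII-compatible character set, as the snippet above does.

```c
/* Return the value 0..15 of a hexadecimal digit, or -1 if c is not one.
   Assumes 'a'..'f' and 'A'..'F' are contiguous, as in ASCII. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return 10 + (c - 'a');
    if (c >= 'A' && c <= 'F') return 10 + (c - 'A');
    return -1;
}

/* Accumulate the run of hex digits starting at text[*index] into *value.
   Advances *index past the run and returns how many digits were read.
   Stops at the first non-digit, including the terminating NUL.
   No overflow protection: the caller must limit the input length. */
int read_hex_digits(const char *text, int *index, unsigned long long *value)
{
    int count = 0;
    int digit;
    while ((digit = hex_digit_value(text[*index])) >= 0) {
        *value = (*value << 4) + (unsigned long long) digit;
        (*index)++;
        count++;
    }
    return count;
}
```

Because the helper accepts both cases, the in-place `tolower` pass over the input would no longer be needed in a version built on it.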

kcrossen commented 2 years ago

Testing usefulness in Calc (my version):

    QString test_commands = "";
    test_commands += QString("test_value=") + QString::number(numerator) + "/" + QString::number(denominator) + ";";
    RPN_Commands_Execute(test_commands);

    QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));

    qDebug() << test_result;

kcrossen commented 2 years ago

The above code will "overflow" or "underflow" because of the limitations of long long, so:

`
    // char *hex_float_c_str;
    // PCRE validator of input:
    // "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
    // Test examples
    // char *hex_float_c_str = "+0x1.921fb54442d18p+0001";
    // char *hex_float_c_str = "+0x0.0000000000000p+0000";
    // char *hex_float_c_str = "+0x0.0000000000001p+0000";
    // char *hex_float_c_str = "+0x1.0p+0000";
    // char *hex_float_c_str = "+0x1.0";
    // char *hex_float_c_str = "+0x1";
    // Max:
    // char *hex_float_c_str = "+0x1.fffffffffffffp+1023";
    // Min:
    // char *hex_float_c_str = "+0x1.0000000000000p-1074";

    int hex_float_c_str_length = strlen(hex_float_c_str);
    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    long long numerator = 0;
    int mantissa_power_of_2 = 0;
    int mantissa_digit_count = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if ((ch_idx < (hex_float_c_str_length - 1)) and
        (hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
    }

    QString test_commands = "test_value=(";
    if (sign < 0) test_commands += "-1*";
    test_commands += QString::number(numerator);
    if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
    test_commands += ")/(";
    test_commands += QString::number(denominator);
    if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
    test_commands += ");";
    RPN_Commands_Execute(test_commands);

    QString test_result = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));

    qDebug() << test_result;

`

The individual components, mantissa and power, will stay within the range of long long if the input follows the standard. Excess mantissa digits are ignored if they come after the radix point, or are effectively replaced with zeros if they come before it.
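The digit-capping rule described above can be sketched in isolation. This is only an illustration under stated assumptions: `read_capped_mantissa` is a hypothetical helper, it expects lowercase digits with no `0x` prefix and no exponent, and like the code above it counts leading zeros toward the cap.

```c
#include <stdint.h>

/* Read hex digits of the form "III.FFF", keeping at most max_digits digits.
   Returns the integer mantissa; *exp2 receives the power of two by which
   the mantissa must be scaled to approximate the written value. */
uint64_t read_capped_mantissa(const char *s, int max_digits, int *exp2)
{
    uint64_t mantissa = 0;
    int kept = 0;
    int in_fraction = 0;

    *exp2 = 0;
    for (; *s != '\0'; s++) {
        int digit;
        if (*s == '.' && !in_fraction) { in_fraction = 1; continue; }
        if (*s >= '0' && *s <= '9') digit = *s - '0';
        else if (*s >= 'a' && *s <= 'f') digit = 10 + (*s - 'a');
        else break;

        if (kept < max_digits) {
            mantissa = (mantissa << 4) | (uint64_t) digit;
            kept++;
            if (in_fraction) *exp2 -= 4;  /* each fraction digit divides by 16 */
        } else if (!in_fraction) {
            *exp2 += 4;                   /* dropped integer digit: treat as zero */
        }
        /* Dropped fraction digits are ignored, i.e. the value is truncated. */
    }
    return mantissa;
}
```

With a cap of 15 digits the mantissa needs at most 60 bits, so it fits in a `uint64_t`. Truncation is not the same as the correct rounding a C99 `strtod` performs, so results may differ in the last bit for over-long inputs.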

kcrossen commented 2 years ago

Added test for "overflow": // char* hex_float_c_str = "+0x1.921fb54442d18abcdefp+0001";

kcrossen commented 2 years ago

Expand value range of tolerated hex floats by 16X:

`void Test_Hexadecimal_Float_Parse ( QString Hexadecimal_Float_String ) {
// PCRE validator of input:
QRegExp validate_hexadecimal_float = QRegExp("[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?", Qt::CaseInsensitive);

if (validate_hexadecimal_float.exactMatch(Hexadecimal_Float_String)) {
    QByteArray example_hex_float_ba = Hexadecimal_Float_String.toLocal8Bit();
    char* hex_float_c_str = (char*) malloc(example_hex_float_ba.count() + 10);
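    // Copy count() + 1 bytes so the terminating NUL is included;
    // otherwise strlen() below would read uninitialized memory.
    strncpy(hex_float_c_str, example_hex_float_ba.data(), example_hex_float_ba.count() + 1);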

    int hex_float_c_str_length = strlen(hex_float_c_str);
    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    unsigned long long numerator = 0;
    int mantissa_power_of_2 = 0;

#define maximum_mantissa_digit_count 15

    int mantissa_digit_count = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if ((ch_idx < (hex_float_c_str_length - 1)) and
        (hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    unsigned long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
    }

    QString test_commands = "test_value=(";
    if (sign < 0) test_commands += "-1*";
    test_commands += QString::number(numerator);
    if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
    test_commands += ")/(";
    test_commands += QString::number(denominator);
    if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
    test_commands += ");";

    // RPN_Commands_Execute executes the argument command string for its side effects ...
    // ... i.e. Calc's internal state (variables).
    // It doesn't care about the results unless there is an error.
    RPN_Commands_Execute(test_commands);

    // Calc_Evaluate executes the argument command string and returns the result
    // Trim_Calc_Result strips the syntactic "sugar" from the returned result.
    QString test_result_internal = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));
    QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));

    qDebug() << "/*--------------------*/";
    qDebug() << Hexadecimal_Float_String;
    qDebug() << test_result;
    qDebug() << test_result_internal;
    qDebug() << "/*--------------------*/";

    free(hex_float_c_str);
}
else {
    qDebug() << "Validation Error: " + Hexadecimal_Float_String;
}

}`
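For readers without Qt, the validation step can be approximated in plain C. The following is a hand-written sketch of the same pattern, `[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?`, matched case-insensitively against the whole string. The function names are hypothetical and not part of calc or Qt.

```c
#include <ctype.h>

/* Advance *cursor over a run of hex digits; return the run length. */
static int skip_hex_digits(const char **cursor)
{
    int n = 0;
    while (isxdigit((unsigned char) **cursor)) { (*cursor)++; n++; }
    return n;
}

/* Advance *cursor over a run of decimal digits; return the run length. */
static int skip_dec_digits(const char **cursor)
{
    int n = 0;
    while (isdigit((unsigned char) **cursor)) { (*cursor)++; n++; }
    return n;
}

/* Return 1 if s matches the whole pattern, 0 otherwise. */
int is_valid_hex_float(const char *s)
{
    if (*s == '+' || *s == '-') s++;
    if (s[0] != '0' || (s[1] != 'x' && s[1] != 'X')) return 0;
    s += 2;
    if (skip_hex_digits(&s) == 0) return 0;      /* need an integer part */
    if (*s == '.') {
        s++;
        if (skip_hex_digits(&s) == 0) return 0;  /* need digits after '.' */
    }
    if (*s == 'p' || *s == 'P') {
        s++;
        if (*s == '+' || *s == '-') s++;
        if (skip_dec_digits(&s) == 0) return 0;  /* need exponent digits */
    }
    return *s == '\0';                           /* no trailing characters */
}
```

This pattern is stricter than C99 itself, which also accepts forms such as `0x.8p0` and `0x1.p0`; the sketch follows the pattern used in this thread rather than the full grammar.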

Test code:

    Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18p+0001");
    Test_Hexadecimal_Float_Parse("+0x0.0000000000000p+0000");
    Test_Hexadecimal_Float_Parse("+0x0.0000000000001p+0000");
    Test_Hexadecimal_Float_Parse("+0x1.0p+0000");
    Test_Hexadecimal_Float_Parse("+0x1.0");
    Test_Hexadecimal_Float_Parse("+0x1");
    // Defined maximum allowable value:
    Test_Hexadecimal_Float_Parse("+0x1.fffffffffffffp+1023");
    // Defined minimum allowable value:
    Test_Hexadecimal_Float_Parse("+0x1.0000000000000p-1074");
    // Test too many hex digits in mantissa:
    Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18abcdefp+0001");

Test results:

    /*--------------------*/
    "+0x1.921fb54442d18p+0001"
    "3.14159265358979311599796346854419"
    "884279719003555/281474976710656"
    /*--------------------*/
    /*--------------------*/
    "+0x0.0000000000000p+0000"
    "0"
    "0"
    /*--------------------*/
    /*--------------------*/
    "+0x0.0000000000001p+0000"
    "0.00000000000000011102230246251565"
    "1/9007199254740992"
    /*--------------------*/
    /*--------------------*/
    "+0x1.0p+0000"
    "1"
    "1"
    /*--------------------*/
    /*--------------------*/
    "+0x1.0"
    "1"
    "1"
    /*--------------------*/
    /*--------------------*/
    "+0x1"
    "1"
    "1"
    /*--------------------*/
    /*--------------------*/
    "+0x1.fffffffffffffp+1023"
    "179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
    "179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
    /*--------------------*/
    /*--------------------*/
    "+0x1.0000000000000p-1074"
    "0"
    "1/202402253307310618352495346718917307049556649764142118356901358027430339567995346891960383701437124495187077864316811911389808737385793476867013399940738509921517424276566361364466907742093216341239767678472745068562007483424692698618103355649159556340810056512358769552333414615230502532186327508646006263307707741093494784"
    /*--------------------*/
    /*--------------------*/
    "+0x1.921fb54442d18abcdefp+0001"
    "3.14159265358979311599796346854419"
    "884279719003555/281474976710656"
    /*--------------------*/
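One way to cross-check fractions like these is to let the C library parse the literal and then split the resulting double into an odd integer times a power of two. The sketch below assumes IEEE 754 binary64 doubles and a C99 `strtod`. `decompose_double` is a hypothetical helper, not part of calc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Split a finite double into sign, odd integer mantissa and power of two,
   so that value == sign * mantissa * 2^exp2. Zero yields mantissa 0, exp2 0.
   Assumes IEEE 754 binary64; infinities and NaNs are not handled. */
void decompose_double(double value, int *sign, uint64_t *mantissa, int *exp2)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);

    uint64_t fraction = bits & ((UINT64_C(1) << 52) - 1);
    int exponent_field = (int) ((bits >> 52) & 0x7ff);

    *sign = (bits >> 63) ? -1 : 1;
    if (exponent_field == 0) {              /* zero or subnormal */
        *mantissa = fraction;
        *exp2 = -1074;
    } else {                                /* normal: restore implicit leading 1 */
        *mantissa = fraction | (UINT64_C(1) << 52);
        *exp2 = exponent_field - 1075;
    }
    if (*mantissa == 0) { *exp2 = 0; return; }
    while ((*mantissa & 1) == 0) {          /* reduce: drop common factors of two */
        *mantissa >>= 1;
        (*exp2)++;
    }
}
```

For `strtod("0x1.921fb54442d18p+1", NULL)` this gives 884279719003555 × 2^-48, which agrees with the first fraction above since 2^48 = 281474976710656. For `0x0.0000000000001p+0` a C99 `strtod` yields 1 × 2^-52, which may be worth comparing against the corresponding entry.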

I've tried to use something approximating usual C style (excepting the use of array notation).

Looking at Calc's parser, fully integrating this code at the parsing level looks to be well beyond my competence. And of course, integrated at that level, one could support quadruple-precision hex floats, etc.

Have fun.

pmetzger commented 2 years ago

@kcrossen May I ask why you submitted all this code as comments? There is a pull request facility.

Saldef commented 2 years ago

I think Fabrice Bellard already did this long ago in his NumCalc app, along with a lot of other features. Check it out here: http://numcalc.com/

kcrossen commented 2 years ago

I don't understand the bulk of (or nearly any of) the parsing code, which makes the usual form of posting this problematic.

kcrossen commented 2 years ago

Furthermore, I don't know how to use the relevant GitHub tools (which I use for my own code much as I used to use SourceForge).

lcn2 commented 2 years ago

Calc is maintained on GitHub, not SourceForge. GitHub has lots of good documentation that you should consider.

pmetzger commented 2 years ago

@kcrossen GitHub is mostly just git plus a web interface.

lcn2 commented 2 years ago

We hope to address this, perhaps sometime next month, in a 2.14.1.x non-production release.

lcn2 commented 11 months ago

This issue will be part of calc v3; see issue #103. Closing this issue so that any further discussion may occur under issue #103.

lcn2 / calc

Enhancement: support hex floats #14
