I like this idea. Anyone want to take a crack at modifying the parser to permit such hexadecimal floats?
`
// char* hex_float_c_str;
// PCRE validator of input:
// "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
// Test examples
// char* example_hex_float = "+0x1.921fb54442d18p+0001";
// char* example_hex_float = "+0x0.0000000000000p+0000";
// char* example_hex_float = "+0x0.0000000000001p+0000";
// char* example_hex_float = "+0x1.0p+0000";
// char* example_hex_float = "+0x1.0";
// char* example_hex_float = "+0x1";
// Keeping sign separate makes mantissa testing simpler
int sign = 1;
long long numerator = 0;
int mantissa_power_of_2 = 0;
// Keeping sign separate makes power testing simpler
int power_sign = 1;
int power_of_2 = 0;
// Introduce some lenience
bool has_fraction = false;
bool has_power = false;
// Keeping it simpler later in code
for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);
int ch_idx = 0;
if ((hex_float_c_str[ch_idx] == '+') or
(hex_float_c_str[ch_idx] == '-')) {
if (hex_float_c_str[ch_idx] == '-') sign = -1;
ch_idx++;
}
if (example_hex_float.mid(ch_idx, 2) == "0x") {
ch_idx += 2;
while (ch_idx < example_hex_float.length()) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
ch_idx++;
}
else if ((hex_float_c_str[ch_idx] >= 'a') and
(hex_float_c_str[ch_idx] <= 'f')) {
numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
ch_idx++;
}
else break;
}
}
if ((ch_idx < example_hex_float.length()) and
(hex_float_c_str[ch_idx] == '.')) {
ch_idx++;
// Require fractional part after radix point
has_fraction = true;
if (numerator == 0) {
// Literal must have started with "0x0." ...
// ... i.e. not normalized, therefore ...
mantissa_power_of_2 -= 1;
}
while (ch_idx < example_hex_float.length()) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
mantissa_power_of_2 -= 4;
ch_idx++;
}
else if ((hex_float_c_str[ch_idx] >= 'a') and
(hex_float_c_str[ch_idx] <= 'f')) {
numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
mantissa_power_of_2 -= 4;
ch_idx++;
}
else break;
}
}
if ((ch_idx < example_hex_float.length()) and
(hex_float_c_str[ch_idx] == 'p')) {
ch_idx++;
// Not lenient here, must finish power if started
has_power = true;
if ((hex_float_c_str[ch_idx] == '+') or
(hex_float_c_str[ch_idx] == '-')) {
if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
ch_idx++;
}
while (ch_idx < example_hex_float.length()) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
ch_idx++;
}
else break;
}
}
// Assemble numerator & denominator
long long denominator = 1;
// Reduction is easy here since the only ...
// ... prime factor of denominator is two
if (has_fraction) {
// Otherwise denominator must already be one
while ((numerator > 1) and
((numerator & 1) == 0) and
// OK, maybe a little lenience
(mantissa_power_of_2 != 0)) {
numerator = numerator >> 1;
mantissa_power_of_2 += 1;
}
}
if ((numerator > 0) and
(has_fraction or has_power)) {
// Only way denominator can be other than one
power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
long long power_multiplier = 1;
power_multiplier = power_multiplier << abs(power_of_2);
if (power_of_2 > 0) {
numerator = numerator * power_multiplier;
}
else if (power_of_2 < 0) {
denominator = power_multiplier;
}
}
numerator = sign * numerator;
`
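The validator pattern quoted in the comments above can be tried on its own before any digit parsing happens. The following is a minimal sketch using the standard `<regex>` library rather than PCRE or Qt; the function name `looks_like_hex_float` is hypothetical, and case-insensitivity is spelled out with explicit character classes instead of a flag.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of `text` matches the thread's
// validator pattern, accepting upper- and lower-case hex digits, 'x' and 'p'.
bool looks_like_hex_float(const std::string& text) {
    static const std::regex pattern(
        "[+-]?0[xX][0-9a-fA-F]+([.][0-9a-fA-F]+)?([pP][+-]?[0-9]+)?");
    // regex_match succeeds only if the entire string matches the pattern.
    return std::regex_match(text, pattern);
}
```

Note that this pattern is stricter than the C99 grammar, which also allows forms such as `0x1.p0` and `0x.8p0` with digits on only one side of the radix point.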
Testing usefulness in Calc (my version):
QString test_commands = "";
test_commands += QString("test_value=") + QString::number(numerator) + "/" + QString::number(denominator) + ";";
RPN_Commands_Execute(test_commands);
QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));
qDebug() << test_result;
The above code will "overflow" or "underflow" because of the limitations of long long, so:
`
// char* hex_float_c_str;
// PCRE validator of input:
// "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
// Test examples
// char* hex_float_c_str = "+0x1.921fb54442d18p+0001";
// char* hex_float_c_str = "+0x0.0000000000000p+0000";
// char* hex_float_c_str = "+0x0.0000000000001p+0000";
// char* hex_float_c_str = "+0x1.0p+0000";
// char* hex_float_c_str = "+0x1.0";
// char* hex_float_c_str = "+0x1";
// Max:
// char* hex_float_c_str = "+0x1.fffffffffffffp+1023";
// Min:
// char* hex_float_c_str = "+0x1.0000000000000p-1074";
int hex_float_c_str_length = strlen(hex_float_c_str);
// Keeping sign separate makes mantissa testing simpler
int sign = 1;
long long numerator = 0;
int mantissa_power_of_2 = 0;
int mantissa_digit_count = 0;
// Keeping sign separate makes power testing simpler
int power_sign = 1;
int power_of_2 = 0;
// Introduce some lenience
bool has_fraction = false;
bool has_power = false;
for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);
int ch_idx = 0;
if ((hex_float_c_str[ch_idx] == '+') or
(hex_float_c_str[ch_idx] == '-')) {
if (hex_float_c_str[ch_idx] == '-') sign = -1;
ch_idx++;
}
if ((ch_idx < (hex_float_c_str_length - 1)) and
(hex_float_c_str[ch_idx] == '0') and
(hex_float_c_str[ch_idx + 1] == 'x')) {
ch_idx += 2;
while (ch_idx < hex_float_c_str_length) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
// Prevent overflow, this seriously violates standard ...
if (mantissa_digit_count < 14) {
numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
mantissa_digit_count += 1;
}
else {
// ... but cope by effectively treating ...
// ... remaining digits as zero
mantissa_power_of_2 += 4;
}
ch_idx++;
}
else if ((hex_float_c_str[ch_idx] >= 'a') and
(hex_float_c_str[ch_idx] <= 'f')) {
// Prevent overflow, this seriously violates standard ...
if (mantissa_digit_count < 14) {
numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
mantissa_digit_count += 1;
}
else {
// ... but cope by effectively treating ...
// ... remaining digits as zero
mantissa_power_of_2 += 4;
}
ch_idx++;
}
else break;
}
}
if ((ch_idx < hex_float_c_str_length) and
(hex_float_c_str[ch_idx] == '.')) {
ch_idx++;
// Require fractional part after radix point
has_fraction = true;
if (numerator == 0) {
// Literal must have started with "0x0." ...
// ... i.e. not normalized, therefore ...
mantissa_power_of_2 -= 1;
}
while (ch_idx < hex_float_c_str_length) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
// Prevent overflow, parses what standard allows
if (mantissa_digit_count < 14) {
numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
mantissa_power_of_2 -= 4;
mantissa_digit_count += 1;
}
ch_idx++;
}
else if ((hex_float_c_str[ch_idx] >= 'a') and
(hex_float_c_str[ch_idx] <= 'f')) {
// Prevent overflow, parses what standard allows
if (mantissa_digit_count < 14) {
numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
mantissa_power_of_2 -= 4;
mantissa_digit_count += 1;
}
ch_idx++;
}
else break;
}
}
if ((ch_idx < hex_float_c_str_length) and
(hex_float_c_str[ch_idx] == 'p')) {
ch_idx++;
// Not lenient here, must finish power if started
has_power = true;
if ((hex_float_c_str[ch_idx] == '+') or
(hex_float_c_str[ch_idx] == '-')) {
if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
ch_idx++;
}
while (ch_idx < hex_float_c_str_length) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
ch_idx++;
}
else break;
}
}
// Assemble numerator & denominator
long long denominator = 1;
// Reduction is easy here since the only ...
// ... prime factor of denominator is two
if (has_fraction) {
// Otherwise denominator must already be one
while ((numerator > 1) and
((numerator & 1) == 0) and
// OK, maybe a little lenience
(mantissa_power_of_2 != 0)) {
numerator = numerator >> 1;
mantissa_power_of_2 += 1;
}
}
if ((numerator > 0) and
(has_fraction or has_power)) {
// Only way denominator can be other than one
power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
}
QString test_commands = "test_value=(";
if (sign < 0) test_commands += "-1*";
test_commands += QString::number(numerator);
if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
test_commands += ")/(";
test_commands += QString::number(denominator);
if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
test_commands += ");";
RPN_Commands_Execute(test_commands);
QString test_result = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));
qDebug() << test_result;
` The individual components, mantissa and power, will stay within the range of long long if the input follows the standard. Excess mantissa digits are ignored if they come after the radix point, or effectively replaced with zeros if they come before it.
Added test for "overflow": // char* hex_float_c_str = "+0x1.921fb54442d18abcdefp+0001";
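The digit-capping rule described above (drop excess fractional digits, treat excess integer digits as zeros) can be shown in isolation. This is a rough sketch in plain C++ with no Qt or Calc dependencies, not the code from this thread: `HexFloatParts`, `hex_digit_value`, `split_hex_float` and `to_double` are hypothetical names, the input is assumed to have already passed the validator pattern, and, unlike the code above, leading zeros are not counted against the cap.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical result type: value == (negative ? -1 : +1) * mantissa * 2^exponent2.
struct HexFloatParts {
    bool negative = false;
    std::uint64_t mantissa = 0;
    int exponent2 = 0;
};

// Value of one hex digit, or -1 if c is not a hex digit.
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return 10 + (c - 'a');
    if (c >= 'A' && c <= 'F') return 10 + (c - 'A');
    return -1;
}

// Assumes text already matches the validator pattern quoted in the thread.
// max_digits must not exceed 16, so that the mantissa fits in 64 bits.
HexFloatParts split_hex_float(const std::string& text, int max_digits = 14) {
    HexFloatParts parts;
    std::size_t i = 0;
    int kept = 0;             // significant hex digits stored so far
    bool fractional = false;  // true once the radix point has been seen

    if (text[i] == '+' || text[i] == '-') parts.negative = (text[i++] == '-');
    i += 2;  // skip the "0x" prefix

    for (; i < text.size(); ++i) {
        if (text[i] == '.') { fractional = true; continue; }
        const int digit = hex_digit_value(text[i]);
        if (digit < 0) break;  // reached 'p' or the end of the mantissa
        if (kept < max_digits) {
            parts.mantissa = (parts.mantissa << 4) | static_cast<std::uint64_t>(digit);
            if (parts.mantissa != 0) ++kept;       // leading zeros do not use up the cap
            if (fractional) parts.exponent2 -= 4;  // one hex digit is four binary places
        } else if (!fractional) {
            parts.exponent2 += 4;  // dropped integer digit is treated as a zero
        }                          // dropped fractional digit is simply ignored
    }

    if (i < text.size() && (text[i] == 'p' || text[i] == 'P')) {
        // strtol accepts an optional sign and leading zeros, e.g. "+0001".
        parts.exponent2 += static_cast<int>(std::strtol(text.c_str() + i + 1, nullptr, 10));
    }
    return parts;
}

// Convert to double for a quick check; exact whenever the mantissa is below 2^53.
double to_double(const HexFloatParts& parts) {
    const double magnitude = std::ldexp(static_cast<double>(parts.mantissa), parts.exponent2);
    return parts.negative ? -magnitude : magnitude;
}
```

With the default cap of 14 digits, `"+0x1.921fb54442d18p+0001"` and the over-long `"+0x1.921fb54442d18abcdefp+0001"` yield the same mantissa and exponent. Keeping the value as an integer mantissa plus a power of two means nothing is rounded until the caller decides how to represent it.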
Expand value range of tolerated hex floats by 16X:
`void Test_Hexadecimal_Float_Parse ( QString Hexadecimal_Float_String ) {
// PCRE validator of input:
QRegExp validate_hexadecimal_float = QRegExp("[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?", Qt::CaseInsensitive);
if (validate_hexadecimal_float.exactMatch(Hexadecimal_Float_String)) {
QByteArray example_hex_float_ba = Hexadecimal_Float_String.toLocal8Bit();
char* hex_float_c_str = (char*) malloc(example_hex_float_ba.count() + 10);
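strncpy(hex_float_c_str, example_hex_float_ba.data(), example_hex_float_ba.count());
// strncpy does not append a terminator when the source fills the count, so add one
hex_float_c_str[example_hex_float_ba.count()] = '\0';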
int hex_float_c_str_length = strlen(hex_float_c_str);
// Keeping sign separate makes mantissa testing simpler
int sign = 1;
unsigned long long numerator = 0;
int mantissa_power_of_2 = 0;
int mantissa_digit_count = 0;
// Keeping sign separate makes power testing simpler
int power_sign = 1;
int power_of_2 = 0;
// Introduce some lenience
bool has_fraction = false;
bool has_power = false;
for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);
int ch_idx = 0;
if ((hex_float_c_str[ch_idx] == '+') or
(hex_float_c_str[ch_idx] == '-')) {
if (hex_float_c_str[ch_idx] == '-') sign = -1;
ch_idx++;
}
if ((ch_idx < (hex_float_c_str_length - 1)) and
(hex_float_c_str[ch_idx] == '0') and
(hex_float_c_str[ch_idx + 1] == 'x')) {
ch_idx += 2;
while (ch_idx < hex_float_c_str_length) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
// Prevent overflow, this seriously violates standard ...
if (mantissa_digit_count < maximum_mantissa_digit_count) {
numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
mantissa_digit_count += 1;
}
else {
// ... but cope by effectively treating ...
// ... remaining digits as zero
mantissa_power_of_2 += 4;
}
ch_idx++;
}
else if ((hex_float_c_str[ch_idx] >= 'a') and
(hex_float_c_str[ch_idx] <= 'f')) {
// Prevent overflow, this seriously violates standard ...
if (mantissa_digit_count < maximum_mantissa_digit_count) {
numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
mantissa_digit_count += 1;
}
else {
// ... but cope by effectively treating ...
// ... remaining digits as zero
mantissa_power_of_2 += 4;
}
ch_idx++;
}
else break;
}
}
if ((ch_idx < hex_float_c_str_length) and
(hex_float_c_str[ch_idx] == '.')) {
ch_idx++;
// Require fractional part after radix point
has_fraction = true;
if (numerator == 0) {
// Literal must have started with "0x0." ...
// ... i.e. not normalized, therefore ...
mantissa_power_of_2 -= 1;
}
while (ch_idx < hex_float_c_str_length) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
// Prevent overflow, parses what standard allows
if (mantissa_digit_count < maximum_mantissa_digit_count) {
numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
mantissa_power_of_2 -= 4;
mantissa_digit_count += 1;
}
ch_idx++;
}
else if ((hex_float_c_str[ch_idx] >= 'a') and
(hex_float_c_str[ch_idx] <= 'f')) {
// Prevent overflow, parses what standard allows
if (mantissa_digit_count < maximum_mantissa_digit_count) {
numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
mantissa_power_of_2 -= 4;
mantissa_digit_count += 1;
}
ch_idx++;
}
else break;
}
}
if ((ch_idx < hex_float_c_str_length) and
(hex_float_c_str[ch_idx] == 'p')) {
ch_idx++;
// Not lenient here, must finish power if started
has_power = true;
if ((hex_float_c_str[ch_idx] == '+') or
(hex_float_c_str[ch_idx] == '-')) {
if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
ch_idx++;
}
while (ch_idx < hex_float_c_str_length) {
if ((hex_float_c_str[ch_idx] >= '0') and
(hex_float_c_str[ch_idx] <= '9')) {
power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
ch_idx++;
}
else break;
}
}
// Assemble numerator & denominator
unsigned long long denominator = 1;
// Reduction is easy here since the only ...
// ... prime factor of denominator is two
if (has_fraction) {
// Otherwise denominator must already be one
while ((numerator > 1) and
((numerator & 1) == 0) and
// OK, maybe a little lenience
(mantissa_power_of_2 != 0)) {
numerator = numerator >> 1;
mantissa_power_of_2 += 1;
}
}
if ((numerator > 0) and
(has_fraction or has_power)) {
// Only way denominator can be other than one
power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
}
QString test_commands = "test_value=(";
if (sign < 0) test_commands += "-1*";
test_commands += QString::number(numerator);
if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
test_commands += ")/(";
test_commands += QString::number(denominator);
if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
test_commands += ");";
// RPN_Commands_Execute executes the argument command string for its side effects ...
// ... i.e. Calc's internal state (variables).
// It doesn't care about the results unless there is an error.
RPN_Commands_Execute(test_commands);
// Calc_Evaluate executes the argument command string and returns the result
// Trim_Calc_Result strips the syntactic "sugar" from the returned result.
QString test_result_internal = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));
QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));
qDebug() << "/*--------------------*/";
qDebug() << Hexadecimal_Float_String;
qDebug() << test_result;
qDebug() << test_result_internal;
qDebug() << "/*--------------------*/";
free(hex_float_c_str);
}
else {
qDebug() << "Validation Error: " + Hexadecimal_Float_String;
}
}`
Test code:
Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18p+0001");
Test_Hexadecimal_Float_Parse("+0x0.0000000000000p+0000");
Test_Hexadecimal_Float_Parse("+0x0.0000000000001p+0000");
Test_Hexadecimal_Float_Parse("+0x1.0p+0000");
Test_Hexadecimal_Float_Parse("+0x1.0");
Test_Hexadecimal_Float_Parse("+0x1");
// Defined maximum allowable value:
Test_Hexadecimal_Float_Parse("+0x1.fffffffffffffp+1023");
// Defined minimum allowable value:
Test_Hexadecimal_Float_Parse("+0x1.0000000000000p-1074");
// Test too many hex digits in mantissa:
Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18abcdefp+0001");
Test results:
/*--------------------*/
"+0x1.921fb54442d18p+0001"
"3.14159265358979311599796346854419"
"884279719003555/281474976710656"
/*--------------------*/
/*--------------------*/
"+0x0.0000000000000p+0000"
"0"
"0"
/*--------------------*/
/*--------------------*/
"+0x0.0000000000001p+0000"
"0.00000000000000011102230246251565"
"1/9007199254740992"
/*--------------------*/
/*--------------------*/
"+0x1.0p+0000"
"1"
"1"
/*--------------------*/
/*--------------------*/
"+0x1.0"
"1"
"1"
/*--------------------*/
/*--------------------*/
"+0x1"
"1"
"1"
/*--------------------*/
/*--------------------*/
"+0x1.fffffffffffffp+1023"
"179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
"179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
/*--------------------*/
/*--------------------*/
"+0x1.0000000000000p-1074"
"0"
"1/202402253307310618352495346718917307049556649764142118356901358027430339567995346891960383701437124495187077864316811911389808737385793476867013399940738509921517424276566361364466907742093216341239767678472745068562007483424692698618103355649159556340810056512358769552333414615230502532186327508646006263307707741093494784"
/*--------------------*/
/*--------------------*/
"+0x1.921fb54442d18abcdefp+0001"
"3.14159265358979311599796346854419"
"884279719003555/281474976710656"
/*--------------------*/
I've tried to use something approximating usual C style (excepting the use of array notation).
Looking at Calc parsing, it looks to be well beyond my competence to fully integrate this code at the parsing level. And of course, integrated at that level, one could support quadruple hex floats, etc.
Have fun.
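One way to cross-check results like those listed above is to compare against the C library, since `strtod` has accepted hexadecimal floats since C99 (and `std::strtod` since C++11). The sketch below is only a reference check, not part of Calc; the helper name `reference_parse` is hypothetical, and the default "C" locale is assumed so that `.` is the radix character.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: parse text with the C library and report whether the
// whole string was consumed. The parsed value is stored in `value`.
bool reference_parse(const char* text, double& value) {
    char* end = nullptr;
    value = std::strtod(text, &end);
    // Success means at least one character was read and nothing was left over.
    return end != text && *end == '\0';
}
```

Under the C99 reading each fractional hex digit contributes four binary places, so `strtod` returns 2^-52 (1/4503599627370496) for `"+0x0.0000000000001p+0000"`. That is twice the `1/9007199254740992` shown in the test results above, which suggests the extra `mantissa_power_of_2 -= 1` adjustment for literals starting with `0x0.` deserves a second look.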
@kcrossen May I ask why you submitted all this code as comments? There is a merge request facility?
I think Fabrice Bellard already did this long ago in his NumCalc app, along with a lot of other features. Check it out here: http://numcalc.com/
I don't understand the bulk of (nearly any of) the parsing code, which makes the usual form of posting this problematic.
Furthermore, I don't know how to use the relevant github tools (which I use for my own code about like I used to use sourceforge).
Calc is maintained on GitHub, not sourceforge. GitHub has lots of good documentation that you should consider.
@kcrossen Github is mostly just git plus a web interface.
We hope to address this, perhaps sometime next month, in a 2.14.1.x non-production release.
This issue will be part of calc v3: see issue #103. Closing this issue so that any further discussion may occur under issue #103
Hexadecimal floats are a format for floating-point values that has been supported in C since C99. The mantissa is written in hex and the exponent as a power of two. This is useful because it shows the exact number with no rounding or decimal approximation.
E.g. 123.98699951171875 is `0x1.eff2bp+6`.
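To illustrate the "no rounding" point, here is a small sketch that prints a double in this notation with the `%a` conversion of `snprintf` and reads it back with `strtod`. The function names `to_hex_float` and `from_hex_float` are hypothetical, and the exact text `%a` produces (for example the leading digit) may vary between C libraries.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Format a double as a hexadecimal float using printf's %a conversion.
std::string to_hex_float(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%a", value);
    return buffer;
}

// Parse a hexadecimal float back into a double; strtod accepts this
// notation since C99 / C++11.
double from_hex_float(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

A value such as 0.1, which has no finite binary expansion, still survives `from_hex_float(to_hex_float(0.1))` unchanged, because the hex digits spell out the stored bits directly. Likewise `from_hex_float("0x1.eff2bp+6")` gives exactly 123.98699951171875.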