DevSolar / pdclib

The Public Domain C Library
https://pdclib.rootdirectory.de
Creative Commons Zero v1.0 Universal
219 stars 41 forks source link

printf (and variants thereof) doesn't support float values #5

Open dracc opened 5 years ago

dracc commented 5 years ago

printf/sprintf/... currently ignores values flagged as float. https://github.com/DevSolar/pdclib/blob/8e10fcf7fd3bac2c5640a86fa165e7fa8c411803/functions/_PDCLIB/_PDCLIB_print.c#L452-L458

It would be very useful to have the capability to print such values. Here's a FreeBSD implementation for reference (thanks to @mborgerson for the link).

DevSolar commented 5 years ago

Indeed floating point support is currently absent from PDCLib, as stated on the status page:

https://pdclib.rootdirectory.de/source#status

The limiting factor is not that I would not know how to do it, but rather that "getting it right" would require various interconnecting pieces of support and configuration code I simply have not come around to implementing. It is also at the bottom of the roadmap, together with et al.

Copying FreeBSD code is not an option, both because of internal structure and the fact that I cannot take BSD-licensed code and republish it as CC0 / public domain.

JayFoxRox commented 5 years ago

I think it would still be good to keep the issue opened, so people know this problem exists (also making it more likely that someone will attempt to implement it). The issue already provides more information than the README, too (which only has an entry from almost 15 years ago).

DevSolar commented 4 years ago

I've been having a closer look at what would be involved in adding floating point support... Dragon4, Grisu3 and all that.

I'll probably correct the existing implementation soon, insofar that it at least consumes its corresponding argument from the stack (which it doesn't, at this point), and in case of *scanf, at least zero-initializes it, to avoid UB issues. A full conversion will take some work.

But I've added corresponding remarks to the project homepage (on both the front / status page and the new "drawing board").

JayFoxRox commented 4 years ago

This issue keeps coming up in discussions about our toolchain (which uses pdclib as default libc). Users are typically surprised by the lack of float support. I feel like this GitHub issue should not have been closed until the support actually lands.

The issue is still valid: floats still don't work in those functions.

We don't expect a fix (even though it would be nice), but just keeping the GitHub issue open (or re-opening it) would help with visibility of this problem.

DevSolar commented 4 years ago

A solution is (obviously) doable, but... what takes priority? or float support for printf()?

JayFoxRox commented 4 years ago

Like stated above: we don't expect a solution for this right now - no code has to be written. We'd just prefer if this GitHub issue was still open (it's currently closed). It should still be opened because:

So all I care about for now is that the issue is reopened and remains open (and visible) until it's fixed.


To answer your question regardless: I think gmtime is easier and more pressing as apps currently use a branch of our pdclib fork which already has the feature (which requires https://github.com/DevSolar/pdclib/pull/12 or https://github.com/XboxDev/nxdk-pdclib/pull/21).

However, the float issue personally affects me more and we don't have a generic workaround. It also seems to come up more often (and most people work around it manually by changing their apps - which is problematic as they'll have to change them back eventually).

But again: I mostly care about this GitHub issue being open / visible to reflect the status of this feature. Work on whatever feature you prefer.

DevSolar commented 4 years ago

A thing I could do is have printf() write something like "float" (ignoring formatting) when encountering a FP conversion specifier, and at least handle the argument list correctly. That wouldn't help with scanf(), though (which would have to be capable of parsing FPs from the input even for dummy functionality).

Would that be a valuable improvement over the current non-implementation?

JayFoxRox commented 4 years ago

I'm not sure. It might overflow strings as it wouldn't respect length of a float (like "%.1f" only being 3 symbols if the float is in range 0.0 to 1.0).

I think at some point we had discussed taking floats from the stack / FPU stack (via va_arg) for these functions to avoid corruptions (I think that's what you mean by "handling argument list correctly"); we should do that. scanf should probably also consume the pointer and set the float to a known value like 0.0f (it's bad, but probably not as bad as random garbage like stack contents that are now being interpreted as float variable).

That said, I personally don't care too much - this issue is visibly documented again (thanks for re-opening!) and if people run into it they'll have to find a solution anyway; I don't think a stub would change that. So I think this should be solved properly sometime soon (sometime this year) - adding temporary hacks is probably just a waste of time.

JayFoxRox commented 3 years ago

A couple of months ago, I watched a video of a talk, about complexity of this kind of algorithm (this one).

It made me think that https://github.com/ulfjack/ryu does look like a good candidate for a reference implementation, as it's a relatively simple algorithm:

DevSolar commented 3 years ago

"Reimplementing under different license" is a very touchy business; I'd rather not go there.

I took great pains to ensure that any reference implementations I used for PDCLib were PD, as to not jeopardize the legality of the package.

I'll have a look at the Ryu algorithm paper once I pick up work on PDCLib again, and perhaps prioritize the printf -- conversion interface so that existing implementations can be dropped in if you don't bother tjat much about the license.

DevSolar commented 3 years ago

Big Integer support (which is required to get a float-to-string conversion going) is "completed" (might be I am missing a function or two, or have implemented functions I will turn out not needing, but there).

I also cleaned up _PDCLIB_print.c a bit so that the actual type conversions / formattings end up in separate translation units (_PDCLIB_print_string.c, _PDCLIB_print_integer.c). This leaves _PDCLIB_print_float.c to be filled with Dragon4 (or Grisu3 if I get ambitious).

DevSolar commented 3 years ago

printf() now works for %a (hexadecimal). I'm now working on strtod() et al., and then scanf(). Once hex support is complete (which is easiest to do as it does not need the more involved conversion math that decimal formats do), I'll start with the decimal stuff.

glebm commented 6 months ago
  1. sqlite has a public domain implementation of printf here: https://github.com/sqlite/sqlite/blob/master/src/printf.c (the core routine for decoding doubles is here: https://github.com/sqlite/sqlite/blob/6161cdd446eec170f7c0edf58d0d72a2cd3c5f85/src/util.c#L974)
  2. sqlite also has a public domain implementation of double parsing here: https://github.com/sqlite/sqlite/blob/6161cdd446eec170f7c0edf58d0d72a2cd3c5f85/src/util.c#L478
DevSolar commented 6 months ago

I will give that implementation a good look with regards to precision and round-trip stability. Something printed with printf() and scanned with strtod() should result in the same binary representation, or you end up with "drifting" values -- which is the whole reason this is such an involved subject that I shelved in 2022 when I thought interest had waned and a new 9-to-5 job took all my attention for a while.

Thanks for prodding me.

glebm commented 5 months ago

It's not exactly what you're talking about but sqlite does have this test for FP conversion that verifies that float->text maintains at least 15 digits of precision: https://github.com/sqlite/sqlite/blob/6161cdd446eec170f7c0edf58d0d72a2cd3c5f85/test/fpconv1.test

DevSolar commented 5 months ago

Thank you for that reference.

As usual when I get prodded back into action, I am making good progress on this, and hopefully don't keep you waiting for much longer.

DevSolar commented 5 months ago

With #cfb71693, I added David M. Gay's dtoa.c to PDCLib, admitting defeat (for now) on the FP conversion front. That should give us strtod() out of the box, with integration into scanf() and printf() the next thing to do. There has not been any detailed testing on the configuration yet, but the weekend is over and I wanted to get this pushed before I return to the 9-to-5 work.

glebm commented 5 months ago

Is dtoa.c that much better than the SQLite implementation?

DevSolar commented 5 months ago

@glebm In a word, yes. ;-) dtoa.c is the implementation used by BSD libc, PHP, Python, and Java, is highly optimized and actively maintained. The only thing that could be better about it would be it being under CC0 license to better mesh with the rest of PDCLib in that regard...

Exploring Binary is full of peer-review, too, providing me with a lot of interesting test cases I will add soon to make sure the configuration is correct.

glebm commented 2 months ago

Checking back on this, what is still missing? Only printf/scanf dispatch?

DevSolar commented 1 month ago

Unfortunately it is a bit more involved than that, "if you want to do it correctly". (The bane of standard functionality.)

To do rounding correctly, the gdtoa code requires a correctly set FLT_ROUNDS (from ). That in turn needs to be updated when the rounding mode is set to a new value (fesetround() in ). Which GCC does not do, as I found out... So I am in the process of implementing the stuff, which is mostly a line of inline assembly surrounded by a bit of boilerplate.

Unfortunately the documentation of those opcodes is rather slim, and I ran into some problems verifying that I did it right on ARM. (The intermediate work I did is in the 'feature' branch on github.) That is where I left off, having some personal matters to attend to.

Once that is done and tested, the results from the gdtoa code need to be appropriately utilized by printf() / scanf(). The output of dtoa is a bit funny (implicit decimal point), so printf() will need some additional logic for setting that (which, in turn, depends on ).

Hope that helps.

glebm commented 1 month ago

Thank you for the update! 🙇

skissane commented 1 month ago

This library being CC0 is a big attraction to some people – other people don't care about that much at all.

For people who care about it being CC0, the fact you've now added dtoa which isn't, is a negative. Whereas, others are more interested in a more accurate/correct implementation than licensing issues, so they'd view your addition of dtoa positively.

Of course, you can't please everybody. But a suggestion: why not add a parallel CC0 implementation (e.g. taken from SQLite as suggested above), with a build-time option to control which one gets used? Then people who want maximum numerical accuracy and correctness can go with dtoa. Whereas people who want pure CC0 can elect to build with the CC0 implementation, and pay the accuracy/correctness price as a result

DevSolar commented 1 month ago

That would indeed be an option at some point. However I did not see how the referenced Sqlite code could be leveraged to get the functionality required for printf / scanf, whereas it was apparent with gdtoa. So as it stands now I did not choose between two viable options, but having the functionality vs. not having it.

I understand that CC0 is more convenient (which is why I put PDCLib under that license), but I find the gdtoa license to be quite agreeable. Do you consider it a showstopper, or is it a mere inconvenience?

skissane commented 1 month ago

I will be honest and say I'm not actually using this library right now, so I can't say it is a "showstopper" for me. But, I am thinking about using it in the future. If I did, I'd probably delete the non-CC0 bit and replace it with something else. Honestly that's more about ideological purism than practicality.

Have you seen this: https://github.com/charlesnicholson/nanoprintf – note parts of its test suite is MIT-licensed, but the library itself is available under CC0. (Personally, I don't care that much about the licensing of tests.)