crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.5k stars 1.62k forks source link

Faster Float to String #4308

Closed will closed 7 years ago

will commented 7 years ago

I've made printing floats a lot faster, 1.2-6.37x faster depending on the platform and the float. https://github.com/will/fp

Before showing up with a large patch out of the blue, I wanted to check in and see if that's something the project would want.

ysbaddaden commented 7 years ago

You ported the Grisu3 algorithm? Well done! I mentioned it in #2220 and would love to see this in Crystal :)

bcardiff commented 7 years ago

I think it would be a great addition. 👍

RX14 commented 7 years ago

Just as an aside, you shouldn't do multiple iterations inside a Benchmark.ips report. IPS already optimizes the number of iterations based on warm up runs, and comparing values for single operations is much more intuitive than multiple operations.

asterite commented 7 years ago

@will That's awesome!!

Two additional benefits is that this method won't depend on libc's snprintf anymore (though for Float32 it still does) and an improved accuracy or "correctness", as shown by bench 5e-324. So 👍🎉 for this :-)

RX14 commented 7 years ago

Why is Float32 different?

will commented 7 years ago

@RX14 I added the #ips method ;) . I was doing this when I had 3 things I was benchmarking: (stdlib, new with making a new buffer each time, and new with reusing the buffer) and I wanted to see how much faster things were instead of slower. I didn't want to do math, so I made it so the first one took about a second. That maybe should be an option to ips.

@asterite grisu3 fails on about 0.5% of floats, and you need to fall back to some other method. So unfortunately the sprintf is still there. In the v8 double-conversion library, they fall back to this other method called dragon4 which uses arbitrary precision, but I think that can be left for future improvement.

Float32 support could be added I think. There is some code for that but I haven't looked at it yet. I was also thinking that maybe it wouldn't be so bad to cast up to a float64 first?

asterite commented 7 years ago

I tried to cast to Float64 in the current algorithm but the float changes its value and the end result is different than the original. I don't understand why, but that's what happened. You could probably try it with grisu3 and see what happens. If it works, we should do it :-)

will commented 7 years ago

I'm having trouble with one spec

  1) IO encoding encode prints numbers
     Failure/Error: slice.should eq("0123456789.110.11".encode("UCS-2LE"))

       Expected: Bytes[48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 53, 0, 54, 0, 55, 0, 56, 0, 57, 0, 46, 0, 49, 0, 49, 0, 48, 0, 46, 0, 49, 0, 49, 0]
            got: Bytes[48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 53, 0, 54, 0, 55, 0, 56, 0, 57, 0, 46, 0, 49, 0, 49, 48, 46, 49, 49]

     # spec/std/io/io_spec.cr:778
crystal spec spec/std/io/io_spec.cr:762 # IO encoding encode prints numbers

Where are those 0 bytes coming from?

asterite commented 7 years ago

They come from the UCS-2LE encoding. I think that encoding adds a zero byte after each byte.

It's strange that that spec fails, it doesn't seem to have anything to do with float formatting.

Sija commented 7 years ago

LLVM 3.5 again?

will commented 7 years ago

@Sija good idea to check, I was on llvm 3.6.2, but using 4.0.0 didn't change anything. @asterite my branch is at https://github.com/will/crystal/tree/grisu3 if you want to check. After this encoding thing is figured out, there's some cleanup left, and then I'll send a PR.

will commented 7 years ago

Figured out the issue was I was using IO#write_byte which ignores the encoding.

will commented 7 years ago

Closing to open pr #4333