georust / wkt

Rust read/write support for well-known text (WKT)
https://crates.io/crates/wkt
Apache License 2.0
50 stars 25 forks source link

Use ryu for faster float-to-string conversion #118

Open kylebarron opened 2 months ago

kylebarron commented 2 months ago

In https://github.com/geoarrow/geoarrow-rs/pull/788 @b4l prototyped a custom implementation of writing to WKT, and he said

It is up to twice as fast as wkt and geo.

It doesn't look like this wkt crate uses ryu, and so I'm curious if that's a large part of why his implementation is faster. https://github.com/dtolnay/ryu seems to be a very popular library, and so it doesn't seem like an unreasonable dependency.

It looks like it should be relatively easy to swap in ryu and test if that's significantly faster? It looks like https://github.com/georust/wkt/blob/c3088cdc730c4d643b8535513ddcc3b0efca5e95/src/types/coord.rs#L37-L44 is the main place where floats are converted to strings.

I suppose the primary issue with using ryu is that ryu's Float trait supports only f32 and f64, while WktNum supports a broader range of types, including integers.

Assuming that ryu is indeed quite a bit faster, I think it's worth special casing f32 and f64 if we can, assuming that most real world data will be f64.

(For maintainability reasons, if geoarrow-rs can use wkt directly, that's much more appealing to me than rolling our own).

Any thoughts?

michaelkirk commented 2 months ago

I think that sounds really promising. I’d prefer not to lose support for integer geometries though. I wonder if there’s a way to accommodate both?

lnicola commented 2 months ago

This was discussed a little in https://github.com/rust-lang/rust/issues/52811.

kylebarron commented 2 months ago

I think that sounds really promising. I’d prefer not to lose support for integer geometries though. I wonder if there’s a way to accommodate both?

I'm not exactly sure how that would work because the existing implementation is using a blanket trait implementation.

So if I try to implement something also on ryu::Float:

impl<T> fmt::Display for Coord<T>
where
    T: WktNum + fmt::Display,
{
    fn fmt(&self, f: &mut fmt::Formatter) -> Result<(), fmt::Error> {
        write!(f, "{} {}", self.x, self.y)?;
        if let Some(z) = self.z {
            write!(f, " {}", z)?;
        }
        if let Some(m) = self.m {
            write!(f, " {}", m)?;
        }
        Ok(())
    }
}

impl<T> fmt::Display for Coord<T>
where
    T: WktNum + ryu::Float + fmt::Display,
{
    fn fmt(&self, f: &mut fmt::Formatter) -> Result<(), fmt::Error> {
        write!(f, "{} {}", self.x, self.y)?;
        if let Some(z) = self.z {
            write!(f, " {}", z)?;
        }
        if let Some(m) = self.m {
            write!(f, " {}", m)?;
        }
        Ok(())
    }
}

it errors with:

conflicting implementations of trait `std::fmt::Display` for type `types::coord::Coord<_>`

Is there some way to use a single implementation and tell whether that T implements ryu::Float?

michaelkirk commented 2 months ago

I don't think we can solve it as phrased.

But maybe we could achieve what we want (fast float WKT while still allowing integer WKT) doing something like we did in geo for our integer vs. float Kernel implementations: https://github.com/georust/geo/blob/68f80f851879dd58f146aae47dc2feeea6c83230/geo/src/lib.rs#L350

kylebarron commented 2 months ago

That would essentially remove the blanket implementation and implement on the raw number types instead?

michaelkirk commented 2 months ago

Right - I think that's a fair trade.

I have yet to encounter someone actually using any of our geo libraries for an exotic numeric type, but I'd be interested in seeing how hard it is to add support for them (or to give them the tools to add support for themselves) if we happen to break things for them.

michaelkirk commented 2 months ago

To be clear, I'm suggesting we could break things for exotic numeric types and wait to see if anybody cares before we put a bunch of effort into supporting it.