georust / geojson

Library for serializing the GeoJSON vector GIS file format
https://crates.io/crates/geojson
Apache License 2.0
272 stars 59 forks source link

Use generic type for precision #59

Open frewsxcv opened 7 years ago

frewsxcv commented 7 years ago

http://rust-num.github.io/num/num_traits/float/trait.Float.html

frewsxcv commented 6 years ago

ideally we could utilize https://github.com/rust-lang/rfcs/blob/master/text/0213-defaulted-type-params.md (which is not yet stable), so we can provide a nice default and not hurt ergonomics. e.g.:

struct GeoJson<F: Float = f64> { .. }
michaelkirk commented 4 years ago

By default, it looks like serde_json wants to use f64.

https://github.com/serde-rs/json/blob/fd6741f4b0b3fc90a58a6f578e33a9adc6403f3f/src/number.rs#L31

There is an arbitrary_precision feature we might be able to use, but there may be a perf cost.

frewsxcv commented 4 years ago

If https://github.com/georust/geojson/pull/131 merges, we could consider this GitHub Issue to be completed, since a position can be different precision types

Robinlovelace commented 3 years ago

Heads-up I'm looking at this in a project to try to learn Rust: https://github.com/zonebuilders/zonebuilder-rust/issues/13

Any guidance appreciated. FYI this is how it can be done in the language I understand bes, via GEOSt: https://r-spatial.github.io/sf/reference/st_precision.html

michaelkirk commented 3 years ago

Hi @Robinlovelace - just to make sure that I'm understanding, my read of the documentation you've linked to and GEOS's own documentation say that, ultimately in those libraries, all math is done in f64, but that this "precision model" can be applied to intentionally round the input and output to be less accurate.

It wasn't aware that people sometimes want to have less precise results, but apparently it's sometimes useful! Is that in line with what you're trying to accomplish?

To be clear, I think this issue was originally intended to talk more about how the underlying math could be done in something other than f64 - e.g. f32, f128, or maybe more exotic things like arbitrary precision types.

urschrei commented 3 years ago

My understanding is that the primary reason for rounding up is because (Geo)JSON is relatively heavy to push over the wire / process / store, so it's not something you'd be doing if you were calculating anything if it's avoidable. Ideally, the truncation would happen as a penultimate step, before the data's sent to the client (in a client-server scenario), i.e. as late as possible.

Robinlovelace commented 3 years ago

Agreed, good practice to round as late as possible to avoid any compound precision issues, as recommended by @dabreegster. Plan now, outlined here: https://github.com/zonebuilders/zonebuilder-rust/issues/13

And here is my very rough attempt, learning slowly but surely hopefully: https://github.com/zonebuilders/zonebuilder-rust/pull/16/files

Any further suggestions very much appreciated, thank you!

Robinlovelace commented 3 years ago

And in answer to your question @michaelkirk (many thanks for quick response)

It wasn't aware that people sometimes want to have less precise results, but apparently it's sometimes useful! Is that in line with what you're trying to accomplish?

This is as @urschrei says about serving data as .geojson files in web and other apps for responsiveness. Can dig out an issue but I recall @mvl22 saying that 5 d.p. is around 1 m accuracy IIRC.

Robinlovelace commented 3 years ago

Update, 6 d.p. is around ~1m apparent: https://github.com/cyipt/actdev/issues/36

Robinlovelace commented 3 years ago

Heads-up, this was fixed here in case of use/interest: https://github.com/zonebuilders/zonebuilder-rust/pull/16/files as follows:

fn round(poly: &mut Polygon<f64>, precision: usize) {
    // hardcoded 2 d.p. todo: update
    let p = 10_usize.pow(precision.try_into().unwrap()) as f64;
    poly.map_coords_inplace(|&(x, y)| (f64::trunc(x * p) / p, f64::trunc(y * p) / p))
}

pub fn clockboard(
    centerpoint: Point<f64>,
    params: Params,
    boundary: Option<Polygon<f64>>,
) -> Vec<Polygon<f64>> {
    let mut polygons = Vec::new();
    for i in params.distances {
        println!("{}", i);
        let circle = makecircle(centerpoint, i, params.num_vertices);
        polygons.push(circle);
    }

    for polygon in &mut polygons {
        round(polygon, params.precision);
    }

    polygons
}

Works really well, we can now output geojsons with any number of d.p. set by the user (almost) :tada:

TurtIeSocks commented 1 year ago

@Robinlovelace I did some work on implementing precision generics across the entirety of the package, mostly out of curiosity to see if I could figure it out since I'm pretty new to Rust, but I ran into a few snags.

/src/conversion/to_geo_types.rs

Screenshot 2022-12-01 at 3 18 39 PM

This one is probably a simple fix, I just wasn't able to figure out how to successfully pass a generic through the enum.

src/conversion/mod.rs

Screenshot 2022-12-01 at 3 19 41 PM

This also might be a simple one where a trait just needs to be added or something.

src/utils.rs

Screenshot 2022-12-01 at 3 20 15 PM Screenshot 2022-12-01 at 3 23 02 PM

The root of these two seem to be caused by serde only having guarantees for f64? Not sure if there's a way around them.

I wasn't really sure if I should open a PR to try and get help with these. Following this conversation it seems like maybe this is solvable using a roundabout method and not a problem anymore, but the issue is still open... so is this still desirable to add to the lib?

If it is still desirable, I can open a PR so we can try and figure out these last couple of issues. Also any hints on the best way to run a local version of a lib with an existing project would be great for testing purposes :)

Thanks!

Robinlovelace commented 1 year ago

Wow, great to see activity on this, thanks for the heads-up on this @TurtIeSocks! I'll defer to others on the desirability of a PR (my guess: desirable) but good to see some probing of the issue and hints of solutions :pray:

michaelkirk commented 1 year ago

so is this still desirable to add to the lib?

I don't personally have a use case for anything other than f64 geojson, but maybe other folks do. I'll let them chime in.

Also any hints on the best way to run a local version of a lib with an existing project would be great for testing purposes :)

I usually add something like this to the end of my Cargo.toml

[patch.crates-io]
geojson = { path = "path/to/your/local/checkout/geojson" }

Read more here: https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html

edit: fixed typo in code

michaelkirk commented 10 months ago

In light of #234, I wanted to comment here because it's more about the specification than that particular implementation.

The issue description here requests a change, but doesn't really get into why we'd want to make that change. Another commenter alluded to the fact that sometimes a smaller serialized size ("on the wire") is more important than more precision, so I'm going forward with the assumption that that's the problem we're trying to solve: How to voluntarily lose precision so as to have a smaller serializable size. Maybe I'm misunderstanding - let me know if I'm mistaken.

Since serde_json, which this crate is based on, ultimately stores everything as f64, I think our storing of a "generic" numeric type is superficial and feels a little dishonest - we ultimately always go to and from an f64 anyway.

Introducing generics throughout the codebase seems a bit heavy handed while at the same time not actually a very flexible solution to that particular problem. If what you are concerned about is how many digits are being serialized, I'd expect to be able to specify how many decimal digits will be written - as I understand the current approach, it lets you opt into an f32 or an f64, but nowhere in between.

Instead, if what we want is to truncate serialized output, then maybe some API that leverages https://docs.rs/serde_json/latest/serde_json/ser/trait.Formatter.html is a better way to go.

Some related (not really resolved) discussion is here: https://github.com/serde-rs/json/issues/562#issuecomment-653841404

Robinlovelace commented 10 months ago

I'm going forward with the assumption that that's the problem we're trying to solve: How to voluntarily lose precision so as to have a smaller serializable size.

That's a good way of putting it: it's about reducing the size of the .geojson files to speed up page load times etc. Truncating serialized output sounds good to me!

TurtIeSocks commented 10 months ago

I agree that truncating the # of digits is the main goal to reduce http sizes for clients.

My only argument for keeping #234 in favor of a PR that simply truncates the precision is that parts of it did feel "natural" since the rest of the rustgeo libs have the same generic precision option.

Either way, I won't be offended if we close it in favor of one that simply truncates the precision, it did end up being a bit more complex than I originally figured it would be, though that might be just because of my experience with Rust. @michaelkirk