nmandery / h3ron

Rust crates for the H3 geospatial indexing system
MIT License
83 stars 12 forks source link

Rewrite C dependency in Rust? #31

Closed LegNeato closed 1 year ago

LegNeato commented 2 years ago

Less of an issue, more questions:

Are there any plans to get rid of the C dependency? How much work do you think it would be? Cmake dep is a huge bummer It would also be nice to use const rather than all the generated tables.

Additionally, as you have the most relevant experience to answer this... the h3 api is very C-shaped with a lot of freestanding functions...if you were to write an idiomatic rust lib from the start, would the API be similar to h3ron or would you do things differently?

nmandery commented 2 years ago

There are no such plans currently, although it would be very helpful to not require cmake and it would also resolve the issues people having compiling for WASM (#21) and on blockchain platforms (#30).

The main reasons for sticking with the C dependency for me would be:

That being said, I would also be in favour of a rust-only crate. What may be worth exploring would be experimenting with tools like c2rust on how good the translation from C to rust is, and how much manual effort has to be put in when updating to newer h3 versions. Everything would be unsafe of course, but the current API wrappers can simply continue to exist. Those are a better fit for a rust-centered API than the single functions.

It also would be feasible to compile the C library using the cc crate by replicating a few build steps in the h3ron-h3-sys crate. That would remove the cmake requirement, but would be of little help for the WASM and blockchain issues. So it may not be worth it.

LegNeato commented 2 years ago

I'll have a lot of time on my hands to devote to open source soon and I'd love to contribute to a pure rust port. Would you want it as part of this project or would you rather me work elsewhere and then potentially combine once I have something to show?

Naively, the c code doesn't look like it would be hard to port or at least keep the C api for tests.

nmandery commented 2 years ago

Wow - sounds great. Just go ahead when you like and start working on a fork of this repo.

Somehow I like the idea of solution where the user of this crate can decide which implementation to use, the rust-port or the C library. That could be accomplished using a feature gate to switch between the h3ron-h3-sys crate and the rust port. The rust port could maybe be a part of the h3ron crate itself.

isaacbrodsky commented 2 years ago

cc'ing @dfellis to this discussion as we've had some discussions about possibly rewriting the C library in Rust. Our primary goal with H3 being written in C was the portability of binding the core library to other languages - JavaScript, Python, Java, etc. which we felt C was the best option for. If bindings could be made to a Rust core library from these other languages, which I think could be the case, it may obviate the need for the C core library.

dfellis commented 2 years ago

Agreed with @isaacbrodsky. A combination of DGGRID being C and Rust being much less mature 6-7 years ago is why we went with C.

We intentionally went with a memory ownership model where the callers of H3 provide the memory blocks to the library that then relinquish control back at the end to make integration with VMs simpler and roughly modeled that on Rust, so it has always been in the back of my mind.

I am confused by the statement that WASM support is hindered by the code being in C, however. Emscripten is designed with C as the first class citizen, and we only haven't done a WASM port due to personal time and API breakage with h3-js. It should be pretty straightforward to make a WASM port and re-expose the C functions for other WASM projects and minor tweaks to h3-js for JS exposure. (Worst case you can use WASM's ability to call JS code to bridge in h3-wasm (or h3-js right now).)

Where the code should live is a different discussion. I personally would like it in the H3 repo, but how that affects 4.0 is something to consider. It could be another barrier to upgrading if the code base is fully rewritten instead of what we have been doing. Perhaps it could live in a crate within the H3 repo and we write compatibility testing and fuzzing to confirm equivalent behavior and switch over in some 4.x release?

It would definitely be a lot easier to write the CLI executable in Rust, along with the advantages of a real dependency management system and the improved type safety.

nmandery commented 2 years ago

Things are evolving quickly here.

I for my part would be very much in favour of an official rust crate.

The concept of having H3 only work on pre-allocated memory blocks is quite good.

As C i pretty much the optimum for creating bindings to other languages, the situation in rust may be a bit worse, although certainly not bad:

@dfellis The WASM issue is caused that - at least what I found when doing a little search - the ABI created when compiling C to WASM appears to be incompatible to the ABI WASM emitted by rust expects. But I do not know many more details on this, there are a few links in #21. The way through JS is just less straight forward and also less comfortable, and brings in some additional communication overhead.

LegNeato commented 2 years ago

Pre-allocated blocks are good in Rust for no_std support, which would be desirable from the Rust side anyway. I've personally never written a no_std rust lib though, just used them.

kylebarron commented 2 years ago

I am confused by the statement that WASM support is hindered by the code being in C, however

I've recently been building a rust-wasm library, so I think I can add a little context here. The issue is that there are two separate WASM targets from rust. There's wasm32-unknown-emscripten and wasm32-unknown-unknown. The vast majority of the rust-wasm ecosystem, projects like wasm-bindgen, js-sys, and wasm-pack are built around wasm32-unknown-unknown. As I understand it, the wasm32-unknown-unknown target generally creates smaller bundles but does not have a default C toolchain. Generally any pure-rust project is able to compile to wasm32-unknown-unknown without changes. Meanwhile I haven't seen any easy-to-use tooling made to work with wasm32-unknown-emscripten.

There is a bit more context in https://github.com/rustwasm/team/issues/291, but it seems like the core maintainers barely have time to keep up with wasm-bindgen as it is.

Last time I tried, I got lots of compilation errors when trying to compile h3-sys to wasm32-unknown-unknown (e.g. with wasm-pack build). I think it worked out of the box with cargo build --target wasm32-unknown-emscripten, but the issue is that only builds the wasm bundle (technically a shared object in the target dir?). I don't know how you'd go from that to actually using it from JS.

nmandery commented 2 years ago

Thank you for this detailed writeup of the C + WASM situation @kylebarron

LegNeato commented 1 year ago

Ok, I haven't had time to try to port this "for real", but I was able to run c2rust on h3 and then manually tweaked it:

https://github.com/LegNeato/unsafe-h3lib

All C-based tests pass except for:

The following tests FAILED:
     91 - testH3Api_test91 (Failed)
     95 - testPolygonToCells_test95 (Failed)
     99 - testLatLng_test99 (Failed)

This is likely because I couldn't use f128 as rust has no built-in support and the f128 crate on crates.io only works on linux.

kylebarron commented 1 year ago

I think @isaacbrodsky may have started hacking on an actual rust port

LegNeato commented 1 year ago

Sweet!

nmandery commented 1 year ago

Interesting. @LegNeato Seems you had to do quite a few manual fixes to reach the working state, so c2rust may not be too great when the source needs to be updated periodically. In general, how would you rate the c2rust experience?

I am happy to hear the actual port of H3 to rust really appears to be in consideration.

LegNeato commented 1 year ago

It actually wasn't too bad to get it working, took me a couple of hours of actual work. It would have been quicker if my find and replace regex skills were better. The main changes / patterns were:

I also did a hacky change to cmake in the h3 project to call out to the Rust binaries so I could leverage the project's testing infra rather than rewrite them in Rust.

In hindsight, I shouldn't have bothered with any clippy or warning stuff and should have just stuck with changes that mattered so it would have been easier to forward-port. I actually thought it would be harder / c2rust would be worse so was just kinda exploring and then it worked 😄 .

I may take another crack at it tomorrow and see if I can't do it in such a way where it could be kept in lock-step with h3. That seems like a better use of time for a couple of hours work than rewriting in Rust right now (though I would love to contribute to that effort some day). Note that it would take more work to have the generated code work under the target wasm32-unknown-unknown as the generated code relies on libc (I think? it didn't immediately build but I haven't done wasm stuff before).

grim7reaper commented 1 year ago

Done 🙂

nmandery commented 1 year ago

Great work!

Looks very complete and wonderful it is already available on crates.io. I will put a link in the README of this repository here to your project. Your implementations solves many issues people are having :+1:

grim7reaper commented 1 year ago

Thanks!

Yup, in term of completeness I'm on par with the current H3 release (4.0). My C binding even pass their test suites.

And clearly, this will make integration in Rust projects smoother (it was one of the goal), while giving a nice perf boost.

I've also made an announcement on Reddit for the visibility, linking to an article that goes into more detail (for those who want to know more).

nmandery commented 1 year ago

Thanks for the link. The performance improvements described in the article look quite impressive.

I guess I will need to find some time to port some of my stuff to your library soon :wink:

grim7reaper commented 1 year ago

Haha thanks, I'm glad to hear that.

Don't hesitate to reach out if you need help (I tried to document it as best as I could, but external/fresh eyes are more than welcome), or have feedback on the API (rough edges, improvements, etc.)

LegNeato commented 1 year ago

@grim7reaper Looks amazing!

allixender commented 1 year ago

Exciting!

nmandery commented 1 year ago

I am closing this issue here now. I linked to h3o from the main README of this repository.