bschwind / opencascade-rs

Rust bindings to the OpenCascade CAD Kernel
GNU Lesser General Public License v2.1
118 stars 22 forks source link

Extract `occt-sys` crate (repository) out of `opencascade-sys` #86

Closed strohel closed 1 year ago

strohel commented 1 year ago

The opencascade-sys crate currently does basically 2 things:

  1. compiles vanilla OpenCascade C++ (OCCT) using cmake
  2. builds the Rust binding using cxx magic (has the Rust and C++ generated parts)

Step 1 takes 30 minutes on CI and 5 minutes on my laptop, step 2 takes less than a minute on CI and seconds on my laptop.

Whenever the bindings (Rust or .hpp file) change, whole opencascade-sys needs to be rebuilt. OpenCascade cmake build does support incremental compilation (cmake is super quick if you run in the the same build directory second time), but cargo likes to change the build directory (target/debug/build/occt-sys-<hash>), so we end up recompiling OpenCascade more often than not.

Resolve this by extracting part 1. into a separate crate, named occt-sys (OpenCascade Technology, after upstream repo name). occt-sys does the static build of OpenCascade and nothing extra: no bindings, doesn't even tell cargo to link its libraries.

~This PR goes even one step further: it extracts occt-sys into its own repository (not just another workspace crate).~ ~The second part of this PR is therefore in https://github.com/tonarino/occt-sys/pull/1 (let's keep the high-level discussion here). ~

~This part is up for discussion. Rationale:~

bschwind commented 1 year ago

Is the separate repository needed? It feels a little heavy-handed to be honest. If it's not too difficult I'd like to keep everything together in one repo.

bschwind commented 1 year ago

To be clear I'm all in favor of reducing build times and having the OCCT crate be separate and possibly versioned according to the underlying C++ project. I just don't know what's driving the decision to have a separate repo, and I don't know if it's easy or hard to accomplish what you want to do if we keep all the code in one repo.

tuna-f1sh commented 1 year ago

I guess the main advantage of it being a separate repo rather than another sub-crate is the CI caching? Otherwise the other benefits could be obtained by separating opencascade-sys into occt-sys in this workspace?

Initially I thought that the separate repo would add an extra overhead to adding API (of which a fair amount is missing so is in flux) but I see that is not the case, since the bindings remain in opencascade-sys. Also, we can still add the optional feature of using a local dynamically linked OpenCASCADE #26 - something I updated recently because I was also growing tired of the build time!

The versioning of occt-sys as opposed to the current sub-module does also feel cleaner and clearer which OpenCASCADE release is being used.

bschwind commented 1 year ago

Can we accomplish the same caching by keeping the crate in this repo, but not in the cargo workspace? I suppose that would feel a little weird too.

If it's a huge pain to keep it in the same repo then ultimately I'm okay with splitting it out, I just thought it would be nice to keep all code/issues/PRs together in one place.

strohel commented 1 year ago

Yeah the main reason for separating into own repo is the CI caching. https://github.com/Swatinem/rust-cache specifically disables caching of workspace dependencies [1], and defends the decision in https://github.com/Swatinem/rust-cache/issues/37#issuecomment-944697938

Agreed that a separate repo is suboptimal, but I said to myself that occt-sys is so minimal that I don't expect changing it often (particularly we don't have to touch if even if we change the bindings or change the OCC libraries that we link), and there's the little bonus point of no longer having any git submodules here in opencascade-sys.

But thinking about it, I wonder whether linking the system opencascade dynamically (#87) isn't a better solution for CI build times. ubuntu-latest in GitHub is 22.04, which has opencascade 7.5.1: https://packages.ubuntu.com/source/jammy/opencascade. This would not solve MacOS CI build though (but perhaps we can make it non-blocker for merging?)

In this case we could still do the occt-sys separation, but have it as another workspace package in the repo. That would still help local development iterations.

I could as well look into doing manual caching for occt-sys specifically, but that feels a bit weird: we'd have to partially duplicate work of Swatinem/rust-cache (deriving cache keys, cleaning files that don't have to be cached, ensure the cache doesn't balloon over time...)

Yet another option is have occt-sys in the opencascade-rs repo, but not in the workspace and but specify dependency on in through git = "https://github.com/bschwind/opencascade-rs.git", branch = "main". But this would be least preferred variant for me: it is very confusing, doesn't behave in the expected manner w.r.t. to branches, forks, ...

[1] They talk about workspace dependencies, but according to my tests, they exclude all path dependencies. I tested putting opencascade-sys (at the time) out of workspace and it still wasn't cached.

bschwind commented 1 year ago

Could we potentially do the first part ("extracting part 1. into a separate crate, named occt-sys") but continue using the generic github actions cache instead of swatinem's rust-cache?

If we don't imagine occt-sys changing much over time (which it sounds like it won't), then maybe it'll be fairly easy to cache it with the regular github actions cache.

strohel commented 1 year ago

Yeah, let me keep occt-sys in this repo, and figure out CI caching later.

continue using the generic github actions cache instead of swatinem's rust-cache?

The current naive caching does not really work (even if you re-run the very same job right after, it still takes 30 minutes): the keys conflict between the clippy and test jobs, there is no Cargo.lock after checkout, so the key collapses to e.g. Linux-cargo-registry- (basically a single global key), the cargo index path seems to be off, ...

If we don't imagine occt-sys changing much over time (which it sounds like it won't), then maybe it'll be fairly easy to cache it with the regular github actions cache.

Probably not that bad, but there is some work that Swatinem/rust-cache does that we'd have to replicate (as said above)

strohel commented 1 year ago

I've pushed a version that keeps occt-sys in the repo (even in the workspace). It doesn't help the CI (yet), but it helps local development if you touch wrapper.hpp or lib.rs in opencascade-sys: build times are 10-second with this change.

I kept the "naive GitHub cache" -> Swatinem/rust-cache CI change here. It should be a non-controversial improvement (previous cache didn't work at all, now it works at least for external dependencies). But I can extract it to a separate PR if needed.