roualdes / bridgestan

BridgeStan provides efficient in-memory access through Python, Julia, and R to the methods of a Stan model.
https://roualdes.github.io/bridgestan
BSD 3-Clause "New" or "Revised" License
87 stars 12 forks source link

implement BridgeStan download and module compilation on Rust #212

Closed randommm closed 2 months ago

randommm commented 4 months ago

Implement BridgeStan download and module compilation on Rust.

Fixes issue #100

randommm commented 4 months ago

Thank you for taking this on! A few thoughts on what is here so far

okay, all done

WardBrian commented 4 months ago

You may have to skip some of the compile tests on Windows for Rust, since we never unload libraries (#111) and Windows locks the file of loaded libraries

randommm commented 4 months ago

i see, what about Mac? it seems to give some weird error that is hard to understand, seems to be C++ toolchain related.

WardBrian commented 4 months ago

I’ll try to take a look this week - my guess is it has to do with the specific CI environment installing a non-native LLVM

randommm commented 4 months ago

there also seems to be some toolchain problem on Windows, lack of mingw, so it's unable to compile the example.

WardBrian commented 4 months ago

We could just have a global Mutex that protects the compilation, but that would still fail if multiple independent processes try to compile models in parallel? (@WardBrian Is this something that has been looked into on the stan side?)

This is an unfortunate consequence of Stan really only supporting in-tree builds. This has been frustrating for other reasons, I don't think this specific one has come up though.

The right thing to do if concurrent builds are requested is to do them all in the same call to make and let make's jobserver do it, but that seems very difficult to expose in a meaningful way. I guess the compile function could take a list of filenames, but that also seems cumbersome.

randommm commented 4 months ago
  • I guess this is why you had to disable the test, or disable testing in parallel

Not exactly, it was just that i needed to delete the .so file before asking the function to compile the model, otherwise the function would skip compilation as the file already exists.

The problem then arises because other tests expect the .so file to exist in the meantime (they don't try to compile it first), so they will fail if they started after the file was deleted.

However, i think that if two processes tried to compile at the same time, i imagine that at least on Linux these wouldn't be a problem (on Windows, idk, conflict of two process opening the same file?).

randommm commented 4 months ago

We could just have a global Mutex that protects the compilation, but that would still fail if multiple independent processes try to compile models in parallel? (@WardBrian Is this something that has been looked into on the stan side?)

This is an unfortunate consequence of Stan really only supporting in-tree builds. This has been frustrating for other reasons, I don't think this specific one has come up though.

The right thing to do if concurrent builds are requested is to do them all in the same call to make and let make's jobserver do it, but that seems very difficult to expose in a meaningful way. I guess the compile function could take a list of filenames, but that also seems cumbersome.

so right now, if two processes call make at the same time, bad things happen? I haven't tried but imagined that it wouldn't be bad on Linux

WardBrian commented 4 months ago

I think it would be fine after the initial build? But if nothing has been successfully built yet I think you could definitely get a filesystem race on things like main.o that are built once and then re-used

randommm commented 4 months ago

Maybe we also want to check the downloaded stan version somehow? If we tie it to a specific release we could just use a checksum I guess? Although I'm actually not sure how we'd best do that given that the file we want to download contains the repo, which would have to include the checksum...

Check commit fab21a7c2a6b0355 for a possible solution. The disadvantage is that will be download an outdated version of bridgestan, so not sure if this can cause a chicken and egg issue in the future in case ABI changes edge cases.

WardBrian commented 4 months ago

The disadvantage is that will be download an outdated version of bridgestan, so not sure if this can cause a chicken and egg issue in the future in case ABI changes edge cases.

I think this would definitely cause problems. Even lacking a major version bump, this would break even during minor version bumps if the new feature required new functions in the C++ layer.

You could instead do what we do for Julia: We store the SHA of the current version of the source, before updating the Julia code itself in the release action https://github.com/roualdes/bridgestan/blob/b41f83b61571d4744ba5521a73d98a0c51fdd143/.github/workflows/release.yaml#L61-L65

We could also start creating a tarball that is only the C++ (this is probably a good idea for several reasons) which would make this chicken-egg problem easier

randommm commented 4 months ago

finally managed to fix the build on macos now!

randommm commented 4 months ago

@WardBrian @aseyboldt made the API a little more clear now regarding that bridgestan is being download, and also easier to point to a custom bridgestan dir. I think that the checksum and the lock dir would be better fit as others issues/PRs.