Closed randommm closed 2 months ago
Thank you for taking this on! A few thoughts on what is here so far
okay, all done
You may have to skip some of the compile tests on Windows for Rust, since we never unload libraries (#111) and Windows locks the file of loaded libraries
i see, what about Mac? it seems to give some weird error that is hard to understand, seems to be C++ toolchain related.
I’ll try to take a look this week - my guess is it has to do with the specific CI environment installing a non-native LLVM
there also seems to be some toolchain problem on Windows, lack of mingw, so it's unable to compile the example.
We could just have a global Mutex that protects the compilation, but that would still fail if multiple independent processes try to compile models in parallel? (@WardBrian Is this something that has been looked into on the stan side?)
This is an unfortunate consequence of Stan really only supporting in-tree builds. This has been frustrating for other reasons, I don't think this specific one has come up though.
The right thing to do if concurrent builds are requested is to do them all in the same call to make
and let make
's jobserver do it, but that seems very difficult to expose in a meaningful way. I guess the compile function could take a list of filenames, but that also seems cumbersome.
- I guess this is why you had to disable the test, or disable testing in parallel
Not exactly, it was just that i needed to delete the .so file before asking the function to compile the model, otherwise the function would skip compilation as the file already exists.
The problem then arises because other tests expect the .so file to exist in the meantime (they don't try to compile it first), so they will fail if they started after the file was deleted.
However, i think that if two processes tried to compile at the same time, i imagine that at least on Linux these wouldn't be a problem (on Windows, idk, conflict of two process opening the same file?).
We could just have a global Mutex that protects the compilation, but that would still fail if multiple independent processes try to compile models in parallel? (@WardBrian Is this something that has been looked into on the stan side?)
This is an unfortunate consequence of Stan really only supporting in-tree builds. This has been frustrating for other reasons, I don't think this specific one has come up though.
The right thing to do if concurrent builds are requested is to do them all in the same call to
make
and letmake
's jobserver do it, but that seems very difficult to expose in a meaningful way. I guess the compile function could take a list of filenames, but that also seems cumbersome.
so right now, if two processes call make at the same time, bad things happen? I haven't tried but imagined that it wouldn't be bad on Linux
I think it would be fine after the initial build? But if nothing has been successfully built yet I think you could definitely get a filesystem race on things like main.o
that are built once and then re-used
Maybe we also want to check the downloaded stan version somehow? If we tie it to a specific release we could just use a checksum I guess? Although I'm actually not sure how we'd best do that given that the file we want to download contains the repo, which would have to include the checksum...
Check commit fab21a7c2a6b0355 for a possible solution. The disadvantage is that will be download an outdated version of bridgestan, so not sure if this can cause a chicken and egg issue in the future in case ABI changes edge cases.
The disadvantage is that will be download an outdated version of bridgestan, so not sure if this can cause a chicken and egg issue in the future in case ABI changes edge cases.
I think this would definitely cause problems. Even lacking a major version bump, this would break even during minor version bumps if the new feature required new functions in the C++ layer.
You could instead do what we do for Julia: We store the SHA of the current version of the source, before updating the Julia code itself in the release action https://github.com/roualdes/bridgestan/blob/b41f83b61571d4744ba5521a73d98a0c51fdd143/.github/workflows/release.yaml#L61-L65
We could also start creating a tarball that is only the C++ (this is probably a good idea for several reasons) which would make this chicken-egg problem easier
finally managed to fix the build on macos now!
@WardBrian @aseyboldt made the API a little more clear now regarding that bridgestan is being download, and also easier to point to a custom bridgestan dir. I think that the checksum and the lock dir would be better fit as others issues/PRs.
Implement BridgeStan download and module compilation on Rust.
Fixes issue #100