Open tikk3r opened 1 month ago
A Rubbl code change would be needed to integrate the C++ casacore code that adds Dysco support. I didn't include this code in the recent update because it would require a dependency on the external Dysco library, which I wanted to avoid. But I think that such a dependency could be added pretty easily as a Cargo feature, if someone like you had interest in it.
Would you potentially be interested in putting in a PR to add this? I can walk you through the pieces that I think would be needed to get this working; I don't expect that it would require any substantial new code to be written.
I'd have to preface it with that I have no C++ programming experience (coming from just building our software stacks and programming in Python for my day to day stuff otherwise) and am only a novice at Rust still, but I'd be happy to take a stab at it if you think with some walking through it's manageable.
Alright, I fiddled around a bit by looking how the current casacore code is included and handled in build.rs
, and copying over the relevant casacore source files related to dysco into the casacore that comes with rubbl_casatabels
. Using the example in rubbl-rxpackage
I seem to have something working to first order now with a Dysco compressed set:
(pyenv-py3) RUSTFLAGS="-lgsl -lcblas" cargo run
Compiling rubbl-rxpackage v0.0.0-dev.0 (/data1/tikk3r/rust_ms/rubbl-rxpackage)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 4.59s
warning: the following packages contain code that will be rejected by a future version of Rust: nom v5.1.2
note: to see what the problems were, use the option `--future-incompat-report`, or run `cargo report future-incompatibilities --id 1`
Running `target/debug/rubbl-rxpackage`
[src/main.rs:38:9] main_data = [[Complex {
re: 139.75493,
im: 26.690851,
}, Complex {
re: -44.5641,
im: -44.5641,
}, Complex {
re: -33.135002,
im: -18.953083,
}, Complex {
re: 112.04767,
im: 21.399227,
}],
[Complex {
re: 151.26367,
im: 44.43459,
}, Complex {
re: -14.967233,
im: -36.277115,
}, Complex {
re: -47.003956,
im: -33.466644,
}, Complex {
re: 121.267784,
im: 35.623123,
}],
I'm having a bit of trouble getting cargo to pick up on the BLAS and GSL libraries, which the RUSTFLAGS
can fix ad-hoc, but that doesn't sound like a long-term solution to me. I can fork and open a PR with what I have so far however, and go from there.
Great! That's probably half the battle right there.
The proper way to link with system libraries like BLAS or GSL is also through the Cargo build system. Various printouts in the build.rs
script can tell Cargo how to link with such libraries; and for many system libraries there exist -sys
crates on Crates.io that encapsulate the necessary steps for you. I see a blas-sys on Crates.io as well as a GSL-sys, which may suffice. If for some reason those are insufficient, Crates like pkg-config can be used in the build.rs
script to discover external dependencies and emit the proper linkage information.
I am a little curious about where these dependencies are coming from, though. Does Dysco support require linking with the separate libdyscostman
shared library as well? Or is it maybe loaded at runtime with dlopen()
? Either way, transitive dependencies like GSL or BLAS should be automatically pulled in if the Dysco shared library has proper metadata. There should only be a need to link the Rubbl crates with GSL or BLAS if the new C++ code introduces direct dependencies on those libraries.
Once that's all straightened out, the final step would be to make all of this support conditional by use of a Cargo feature. That should be a fairly straightforward bit of plumbing.
Cool. I have opened a draft PR #413 to make it more tractable and easier to follow along (hopefully). I'll have another go at getting Cargo to find the dependencies on its own, and check out those -sys crates for guidance.
I am a little curious about where these dependencies are coming from, though. Does Dysco support require linking with the separate libdyscostman shared library as well? Or is it maybe loaded at runtime with dlopen()? Either way, transitive dependencies like GSL or BLAS should be automatically pulled in if the Dysco shared library has proper metadata. There should only be a need to link the Rubbl crates with GSL or BLAS if the new C++ code introduces direct dependencies on those libraries.
I don't think Dysco is explicitely linked against, as it used to be separate and in my LOFAR containers still I build casacore with -DBUILD_DYSCO=OFF
and then separately build and install Dysco. I don't see an explicit dlopen()
in casacore from a quick search, but I guess it must be dynamically loaded somehow then? Dysco does directly depend on GSL if I understand the terminology there correctly, in the CMake files it searches for GSL and the code uses it explicitely. I'm not sure why it also struggled with BLAS, as that does not seem to be a direct dependency.
Awesome, thank you. Let's plan to shift the discussion to the PR thread.
First of all, great to see a crate enabling interaction with the monstrosity that is a MeasurementSet from Rust in the first place.
I might be overlooking something, but is there perhaps a way to interface with MeasurementSets that are compressed with dysco? LOFAR datasets use this by default. I have it built and on LD_LIBRARY_PATH or forced to load it with LD_PRELOAD, but alas it keeps running into
Is there a way I could get the
rubbl_casatables
to pick up on dysco, or would this require more effort?