CEED / libCEED

CEED Library: Code for Efficient Extensible Discretizations
https://libceed.org
BSD 2-Clause "Simplified" License

Rust: Creation of QFunctions from C source #1677

Open eliasboegel opened 2 months ago

eliasboegel commented 2 months ago

Hi,

In #1621, it was mentioned that functionality to create QFunctions from C source through the Rust bindings would be reasonably straightforward to implement. This would allow usage of the JIT backends when using the Rust bindings from what I can tell. (With this type of QFunction creation, is it still possible to run non-JIT on CPU?)

First, I would like to hear some general opinions on including this as a feature next to QFunctions from the gallery and QFunctions from closures.

Secondly, given that this is probably not high on your priority list, I'm happy to contribute to this (with some guidance) if that would be welcome/helpful.

jeremylt commented 2 months ago

I think this would be a good addition. Beyond deciding on small interface details, I think the bulk of the work would be 1) making Rust compile a standalone C file, 2) passing the function pointer and file path back to the libCEED C interface, and 3) deciding how to handle context data.
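For step 2, a rough sketch of how the bindings might assemble the source argument. It assumes the `path:function_name` form that the C interface's `CeedQFunctionCreateInterior` takes for its `source` string; `qfunction_source_locator` is a hypothetical helper name, not an existing libCEED function.

```rust
use std::ffi::CString;
use std::path::Path;

/// Build the `path:function_name` string (assumed format) as a C string.
/// Returns `None` if the path is not valid UTF-8 or the result contains a NUL byte.
pub fn qfunction_source_locator(path: &Path, name: &str) -> Option<CString> {
    let path = path.to_str()?;
    CString::new(format!("{path}:{name}")).ok()
}
```

The resulting `CString` could be passed via `as_ptr()` alongside the host function pointer, so the same call serves both the CPU and the JiT backends.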

eliasboegel commented 2 months ago

Good to hear!

Questions on your points:

  1. This would be necessary only if one wants to run a C source QFunction on a non-JIT backend, correct?
  2. I am not familiar with how libCEED in general handles QFunctions on the C side. Do libCEED's JIT backends care whether the C source is supplied as a file in its final form at compile time, or whether it is supplied as just a string at runtime? Context: I'm interested in doing some specialization of constants of the C source at runtime for the JIT backends when creating an operator.

jeremylt commented 2 months ago

1) I think it would make the most sense for the interface to take both the function pointer and the path so it's easy to swap back and forth between backends. We could even internally create the function pointer from the path.

2) The GPU backends want a path to a file they can open and read, such as https://github.com/CEED/libCEED/blob/main/include/ceed/jit-source/gallery/ceed-poisson3dapply.h (the backends already do compile-time specification of a lot of the pieces). What did you want to specialize upon in the source file? We usually use the context data (automatically captured by the closure currently) to specialize on details in the user physics.

eliasboegel commented 2 months ago

  1. I agree that it should take both. My question was more about which compilation paths are used in which cases.
  2. I was thinking mainly about the number of quadrature points, as well as the number of components, since those may not be known at compile time. I see that e.g. in VectorMassApply, num_components is simply a fixed constant, but Q is a parameter. I suppose parameters can work for this. Another use I was thinking of was implementing an operator for projecting an arbitrary function onto an FE space that takes in a C code snippet, essentially to set user-defined initial conditions as a function of the mesh coordinate. I don't see any other way than inserting a string of C code into a QFunction skeleton if I want to use the libCEED infrastructure to do the projection.

jeremylt commented 2 months ago

The JiT backends know Q and that is a compile time constant for them. The number of components is compile time constant from the basis.

I'm not sure I follow why you wouldn't be able to write the functions you would possibly want to use ahead of time? If it's not possible to do that for some reason, you could make a Rust function to write brand new QFunction source files I suppose?

jrwrigh commented 2 months ago

I see that e.g. in VectorMassApply, num_components is simply a fixed constant, but Q is a parameter.

May not be helpful, but the way we get around this in HONEE and the fluids example is to have a QFunction helper that's flexible and then have a series of QFunctions that set the number of components as a compile-time constant. Then we have a function that will return the appropriate QFunction for the number of components desired.
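A rough Rust analogue of that pattern, using const generics in place of the C helper plus per-size wrappers. This is only a sketch of the idea, not the HONEE code: `vector_mass`, `vector_mass_for`, the supported sizes, and the component-major data layout are all assumptions made for illustration.

```rust
/// Common signature shared by every specialization: (Q, weights, input, output).
pub type Kernel = fn(usize, &[f64], &[f64], &mut [f64]);

/// Flexible helper: the number of components N is a compile-time constant.
/// Data is assumed to be laid out component-major, i.e. index = i + c * Q.
fn vector_mass<const N: usize>(q: usize, w: &[f64], u: &[f64], v: &mut [f64]) {
    for c in 0..N {
        for i in 0..q {
            v[i + c * q] = w[i] * u[i + c * q];
        }
    }
}

/// Return the specialization matching the requested number of components,
/// or `None` if no specialization was instantiated for that size.
pub fn vector_mass_for(num_comp: usize) -> Option<Kernel> {
    match num_comp {
        1 => Some(vector_mass::<1> as Kernel),
        3 => Some(vector_mass::<3> as Kernel),
        5 => Some(vector_mass::<5> as Kernel),
        _ => None,
    }
}
```

Each arm instantiates one monomorphized copy of the helper, so the inner loop bound is known to the compiler while the caller still picks the size at runtime.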

eliasboegel commented 2 months ago

The JiT backends know Q and that is a compile time constant for them. The number of components is compile time constant from the basis.

You're right, in hindsight I don't see a problem for these kinds of constants. My concern came primarily from the way it is done with the Rust closures.

I'm not sure I follow why you wouldn't be able to write the functions you would possibly want to use ahead of time? If it's not possible to do that for some reason, you could make a Rust function to write brand new QFunction source files I suppose?

The only (but realistic) example I have at this time would be specifying a function f(x,y,z) to project on the finite element space as part of a configuration file that is read at runtime. Without touching the QFunction source at runtime in one way or another I don't see a way to inject a custom function at runtime. Emitting or modifying a C source file at runtime would be totally fine for that too - I was just curious as to whether the libCEED backends normally expect a file or just a string that holds the C source code.
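As a sketch of the "emit a C source file at runtime" route: the skeleton below is illustrative only. It assumes the usual `CEED_QFUNCTION` signature, a header `ceed/types.h` providing the types, and coordinates stored component-major in `in[0]`. The function names are hypothetical, and the expression is inserted verbatim, so it has to come from a trusted configuration file.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Render C source for a QFunction that evaluates `expr` (a C expression in
/// x, y, z) at every quadrature point. No validation of `expr` is performed.
pub fn render_projection_qfunction(name: &str, expr: &str) -> String {
    format!(
        r#"#include <ceed/types.h>

CEED_QFUNCTION({name})(void *ctx, const CeedInt Q, const CeedScalar *const *in, CeedScalar *const *out) {{
  const CeedScalar *coords = in[0];
  CeedScalar       *v      = out[0];
  for (CeedInt i = 0; i < Q; i++) {{
    const CeedScalar x = coords[i + 0 * Q], y = coords[i + 1 * Q], z = coords[i + 2 * Q];
    v[i] = {expr};
  }}
  return 0;
}}
"#
    )
}

/// Write the rendered source to `<dir>/<name>.h` and return the file path,
/// which is what a file-based backend would then be pointed at.
pub fn write_qfunction_source(dir: &Path, name: &str, expr: &str) -> io::Result<PathBuf> {
    let path = dir.join(format!("{name}.h"));
    fs::write(&path, render_projection_qfunction(name, expr))?;
    Ok(path)
}
```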

May not be helpful, but the way we get around this in HONEE and the fluids example is to have a QFunction helper that's flexible and then have a series of QFunctions that set the number of components as a compile-time constant. Then we have a function that will return the appropriate QFunction for the number of components desired.

Interesting to see, thanks for this!

jeremylt commented 2 months ago

Maybe something along the lines of

    pub fn create_from_c_src(
        ceed: &crate::Ceed,
        vlength: usize,
        // Borrowed, since `Path` is unsized and cannot be passed by value
        path: &std::path::Path,
        user_f: fn(*mut ::std::os::raw::c_void, bind_ceed::CeedInt, *const *const bind_ceed::CeedScalar, *const *mut bind_ceed::CeedScalar) -> ::std::os::raw::c_int,
    ) -> crate::Result<Self> {
        // ...
    }

It looks like Rust would really prefer we compile ahead of time. Maybe the cc crate could handle compilation?

eliasboegel commented 2 months ago

cc should be adequate to compile each of the files containing the QFunction, though I'm wondering about the following:

  1. When using the C interface, are /cpu/self backends also compiled ahead of time? If they are not, then I would be very much in favor of finding a way to also compile JiT when using the Rust interface, as otherwise some backends would not support e.g. the use case I mentioned where insertion of a custom line of code into the QFunction is necessary.
  2. Assume the header file containing the QFunction has been successfully compiled, what does the pipeline to obtain a function pointer to it look like on the C side (regardless of whether it was compiled AoT or JiT)? Could you give a brief overview? I'm less familiar with this side and have personally not worked with such a mechanism before.

jeremylt commented 2 months ago

The host QFunction function pointers are compiled ahead of time for every backend. We don't have a way for CPU backends to do JiT right now. If you want to only use JiT functionality, you'll lose all testing on CPU backends.

I don't know what you mean by 'pipeline'? The standard C interface for creating a QFunction accepts a host function pointer. Is that what you're asking?

jedbrown commented 2 months ago

For C (and CPU), the header is included in a corresponding *.c source file that does its registration and field specification. I think the question is how to move that code to Rust. For that, cc would be used in a build.rs. If we still want the qfunction source to live in a *.h, then we need a compilation unit for cc to use. Those compilation unit bodies (basically just include the matching header) could be generated or regular (if trivial) files.
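A minimal sketch of generating such a compilation unit from a build script. The function names are hypothetical. The exported wrapper is an assumption of this sketch: it is only needed if `CEED_QFUNCTION` gives the function internal linkage, and it relies on the header itself pulling in the libCEED types.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Text of a trivial C compilation unit for one QFunction header.
pub fn compilation_unit_body(header: &Path, qf_name: &str) -> String {
    format!(
        r#"#include "{header}"

/* Hypothetical wrapper with external linkage so Rust can link against it. */
int {qf_name}_export(void *ctx, CeedInt Q, const CeedScalar *const *in, CeedScalar *const *out) {{
  return {qf_name}(ctx, Q, in, out);
}}
"#,
        header = header.display()
    )
}

/// Write the unit to `<out_dir>/<qf_name>.c` and return its path.
pub fn write_compilation_unit(out_dir: &Path, header: &Path, qf_name: &str) -> io::Result<PathBuf> {
    let unit = out_dir.join(format!("{qf_name}.c"));
    fs::write(&unit, compilation_unit_body(header, qf_name))?;
    Ok(unit)
}
```

In a `build.rs`, `out_dir` would typically come from the `OUT_DIR` environment variable, and the returned path could be handed to `cc::Build::new().file(...)` together with the libCEED include directory.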

eliasboegel commented 2 months ago

I don't know what you mean by 'pipeline'? The standard C interface for creating a QFunction accepts a host function pointer. Is that what you're asking?

I suppose my question does not really apply to AoT the way I thought it did; it is much simpler than I expected.


The original reason I was interested in creating QFunctions from C source was to allow usage of the GPU or, more generally, JiT backends, since the /cpu/self backends already work through closures. I'm wondering whether from_c_src QFunction creation needs to support the /cpu/self backends in the first place. Please correct me if I'm wrong here, but my understanding is the following:

If both JiT and non-JiT backends should be supported, then the user is required to:

  1. Use cc to compile their QFunction in their build.rs
  2. Cast the relevant function pointer from the extern declaration to a c_void before handing it off to libCEED.

whereas supporting only JiT backends would only require passing the path and would not require dealing with compiling C files as part of the Rust build process.

Would it make sense to cut some of the complexity from the initial implementation and support only JiT backends to close the capability gap and leave support of /cpu/self backends for later? What are your opinions on this?

jeremylt commented 2 months ago

I'd really, really rather not skip on CPU features.

Cast the relevant function pointer from the extern declaration to a c_void before handing it off to libCEED.

I'm not sure exactly what this part means? There's a single allowable function signature that the user can write and we should be able to specify it?

jedbrown commented 2 months ago

I think build.rs is a better solution and user experience, though it would be possible for /cpu/self to invoke a regular compiler (create a temporary file, compile into a shared library, dlopen() and dlsym() to get the function pointer). That's a crude form of JIT and besides the file system churn, it requires locating the "correct" compiler (which must be present on the target, not just on the host).
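To make the first half of that crude JIT concrete, here is a sketch of the compile step only. It assumes a GCC- or Clang-style driver that accepts `-shared` and `-fPIC`; the function name is hypothetical, and loading the result with dlopen() and dlsym() is not shown.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) the command that compiles one C source file into a
/// shared library. The caller decides which compiler to use, which is exactly
/// the "locate the correct compiler" problem mentioned above.
pub fn shared_library_command(
    compiler: &str,
    source: &Path,
    include_dir: &Path,
    output: &Path,
) -> Command {
    let mut cmd = Command::new(compiler);
    cmd.arg("-shared")
        .arg("-fPIC")
        .arg("-I")
        .arg(include_dir)
        .arg("-o")
        .arg(output)
        .arg(source);
    cmd
}
```

Calling `.status()` on the returned command would run the compiler; a non-zero exit status or a missing executable is the failure mode that an ahead-of-time build in `build.rs` avoids on the target machine.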

eliasboegel commented 2 months ago

I'd really, really rather not skip on CPU features.

Fair enough!

Cast the relevant function pointer from the extern declaration to a c_void before handing it off to libCEED.

I'm not sure exactly what this part means? There's a single allowable function signature that the user can write and we should be able to specify it?

Looking at your proposed interface, the user needs to pass a fn(*mut ::std::os::raw::c_void, bind_ceed::CeedInt, *const *const bind_ceed::CeedScalar, *const *mut bind_ceed::CeedScalar) -> ::std::os::raw::c_int, but this means that it is up to the user to actually declare the QFunction as this type in the extern block and to set up the build script to handle compilation of the QFunction files. Part of that is including libceed-sys for the types, as well as informing cc about ceed.h, which seems difficult when the system feature is not used when building the bindings. I think that makes for a lot of things to get wrong, but I also don't have good ideas right now for moving more of these responsibilities from the user application into libCEED.

jedbrown commented 2 months ago

If build.rs is compiling the C code, it can also be declaring compatible extern functions on the Rust side. It could use bindgen for that, though it's overkill when there is only one signature that matters.

jeremylt commented 2 months ago

I'm rusty on my Rust, but can we typedef that fn type?

eliasboegel commented 2 months ago

I'm rusty on my Rust, but can we typedef that fn type?

A type alias could work for just the type: type some_name = fn(*mut ::std::os::raw::c_void, bind_ceed::CeedInt, *const *const bind_ceed::CeedScalar, *const *mut bind_ceed::CeedScalar) -> ::std::os::raw::c_int;

Though this is not particularly helpful for the user since the user needs to write a signature that includes the function name and argument names (argument names may just be _):

extern "C" {
    fn this_is_a_qf(_: *mut std::os::raw::c_void, _: bind_ceed::CeedInt, _: *const *const bind_ceed::CeedScalar, _: *const *mut bind_ceed::CeedScalar) -> std::os::raw::c_int;
}

I suppose that it could possibly be done with a macro.
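A small sketch of how a single alias could be used on the Rust side. Several things here are assumptions: `CeedInt` and `CeedScalar` are local stand-ins for the `bind_ceed` types, the alias is written as `unsafe extern "C" fn` because the function would come from C, and `scale_by_two` is a Rust stand-in for a compiled C QFunction so that the example runs without a C toolchain.

```rust
use std::os::raw::{c_int, c_void};

// Stand-ins for bind_ceed::CeedInt and bind_ceed::CeedScalar (assumed widths).
pub type CeedInt = i32;
pub type CeedScalar = f64;

/// The one signature every user QFunction has to match.
pub type QFunctionUser = unsafe extern "C" fn(
    *mut c_void,
    CeedInt,
    *const *const CeedScalar,
    *const *mut CeedScalar,
) -> c_int;

/// Stand-in for a C QFunction: out[0][i] = 2 * in[0][i].
pub unsafe extern "C" fn scale_by_two(
    _ctx: *mut c_void,
    q: CeedInt,
    input: *const *const CeedScalar,
    output: *const *mut CeedScalar,
) -> c_int {
    unsafe {
        let u = std::slice::from_raw_parts(*input, q as usize);
        let v = std::slice::from_raw_parts_mut(*output, q as usize);
        for (vi, ui) in v.iter_mut().zip(u) {
            *vi = 2.0 * ui;
        }
    }
    0
}

/// Call any function of that type with one input and one output array.
pub fn apply_single(f: QFunctionUser, u: &[CeedScalar], v: &mut [CeedScalar]) -> c_int {
    assert_eq!(u.len(), v.len());
    let inputs = [u.as_ptr()];
    let outputs = [v.as_mut_ptr()];
    // Safety: both arrays hold one valid pointer to `u.len()` scalars.
    unsafe { f(std::ptr::null_mut(), u.len() as CeedInt, inputs.as_ptr(), outputs.as_ptr()) }
}
```

The alias keeps the interface readable, but as noted above it does not spare the user from writing the extern declaration for a real C function.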

eliasboegel commented 2 months ago

Perhaps the larger problem is informing cc about the include directory for libCEED such that ceed.h is accessible in the QFunction source file. That seems especially difficult (or impossible?) if libCEED was built by the bindings. Essentially, cc would need to be supplied in the user's build.rs with a path to target/{profile}/build/libceed-sys-{hash}/out/include/ but I don't know any way to automatically retrieve the right hash, never mind a way that is simple enough that a user can be reasonably expected to figure it out. I saw in the libceed-sys readme that you've had a similar problem in the past?

jedbrown commented 2 months ago

There could be a libceed-builder crate (to be used by build.rs) that would automate locating and building such things. The README comment is about shared library paths if you disable the static feature. If you're having Cargo build libCEED, there isn't much reason to disable static.

Note that the general situation is similar to DEP_Z_INCLUDE in this example; see also RFC 3028 bindeps.
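A sketch of how a user's build.rs could pick up the include directory through that mechanism. It assumes, hypothetically, that libceed-sys declared `links = "ceed"` and had its build script print `cargo:include=<dir>`; the thread does not say whether it does. Cargo would then expose the value to the build scripts of direct dependents as `DEP_CEED_INCLUDE`.

```rust
use std::env;
use std::path::PathBuf;

/// Name of the variable Cargo sets for the `include` key of a `links` package:
/// DEP_<NAME>_INCLUDE, with the name upper-cased and dashes turned into underscores.
pub fn dep_include_var(links_name: &str) -> String {
    format!("DEP_{}_INCLUDE", links_name.to_uppercase().replace('-', "_"))
}

/// Include directory exported by the `links` package, if the variable is set.
/// Only meaningful inside the build script of a direct dependent.
pub fn dep_include_dir(links_name: &str) -> Option<PathBuf> {
    env::var_os(dep_include_var(links_name)).map(PathBuf::from)
}
```

With this in place there would be no need to guess the hashed `target/.../out/include/` path: `dep_include_dir("ceed")` would return it, and it could be passed to `cc::Build::include`.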