flux-framework / flux-sched

Fluxion Graph-based Scheduler
GNU Lesser General Public License v3.0

Interface review for resource API #1124

Open vsoch opened 5 months ago

vsoch commented 5 months ago

This issue reviews the interfaces exposed by the resource API, with the intention of understanding what needs to be exposed (and what does not). I'll tackle this from the standpoint of our primary use case (at least for now): the fluence plugin, which uses the Go bindings, and other TBA out-of-tree plugins that use fluxion.

Level 1: Go Modules

The entrypoint for a Go plugin using the Fluxion Go bindings would be these modules. I don't know the subtle distinction between the cli and module.

reapi_module

fluxmodule

reapi_module imports reapi_module.h, and uses these functions / structures from it:

reapi_cli

fluxcli

reapi_cli imports reapi_cli.h, and uses these functions / structures from it:

Level 2: C/C++ bindings

The Go modules above use the following header files and associated C code. It looks like the C code actually imports some C++ headers / code as well! I didn't know you could do that, TIL! I am guessing this is because we are using cgo? And cgo needs to use C, and fluxion is in C++, so we created the C bindings as an intermediate interface?

reapi_cli.h

The TLDR here is that reapi_cli.h imports some of the C++ bindings, and it's these C++ bindings that bring in different components from resource (higher up) along with jobinfo. This is why we need to include those shared libraries.

Imports:

I'll stop there because it starts to look like everything is using everything else in resource!

reapi_module.h

TLDR: my impression here is that the distinction might be that this is indeed intended to be a module, meaning it doesn't bring in all the libraries from libresource.so. Was it the case that this module was started but couldn't meet all the needs that we wanted, so it was left (and then the fluxcli was started)?

Imports:

Next Steps

To step back, let's have a discussion about what our goals are (e.g., to create more separation in the API via more scoped functions? To simplify the logic / types shared between libresource.so and libreapi_cli.so so they can use the same thing, while exposing a much smaller interface?). Let me know if you want me to dig deeper into any of the above. I'm hoping the links to the top level files help you explore as I did.

Example Use Case

Right now, to build an out-of-tree plugin using fluxion (with the Go bindings), I need to both compile against the (non-system-installed) .so files and then expose their paths via LD_LIBRARY_PATH at runtime. Here are both steps, shown as a diff of what I have to do currently (red) vs. what I'd like to do (green).

Compiling

- -L/opt/flux-sched/resource -lfluxion-resource -L/opt/flux-sched/resource/libjobspec -ljobspec_conv -L/opt/flux-sched/resource/reapi/bindings -lreapi_cli -lflux-idset -lstdc++ -lczmq -ljansson -lhwloc -lboost_system -lflux-hostlist -lboost_graph -lyaml-cpp" go build -ldflags '-w' -o bin/icecream src/cmd/main.go
+ -lfluxion-resource -lreapi_cli -lflux-idset -lstdc++ -lczmq -ljansson -lhwloc -lboost_system -lflux-hostlist -lboost_graph -lyaml-cpp" go build -ldflags '-w' -o bin/icecream src/cmd/main.go

And maybe fluxion-resource and reapi_cli are differently named / combined, I'm not sure. But the distinction is that I don't need to tell the linker about paths in the source code of a built flux-sched. They are in default paths somewhere on my system (likely in a container).
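
To make the difference between the two lines above concrete, here is a minimal Go sketch that assembles both flag strings. The helper `linkerFlags` is hypothetical (it is not part of fluxion or the Go toolchain), and the two libraries and two directories are just a subset taken from the diff above for illustration. It only builds strings; it does not invoke the linker.

```go
package main

import (
	"fmt"
	"strings"
)

// linkerFlags is a hypothetical helper (not part of fluxion): it joins
// -L search directories and -l library names into one flag string of the
// kind shown in the diff above.
func linkerFlags(searchDirs, libs []string) string {
	parts := make([]string, 0, len(searchDirs)+len(libs))
	for _, dir := range searchDirs {
		parts = append(parts, "-L"+dir)
	}
	for _, lib := range libs {
		parts = append(parts, "-l"+lib)
	}
	return strings.Join(parts, " ")
}

func main() {
	// A subset of the libraries from the diff, for illustration only.
	libs := []string{"fluxion-resource", "reapi_cli"}

	// Current: the libraries live in the flux-sched build tree, so the
	// linker has to be told about each source directory.
	buildTreeDirs := []string{
		"/opt/flux-sched/resource",
		"/opt/flux-sched/resource/reapi/bindings",
	}
	fmt.Println("current:", linkerFlags(buildTreeDirs, libs))

	// Desired: the libraries are in a default linker search path, so no
	// -L flags are needed at all.
	fmt.Println("desired:", linkerFlags(nil, libs))
}
```

The point of the sketch is that the list of `-l` names does not have to change; only the `-L` entries pointing into the source tree go away once the libraries are installed to a default location.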

Runtime

# These are the flags
# This is what I need to export before I run my binary
+ This should be entirely unnecessary if flux (with the reapi_cli) is installed to a system location like `/usr/lib`
- export LD_LIBRARY_PATH=/usr/lib:/opt/flux-sched/resource:/opt/flux-sched/resource/reapi/bindings:/opt/flux-sched/resource/libjobspec

# Running the binary!
./bin/icecream -spec icecream.yaml
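
When the binary fails to start because a shared library cannot be found, a rough check like the following Go sketch can show which directory (if any) provides each library. Everything here is an assumption for illustration: `findLibrary` is a hypothetical helper, the file name `libfluxion-resource.so` is inferred from the `-lfluxion-resource` flag, and the two fallback directories are only common defaults. The real dynamic loader also consults its cache and any embedded run paths, so this is a diagnostic aid, not a reimplementation of its search.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findLibrary is a hypothetical helper: it returns the first directory in
// dirs that contains a regular file named fileName.
func findLibrary(fileName string, dirs []string) (string, bool) {
	for _, dir := range dirs {
		if dir == "" {
			// Skip empty entries rather than silently checking the
			// current working directory.
			continue
		}
		info, err := os.Stat(filepath.Join(dir, fileName))
		if err == nil && !info.IsDir() {
			return dir, true
		}
	}
	return "", false
}

func main() {
	// Directories from LD_LIBRARY_PATH first, then two common system
	// locations (an assumption; actual defaults vary by distribution).
	dirs := filepath.SplitList(os.Getenv("LD_LIBRARY_PATH"))
	dirs = append(dirs, "/usr/lib", "/usr/local/lib")

	// File names assumed from the -l flags used when compiling.
	for _, name := range []string{"libreapi_cli.so", "libfluxion-resource.so"} {
		if dir, ok := findLibrary(name, dirs); ok {
			fmt.Printf("%s found in %s\n", name, dir)
		} else {
			fmt.Printf("%s not found in %v\n", name, dirs)
		}
	}
}
```

If both libraries are reported under a system directory with LD_LIBRARY_PATH unset, the export shown above should no longer be needed.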

Questions

And some questions that I have:

Almost forgot! Ping @trws and @milroy from discussion today.