flux-framework / flux-sched

Fluxion Graph-based Scheduler
GNU Lesser General Public License v3.0
84 stars 39 forks source link

Build reapi_cli as a shared library #1120

Closed milroy closed 3 months ago

milroy commented 5 months ago

PR #1094 Updated Fluxion to be build as a static library. Some projects like Fluence are clients of Fluxion and its REAPI and need to link Fluxion libraries as shared libraries to preserve license boundaries. This PR adds the capability to build the necessary shared libraries.

vsoch commented 5 months ago

@milroy this is fantastic! :raised_hands: For a quick example of what this enables, I was (very easily) able to build a go module out of tree:

https://github.com/converged-computing/flex-ice-cream

And specifically, the compile command from the root:

GOOS=linux CGO_CFLAGS="-I/opt/flux-sched/resource/reapi/bindings/c" \
CGO_LDFLAGS="-L/usr/lib -L/opt/flux-sched/resource -lfluxion-resource \
    -L/opt/flux-sched/resource/libjobspec -ljobspec_conv -L//opt/flux-sched/resource/reapi/bindings \
    -lreapi_cli -lflux-idset -lstdc++ -lczmq -ljansson -lhwloc -lboost_system -lflux-hostlist \
    -lboost_graph -lyaml-cpp" go build -ldflags '-w' -o bin/server src/fluence/cmd/main.go

For go libraries, it is much nicer (and expected by developers) to have go.mod and go.sum in the root of the repository. I tested the above and found I could use the (new?) libfluxion-resource.so and I still needed to link to JobSpec and reapi cli binding libraries for the compile to work.

This dummy example builds fluence, and I'll quickly be changing that to something that is not fluence (and likely deleting the git history in case there is issue having it in two spots). I used fluence because it was the quickest way to prototype/test the bindings and give feedback here, and could be compared against the "fudged version" from https://github.com/flux-framework/flux-k8s/pull/47. When work is merged here, we will want to use these bindings there (and good to sanity check it will build).

For now I'm using a fork of the work here, and only because I need to change the module name to match the repository in go.mod (go does not allow you to use, for example, a go.mod with flux-framework/flux-sched that is cloned from milroy/flux-sched)

@trws and @grondo I hope you are excited about this, because it means we (and other developers) can easily build cool stuff with fluxion! I am going to make an attempt soon to prototype what I wanted to do with ice cream, and from that we will likely identify some of the missing pieces (go or c bindings here) that are needed. We've already anticipated some of these needs (and I opened issues earlier this week). Thank you in advance for the reviews!

vsoch commented 5 months ago

Update: library is now officially ice-cream-ized: image

https://github.com/converged-computing/flex-ice-cream

It's not perfect, but it's a start! I made sure to nuke the fluence history so there wouldn't be worry about having it there. Going back to sleep, thanks again for the work here and TBA review!

trws commented 5 months ago

Glad it works, it feels a little odd to need to make the cli a shared library? Is there something in there we should move to something meant to be an external interface?

vsoch commented 5 months ago

Here is what we needed in fluence: https://github.com/flux-framework/flux-k8s/blob/105562e4662e86a1b8ce1e19762678a6c0dd6309/src/fluence/fluxion/fluxion.go

Primarily to make a reapi client and then interact with it!

f.cli = fluxcli.NewReapiClient()
milroy commented 3 months ago

Dropping WIP.

I drastically simplified this PR and reduced the API surface by linking the static Fluxion resource in reapi_cli and building reapi_cli as a a shared library. @vsoch let me know if installing libreapi_cli.so works.

vsoch commented 3 months ago

The build works - this is a development container that installs flux-sched from your branch (with the go path tweaked) and then just runs make:

image

The compile command:

GOOS=linux CGO_CFLAGS="-I/opt/flux-sched/resource/reapi/bindings/c" CGO_LDFLAGS="-L/usr/lib -L/opt/flux-sched/resource/libjobspec -ljobspec_conv -lflux-idset -lstdc++ -lczmq -ljansson -lhwloc -lboost_system -lflux-hostlist -lboost_graph -lyaml-cpp -L//opt/flux-sched/resource/reapi/bindings -lreapi_cli" go build -ldflags '-w' -o bin/icecream src/cmd/main.go

And can confirm the .so is installed.

find /usr -name libreapi_cli.so
/usr/lib/flux/libreapi_cli.so

The flux .so is installed to a non-traditional place, so not an issue to add that to LD_LIBRARY_PATH:

./bin/icecream 
./bin/icecream: error while loading shared libraries: libreapi_cli.so: cannot open shared object file: No such file or directory
export LD_LIBRARY_PATH=/usr/lib/flux

But I think we are still missing one library:

./bin/icecream 
./bin/icecream: error while loading shared libraries: libjobspec_conv.so: cannot open shared object file: No such file or directory

Which seems to still be in the install root:

find /opt/flux-sched/ -name libjobspec_conv.so
/opt/flux-sched/resource/libjobspec/libjobspec_conv.so
find /usr -name libjobspec_conv.so
# no output
milroy commented 3 months ago

Which seems to still be in the install root:

Ok, I updated the PR to install libjobspec_conv.so to FLUX_LIB_DIR. I think we need to install the Flux and Fluxion libraries to FLUX_LIB_DIR, but @trws might have an opinion on whether we could provide an override flag.

milroy commented 3 months ago

@vsoch I did some additional simplification and cleanup that further reduces the complexity of linking Fluxion. Can you test again and make sure this works before I squash the fixup commits?

You shouldn't need to link jobspec_conv anymore.

trws commented 3 months ago

Pending the style question and the commit validation getting fixed up I'm good here, and will approve.

grondo commented 3 months ago

I think we need to install the Flux and Fluxion libraries to FLUX_LIB_DIR, but @trws might have an opinion on whether we could provide an override flag.

Just an opinion, but yes, installing Flux libraries with generic names like libjobspec_conv.so and libreapi_cli.so is probably better done under a subdirectory. If you do want to install these in a standard path, you might consider adding a prefix like we do for libflux-* libraries (libflux-core.so, libflux-idset.so, etc.), e.g. libfluxion-jobspec_conv.so and libfluxion-reapi_cli.so.

vsoch commented 3 months ago

Just an opinion, but yes, installing Flux libraries with generic names like libjobspec_conv.so and libreapi_cli.so is probably better done under a subdirectory. If you do want to install these in a standard path, you might consider adding a prefix like we do for libflux-* libraries (libflux-core.so, libflux-idset.so, etc.), e.g. libfluxion-jobspec_conv.so and libfluxion-reapi_cli.so.

Oh just to be clear, I'm okay with it being under flux, and think that's the better design.

trws commented 3 months ago

I think we need to install the Flux and Fluxion libraries to FLUX_LIB_DIR, but @trws might have an opinion on whether we could provide an override flag.

Just an opinion, but yes, installing Flux libraries with generic names like libjobspec_conv.so and libreapi_cli.so is probably better done under a subdirectory. If you do want to install these in a standard path, you might consider adding a prefix like we do for libflux-* libraries (libflux-core.so, libflux-idset.so, etc.), e.g. libfluxion-jobspec_conv.so and libfluxion-reapi_cli.so.

Agreed.

milroy commented 3 months ago

Just an opinion, but yes, installing Flux libraries with generic names like libjobspec_conv.so and libreapi_cli.so is probably better done under a subdirectory. If you do want to install these in a standard path, you might consider adding a prefix like we do for libflux-* libraries (libflux-core.so, libflux-idset.so, etc.), e.g. libfluxion-jobspec_conv.so and libfluxion-reapi_cli.so.

Oh just to be clear, I'm okay with it being under flux, and think that's the better design.

@vsoch do you mean that you're okay with installation of libreapi_cli.so being under FLUX_LIB_DIR?

@grondo regardless of the installation path do you recommend that I add a prefix (libfluxion-reapi_cli.so)?

vsoch commented 3 months ago

@vsoch do you mean that you're okay with installation of libreapi_cli.so being under FLUX_LIB_DIR?

Yes.

milroy commented 3 months ago

Thanks for the feedback and helpful cmake tips, @trws @grondo and @vsoch!

This PR is yet simpler and I thikn addresses all your comments. I've re-requested a final review.

vsoch commented 3 months ago

image

milroy commented 3 months ago

Thanks for the reviews!

Setting MWP.

codecov[bot] commented 3 months ago

Codecov Report

Merging #1120 (bf373e8) into master (af5f62b) will not change coverage. The diff coverage is n/a.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #1120 +/- ## ====================================== Coverage 70.6% 70.6% ====================================== Files 94 94 Lines 12723 12723 ====================================== Hits 8984 8984 Misses 3739 3739 ```