Xilinx / video-sdk

https://xilinx.github.io/video-sdk

Nix build environment, versions, compatibility, etc #79

Closed robashton closed 1 year ago

robashton commented 1 year ago

Hey,

Quite a broad sweeping set of questions in this "issue", so I'll start off with some context.

We have a product (media/video encoding/streaming/etc platform) that is built and released using a nix build environment. This offers us the pleasure of having a repeatable build process on pretty much any platform or architecture.

For some integrations that has meant building SDKs from source and creating nix derivations/flakes for them. For others, because it's just headers and dynamic libraries, we can use the headers and then load the dynamic libraries off the host (assuming they're statically linked themselves).

What this looks like in practicality for Xilinx

We're running a supported operating system (Ubuntu 22.04) on a supported architecture (x86). For the purposes of checking "is everything correct", we have installed the packages as described here. Sourcing into the environment and running FFmpeg succeeds, and building the examples against the installed libraries also succeeds.

Entering our nix build environment, we immediately have a couple of snags

We can solve this in one of two ways: install the libraries on the host and the runtime will be fine, or install the libraries in the nix build environment and it'll all build properly, but the runtime will be borked. Some of that is down to versioning (the nix build env is way ahead of the default Ubuntu toolchain), and some of it is down to the linker in the nix build env, by design, ignoring anything on the host.

Looking at this from another angle, we have a few avenues to explore

Building Xilinx

Two of those options involve building the video SDK ourselves. No big deal, it's just CMake, right?

Building it in the nix build environment

I started this off by taking the latest tags from each relevant repo (XRT/XRM/XVBM) and creating nix flakes for them. I can get them to build and I can link and build our application against them.

The issue I run into here is which xclbin I should be providing on start-up; presumably the one on my host is ancient ("Could not get info for xclbin file"). That seems to be the only stumbling block for this route, save that I'm obviously using an unsupported version of the SDK at this point. (On a side note, have you got any documentation about what that xclbin file actually is? It might help me understand better.)

Assuming that the problem here is that I'm on latest tags (2023.1, etc), I can drop back to the commits marked by V3 of this repo, but I can't easily get that to build in my nix build environment. Our toolchain there is GCC12 and there seem to be some issues that were resolved in later commits (not to mention a top-level CMakeLists.txt appearing outside of src, which makes things a lot easier when using the default cmake build process in a nix flake). I can bring in stdenv10, but that involves rebuilding the entire toolchain, which takes a good few hours because it's not in a binary cache. That's not to mention that older commits in those repos have build processes with hard-coded checks for the OS, which I have to patch out using sed before running them!

Where that leaves us

Boy that's a lot of text, sorry about that. The TLDR is I think there are two (and a half) options for building our platform against Xilinx.

Any tips would be appreciated. I'm more than happy to put the effort into building nix flakes and we're more than happy to open source them so others can use them. I can see others went down this road many moons ago so it's clearly possible? Perhaps all I need is a couple of pointers and I'm good to go? I don't know.

NastoohX commented 1 year ago

Hi, interesting use case. Admittedly, we don't have a Nix-based workflow, and understanding some of your concerns may take a number of iterations. I'll be bringing up your use case with our development team to see if there are any suggestions or guidelines. Meanwhile, to clarify a few things, kindly elaborate on the following:

1) I don't follow the concern regarding having a required dynamic library residing on the host, while fully appreciating this may not be the build system.

2) As with any porting effort, dependency on specific versions of external facilities, e.g., uuid, boost, ..., is an issue; however, it is somewhat unavoidable. Is it not possible to include the required 3rd party dev packages in a Nix environment?

3) There is something of a hard dependency among XRT, xclbin and our FPGA shell, which, among other things, prohibits non-prescribed mixtures of these elements, e.g., transcode.xclbin requires XRT 2.11.722 and a specific shell version. This will probably eliminate your 1st option.

4) I couldn't follow whether it was a requirement for all libs to be static or if this is somehow advantageous.

Looking forward to your reply. Cheers,

robashton commented 1 year ago

Ah yeah - there is a lot of context to take in I'm afraid, especially if you are unfamiliar with Nix! I am more than happy to elaborate repeatedly until we're all on the same page.

1) There is no specific problem with having a required dynamic library on the host; it's just that the nix environment tends to be built against a completely different standard runtime (stdc++, etc). Trying to load a dynamic library into an application running inside a built nix environment tends to mean that it can't find its own dependencies (you can play with LD_LIBRARY_PATH a little bit but eventually you'll end up with conflicts).

For example, if I ask ldd about the dependencies for libxrt on the host, I'll get

rob@test1:~/src/norsk$ ldd /opt/xilinx/xrt/lib/libxrt_core.so
        linux-vdso.so.1 (0x00007fff9f1eb000)
        libxrt_coreutil.so.2 => /opt/xilinx/xrt/lib/libxrt_coreutil.so.2 (0x00007f85c7fde000)
        libboost_filesystem.so.1.74.0 => /lib/x86_64-linux-gnu/libboost_filesystem.so.1.74.0 (0x00007f85c7fb2000)
        libuuid.so.1 => /lib/x86_64-linux-gnu/libuuid.so.1 (0x00007f85c7fa9000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f85c7d7f000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f85c7d5d000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f85c7b35000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f85c8200000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f85c7a4e000)

Once I'm in my nix env, that'll be

rob@test1:~/src/norsk$ ldd /opt/xilinx/xrt/lib/libxrt_core.so
        linux-vdso.so.1 (0x00007ffdef3e0000)
        libxrt_coreutil.so.2 => /opt/xilinx/xrt/lib/libxrt_coreutil.so.2 (0x00007f3ccadf6000)
        libboost_filesystem.so.1.74.0 => not found
        libuuid.so.1 => not found
        libstdc++.so.6 => not found
        libgcc_s.so.1 => /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/libgcc_s.so.1 (0x00007f3ccadda000)
        libc.so.6 => /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib/libc.so.6 (0x00007f3ccabd1000)
        /nix/store/8xk4yl1r3n6kbyn05qhan7nbag7npymx-glibc-2.35-224/lib64/ld-linux-x86-64.so.2 (0x00007f3ccb018000)
        libboost_filesystem.so.1.74.0 => not found
        libuuid.so.1 => not found
        libstdc++.so.6 => not found

2) This sorta follows on from 1). I can absolutely include uuid/boost/etc in my nix environment, but if I'm linking against something that has already been built then the chances are that my versions of uuid (well, more specifically, boost) are different from the ones used to build your libraries. This is indeed what happened when I first tried a hybrid setup, I just got runtime errors as it tried to call boost functions that weren't there (or in the expected location).

3) Yeah okay, bit of a shame that. I could absolutely build the specific version of XRT that is compatible with the available xclbin, which is what the existing nix derivation I found was doing, but obviously I ran into issues building that older version of XRT because of the hard coded OS checks and runtime differences. Perhaps this path is worth exploring further despite the initial pain.

4) Static or dynamic doesn't really make much difference in the grand scheme of things, because you'll still end up with runtime issues with stdlib/libc/etc (unless you statically link those, which is an idea that is documented as 'wrong' so much that I've never actually done it).
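
The gap between the two ldd listings in 1), and the version pinning behind 2), can be checked mechanically. What follows is only a minimal sketch, assuming the plain-text ldd format shown above; `unresolved_libs` and `parse_soname` are hypothetical helper names and not part of any Xilinx or Nix tooling.

```python
import re


def unresolved_libs(ldd_output: str) -> list[str]:
    """Return the sonames ldd reported as 'not found', in order, without duplicates."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        # Resolved lines look like "name => /path (addr)"; unresolved ones end in "not found".
        if sep and target.strip() == "not found" and name not in missing:
            missing.append(name)
    return missing


def parse_soname(soname: str) -> tuple[str, tuple[int, ...]]:
    """Split 'libfoo.so.1.74.0' into ('libfoo', (1, 74, 0))."""
    match = re.fullmatch(r"(.+?)\.so((?:\.\d+)*)", soname)
    if match is None:
        raise ValueError(f"not a shared-object name: {soname!r}")
    version = tuple(int(part) for part in match.group(2).split(".") if part)
    return match.group(1), version


# A few lines copied from the ldd listing taken inside the nix env.
sample = """\
        libxrt_coreutil.so.2 => /opt/xilinx/xrt/lib/libxrt_coreutil.so.2 (0x00007f3ccadf6000)
        libboost_filesystem.so.1.74.0 => not found
        libuuid.so.1 => not found
        libstdc++.so.6 => not found
        libboost_filesystem.so.1.74.0 => not found
"""

for soname in unresolved_libs(sample):
    print(parse_soname(soname))
```

Each unresolved name carries the exact version the prebuilt library was linked against, which is the version the nix environment would have to supply for the loader to accept it.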

For the moment, to make progress, I've set up a minimal build environment for our product on the host, so I can at least carry on with my integration work. This won't help me get anything that would be remotely releasable but will at least mean I'm not blocked.

robashton commented 1 year ago

Another option is for me to make a derivation that takes all the dynamic libraries you have and patches them to explicitly load their dependencies within nix.

https://nixos.wiki/wiki/Packaging/Binaries

Some of my issues there will be similar with regards to versioning and such, but it's perhaps worth exploring regardless. It may be easier than building the libraries inside Nix (from older versions of the source code; it is a lot more portable in recent commits).
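
To make the patching idea concrete, here is a rough sketch of the kind of command such a derivation would end up running for each library. It only builds the argument list and does not execute anything. `patchelf --set-rpath` is a real patchelf option; the helper name `patchelf_command` and the store paths with `<hash>` placeholders are assumptions for illustration, since the real paths would come from the derivation's inputs.

```python
from pathlib import PurePosixPath


def patchelf_command(library: PurePosixPath, dependency_dirs: list[PurePosixPath]) -> list[str]:
    """Build (but do not run) a patchelf invocation that rewrites a library's RPATH."""
    # The RPATH is a colon-separated list of directories searched for dependencies.
    rpath = ":".join(str(d) for d in dependency_dirs)
    return ["patchelf", "--set-rpath", rpath, str(library)]


# Hypothetical store paths, for illustration only.
deps = [
    PurePosixPath("/nix/store/<hash>-boost/lib"),
    PurePosixPath("/nix/store/<hash>-libuuid/lib"),
]
cmd = patchelf_command(PurePosixPath("lib/libxrt_core.so"), deps)
print(" ".join(cmd))
```

In a real derivation the same effect is more commonly obtained with nixpkgs' autoPatchelfHook, as the wiki page linked above describes, rather than by calling patchelf by hand.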

NastoohX commented 1 year ago

Hi, I have forwarded your build issues and concerns to our dev team, and as soon as I have an actionable response, I'll update this ticket. I think your suggestion on packaging dynamic libraries is at the very least a good starting point, and admittedly it is the area where we can provide some help. Cheers,

NastoohX commented 1 year ago

Hi, given our other avenue of interaction, I am closing this ticket. Kindly feel free to reach out if needed. Cheers,