intel / intel-device-plugins-for-kubernetes

Collection of Intel device plugins for Kubernetes
Apache License 2.0
19 stars 202 forks source link

FPGA: support CDI #1745

Closed bart0sh closed 1 month ago

bart0sh commented 1 month ago

This PR uses CDI support for Device Plugins to implement FPGA programming hooks.

CRI-O hooks are no longer needed and removed.

As CDI is currently supported by CRI-O and Containerd, this PR enables FPGA orchestration programmed operation mode for both most used CRI runtimes: CRI-O and Containerd. Previously it was only supported by CRI-O.

Ref: https://github.com/intel/intel-device-plugins-for-kubernetes/issues/1457

mythi commented 1 month ago

@bart0sh thanks! quick question: I believe the word prestart is deprecated in the OCI runtime spec. would it make sense to get the new functionality to follow the official name createRuntime?

bart0sh commented 1 month ago

@bart0sh thanks! quick question: I believe the word prestart is deprecated in the OCI runtime spec. would it make sense to get the new functionality to follow the official name createRuntime?

Thanks for pointing out! I didn't know that. Will try to replace.

bart0sh commented 1 month ago

@mythi renamed in the code. Will update docs if/when all tests pass.

bart0sh commented 1 month ago

@mythi done: https://github.com/intel/intel-device-plugins-for-kubernetes/pull/1745/commits/e58369ed13389b4971d3ae4e22914c3a98b8ad51

mythi commented 1 month ago

@mythi done: e58369e

looks good!

bart0sh commented 1 month ago

@tkatila Can you review this PR please?

bart0sh commented 1 month ago

@tkatila

As this change removes the old prestart hook approach, should the README note that if one still wants to use that way, he or she must use <=0.30.0 version of the FPGA components?

New approach designed to be a replacement for the old one. It shouldn't require any configuration or other changes comparing to the previous one. I'd consider it as an internal change and wouldn't add anything to the README as it might confuse users. However, if you think it's needed, I'd be happy to do that. Just let me know.

tkatila commented 1 month ago

New approach designed to be a replacement for the old one. It shouldn't require any configuration or other changes comparing to the previous one. I'd consider it as an internal change and wouldn't add anything to the README as it might confuse users. However, if you think it's needed, I'd be happy to do that. Just let me know.

In that case, I don't think it's required.

tkatila commented 1 month ago

@uniemimu and/or @hj-johannes-lee can you review this? Seems to require one more approval.

mythi commented 1 month ago

It shouldn't require any configuration or other changes comparing to the previous one.

@bart0sh has it also been unconditionally enabled in kubelet since it was added?

bart0sh commented 1 month ago

@mythi yes, it's graduated to Beta in Kubernetes in 1.29 and since then it's enabled by default.

mythi commented 1 month ago

@mythi yes, it's graduated to Beta in Kubernetes in 1.29 and since then it's enabled by default.

In section "Configuring CRI runtimes" we add some comments how the feature is available. We don't say much about kubelet itself. In theory there's a gap but I don't think it's important at all since by the time we release this, the oldest k8s we "support" is 1.29 which has it enabled.

bart0sh commented 1 month ago

@mythi If we'll release this earlier, I'll add a notice about Kubelet version to the documentation as a separate PR.

@mythi @tkatila Is anything else still needed to merge this PR? I'm asking that this is a show-stopper for graduating the feature to GA in Kubernetes.

tkatila commented 1 month ago

Nothing from my side. Good to go.

bart0sh commented 1 month ago

@kad

in documentation part, write down explicitly requirements for CDI mode: k8s,containerd/cri-o.

CRI-O and Containerd are mentioned here. Would it make sense to add version info for CRI-O, Containerd and Kubernetes there? They only make sense for the hook, so it looks like a good place for me. WDYT?

kad commented 1 month ago

@kad

in documentation part, write down explicitly requirements for CDI mode: k8s,containerd/cri-o.

CRI-O and Containerd are mentioned here. Would it make sense to add version info for CRI-O, Containerd and Kubernetes there? They only make sense for the hook, so it looks like a good place for me. WDYT?

there or in overall FPGA plugin README, stating that programming mode is available for systems with CDI enabled, which means k8s+containerd/cri-o of not less than....