elastic / ml-cpp

Machine learning C++ code

[ML] Create a Docker build image using the latest PyTorch stable branch. #2685

Closed edsavage closed 1 month ago

edsavage commented 1 month ago

On Linux x86_64, provide the ability to create Docker build images that use code pulled from the latest PyTorch stable release branch (currently 2.3.1).

Upon successful build and push of such a Docker image, trigger a build of the ml-cpp code in it.

Once the ml-cpp build has succeeded, it should in turn trigger a pipeline in the QAF repo that runs a set of tests exercising the pytorch_inference executable.
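
As a rough illustration of the chain described above (image build, then ml-cpp build, then QA tests), here is a minimal Python sketch that emits Buildkite pipeline JSON using trigger steps. The script names, step keys, and pipeline slugs are hypothetical placeholders rather than the ones used in this PR, and the QA pipeline name in particular is still TBD.

```python
import json


def image_pipeline(ml_cpp_build_slug):
    """Pipeline that builds the Docker image, then triggers the ml-cpp build."""
    return {
        "steps": [
            {
                "label": "Build and push PyTorch Docker image",
                "key": "build_pytorch_docker_image",
                # Hypothetical script name, for illustration only.
                "command": "./build_pytorch_docker_image.sh",
            },
            {
                # A trigger step starts a build of another pipeline.
                "trigger": ml_cpp_build_slug,
                "label": "Build ml-cpp in the new image",
                # Only runs once the image step has succeeded.
                "depends_on": "build_pytorch_docker_image",
            },
        ]
    }


def ml_cpp_pipeline(qa_pipeline_slug):
    """Pipeline that builds ml-cpp, then triggers the QA PyTorch tests."""
    return {
        "steps": [
            {
                "label": "Build ml-cpp on Linux x86_64",
                "key": "build_linux_x86_64",
                # Hypothetical script name, for illustration only.
                "command": "./build_ml_cpp.sh",
            },
            {
                "trigger": qa_pipeline_slug,
                "label": "Run QA PyTorch tests",
                "depends_on": "build_linux_x86_64",
            },
        ]
    }


if __name__ == "__main__":
    # Placeholder slugs; the real QA pipeline name is not yet known.
    print(json.dumps(image_pipeline("example-ml-cpp-build"), indent=2))
    print(json.dumps(ml_cpp_pipeline("example-qa-pytorch-tests"), indent=2))
```

The printed JSON could be piped to `buildkite-agent pipeline upload`, which accepts JSON as well as YAML.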

To make this possible a new pipeline - ml-cpp-pytorch-build - is required. This is defined in the catalog-info.yaml file and won't be created until catalog-info.yaml is merged to main and Backstage magic does its thing.

Some tweaks have been made to our existing Buildkite framework so that existing code can be better re-used, e.g. so that just a Linux x86_64 build step can be dynamically created.
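
To give a flavour of what dynamically creating a single-platform build step can look like, here is a minimal Python sketch. The platform table, step keys, script name, and environment variable names are all hypothetical and do not reflect the actual layout of the ml-cpp Buildkite scripts.

```python
import json

# Hypothetical table of supported build platforms (illustration only).
PLATFORMS = [
    ("linux", "x86_64"),
    ("linux", "aarch64"),
    ("darwin", "aarch64"),
    ("windows", "x86_64"),
]


def build_step(os_name, arch):
    """Create one Buildkite command step for the given platform."""
    return {
        "label": f"Build on {os_name} {arch}",
        "key": f"build_{os_name}_{arch}",
        # Hypothetical script and variable names.
        "command": "./build.sh",
        "env": {"BUILD_OS": os_name, "BUILD_ARCH": arch},
    }


def build_steps(selected=None):
    """Create steps for the selected platforms, or for all if none are given."""
    wanted = PLATFORMS if selected is None else selected
    unknown = [p for p in wanted if p not in PLATFORMS]
    if unknown:
        raise ValueError(f"Unsupported platforms: {unknown}")
    return [build_step(os_name, arch) for os_name, arch in wanted]


if __name__ == "__main__":
    # Emit a pipeline containing only the Linux x86_64 build step.
    pipeline = {"steps": build_steps([("linux", "x86_64")])}
    print(json.dumps(pipeline, indent=2))
```

Keeping step creation in one function means the full multi-platform pipeline and a one-off single-platform pipeline share the same code path.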

TBD: The name of the QA PyTorch testing pipeline is required.

cla-checker-service[bot] commented 1 month ago

💚 CLA has been signed

edsavage commented 1 month ago

To manually test the changes in this PR

edsavage commented 1 month ago

buildkite build this

edsavage commented 1 month ago

The PyTorch releases are branched off viable/strict. So, the way I understand the intent of this PR, we should rebuild the Docker image nightly from this PyTorch branch to identify new issues before we start depending on a new PyTorch release, shouldn't we?

Thanks @valeriy42! Yes, I believe the viable/strict branch is the most appropriate one for us to be building and testing against. Originally we had been targeting main, but that is too volatile. Our friends in QA suggested the latest release branch - 2.3.1 - but as you've pointed out, that is pretty much static. So I think viable/strict is our "Goldilocks" branch - just right 🤞