YrrepNoj closed 7 months ago
Not sure if you wanted us to review this yet (a review was requested), but it looks like the E2E run hit disk space errors on the runner. It might be a good idea to run UDS's runner setup/cleanup step, and also to use the build args to select a smaller model for inference (e.g. a Q4-quantized phi-2 fine-tuned on the Dolphin dataset).
Closing this PR as stale. Additionally, this repo will likely be archived soon.
This PR introduces an e2e test that verifies the llama.cpp backend can be deployed onto a UDS cluster with LFAI-API and functions as expected.