Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

github: add ci #454

Closed mofosyne closed 1 month ago

mofosyne commented 1 month ago

Add GitHub Actions CI back in. The main aim, however, is to reduce the complexity of the test to just a sanity check, since llama.cpp can reasonably be assumed to be battle-tested (considering its extensive CI).

Also, instead of downloading a very large gguf model to test the engine, let's just use this tiny 10MB model that I converted to gguf from https://huggingface.co/Maykeye/TinyLLama-v0 and published at https://huggingface.co/mofosyne/TinyLLama-v0-5M-F16-llamafile. I've placed this test gguf file in the models folder so that GitHub Actions can access it faster.
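As a rough illustration of how cheap a pre-flight check on the bundled test model can be, here is a minimal Python sketch. It is not part of this PR. It relies only on the fact that a GGUF file begins with the 4-byte magic `GGUF` followed by a little-endian 32-bit version number. The function name `read_gguf_version` is hypothetical.

```python
import struct

# Every GGUF file starts with these four ASCII bytes.
GGUF_MAGIC = b"GGUF"


def read_gguf_version(path):
    """Return the GGUF version number if `path` starts with a GGUF header.

    Returns None when the file is too short or the magic bytes do not match.
    Only the first 8 bytes are read, so this is fast even for large models.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The version is stored as a little-endian unsigned 32-bit integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A CI step could call this on the test model in the models folder (the exact filename is not given here, so it would have to be filled in) and fail early if the result is `None`, before spending time on the build.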

I've tested this flow via the act runner, but found that 'make -j8' did not work correctly in my PC test environment. I will try enabling -j8 later.

Also adding a CI badge to provide immediate feedback on whether the main branch compiles.

mofosyne commented 1 month ago

Credit to @ahgamut for pointers that helped me figure out what's going on in the GitHub Actions environment.

Still unsure why the ape loader is needed when running in a Linux context, as Cosmopolitan should in theory work on Linux without needing a loader. It works on my PC and in my local docker, but anything on the GitHub Actions cloud or docker seems to need this extra step. I can see why you don't want to touch CI in this context.

Hopefully I've figured out a robust approach now. It's relatively quick thanks to the small gguf test model and the minimal steps needed to set up this sanity-checking CI. Again, I don't intend this CI to be extensive; it is simply there to catch obvious compilation errors.

mofosyne commented 1 month ago

Attempted to add caching for 'Setup cosmocc and ape loader'. Turns out to be a bit of a waste of time.

It saved 5s of downloading and inflating; it looks like the GitHub Actions internet connection is rather decent. Anyway, it's there.

To test the cache, I threw in a useful dummy commit: a refactor tracking ticket template. I think I've done all I can now to make it as robust as possible.

It's ready now @jart


edit: Actually, one positive thing about caching this is that it's good etiquette toward your file server... it shouldn't have to serve the same download on every run.