Closed — atambay37 closed this issue 3 months ago
To speed up inference, we've decided to use llama.cpp with the llama-cpp-python Python binding. However, this requires OLMo to be in GGUF file format and quantized to 4 bits. Some of the base models are already available in GGUF format at https://huggingface.co/collections/nopperl/olmo-gguf-66211a0071b6c3d66303fcf1.
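For context, a rough back-of-the-envelope estimate of why 4-bit quantization helps, assuming 7B parameters and ignoring quantization block overhead and non-weight buffers (the numbers are illustrative, not measured):

```python
def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate in-memory weight size in GB (decimal), ignoring
    quantization block overhead and non-weight buffers."""
    return n_params * bits_per_weight / 8 / 1e9

fp16 = approx_model_size_gb(7e9, 16)  # 14.0 GB
q4 = approx_model_size_gb(7e9, 4)     # 3.5 GB
print(f"fp16: {fp16:.1f} GB, 4-bit: {q4:.1f} GB, ratio: {fp16 / q4:.0f}x")
```

The ~4x smaller weights are what make the model fit in modest RAM and stream through memory faster, which is where most of the inference speedup comes from.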
@anantmittal is exploring whether we can convert and quantize OLMo-7B-Instruct, since that model has been tuned better for our use case.
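A sketch of the conversion and quantization steps using a local llama.cpp checkout. Script and binary names vary across llama.cpp versions (newer builds ship `llama-quantize` instead of `quantize`), and the checkpoint path and output filenames here are placeholders:

```shell
# Clone llama.cpp and install the conversion script's Python dependencies
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
make

# Convert the HF checkpoint to GGUF (fp16), then quantize to 4-bit (Q4_K_M)
python convert-hf-to-gguf.py /path/to/OLMo-7B-Instruct \
    --outfile olmo-7b-instruct-f16.gguf
./quantize olmo-7b-instruct-f16.gguf olmo-7b-instruct-q4_k_m.gguf Q4_K_M
```

Q4_K_M is a common default 4-bit variant; other quantization types trade size against quality, so it's worth comparing a couple on our eval questions.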
The Google Colab notebook listed under https://github.com/uw-ssec/tutorials/issues/20 has one writeup of the code for this step. The final version will look a bit different, as we will use a different / faster model.
Added useful links here: https://github.com/uw-ssec/tutorials/discussions/52
The OLMo team hasn't released checkpoints for the older instruct models, but OLMo 1.7 Instruct is being trained and its checkpoints will be released: https://github.com/allenai/OLMo/issues/553
Ask it several questions to demonstrate its capabilities (e.g., ability to answer fact-based questions, when and where OLMo hallucinates, etc.)
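A minimal sketch of loading the quantized GGUF with llama-cpp-python and asking it a question. The model path is a placeholder, and the chat template is an assumption (OLMo-Instruct is reported to use a `<|user|>`/`<|assistant|>` template, but verify against the model card):

```python
def format_prompt(question: str) -> str:
    """Wrap a question in the assumed OLMo-Instruct chat template."""
    return f"<|user|>\n{question}\n<|assistant|>\n"


def ask(question: str, model_path: str = "olmo-7b-instruct-q4_k_m.gguf") -> str:
    """Answer one question with the quantized model (requires
    llama-cpp-python and a local GGUF file; the path is a placeholder)."""
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    out = llm(
        format_prompt(question),
        max_tokens=128,
        stop=["<|user|>"],  # stop before the model starts a new turn
    )
    return out["choices"][0]["text"].strip()
```

For the hallucination probes, it would be useful to log both fact-based questions the model should get right and out-of-distribution questions where it is likely to confabulate, so we can compare quantized vs. unquantized behavior.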