naklecha / llama3-from-scratch

llama3 implementation one matrix multiplication at a time
MIT License
13.22k stars 1.06k forks

Free Colab #12

Open rarhs opened 4 months ago

rarhs commented 4 months ago

Can't run on free Colab due to inadequate RAM.

AbnerAI commented 4 months ago

How much CPU/GPU resources are required?

wdndev commented 4 months ago

Hello, I have an idea.

Because the model is loaded on the CPU, my notebook (16 GB RAM) cannot load the Llama3-8B model.

So, I took the first two Transformer layers from the 32-layer architecture of the Llama3-8B model to form a new model. It can run in a notebook with 16 GB RAM, occupying about 4~5 GB. The intermediate results are all correct, but the final decoding result is wrong.
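The layer-trimming idea above can be sketched as a simple state-dict filter. This is a hypothetical illustration, not wdndev's actual script: it assumes the Meta checkpoint key layout (`layers.<i>.` prefixes for per-layer weights, as in `consolidated.00.pth`), and uses a dummy dict in place of the real weights. You would also need to set `n_layers` accordingly in `params.json`.

```python
import re

def keep_first_n_layers(state_dict, n=2):
    """Return a copy of state_dict containing only the non-layer weights
    (embeddings, final norm, output head) plus transformer layers 0..n-1."""
    kept = {}
    for key, value in state_dict.items():
        m = re.match(r"layers\.(\d+)\.", key)
        # Keep keys that are not per-layer, or whose layer index is < n.
        if m is None or int(m.group(1)) < n:
            kept[key] = value
    return kept

# Dummy checkpoint mimicking the 32-layer Llama3-8B key layout
# (values would be tensors in the real consolidated.00.pth).
dummy = {"tok_embeddings.weight": 0, "norm.weight": 1, "output.weight": 2}
for i in range(32):
    dummy[f"layers.{i}.attention.wq.weight"] = i

small = keep_first_n_layers(dummy, n=2)
print(sorted(small.keys()))
```

With real weights you would load the checkpoint with `torch.load`, filter it like this, and save the much smaller result with `torch.save`.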

You can try it with this model.

And here is the Colab link; it can be run directly: