Open albertpurnama opened 2 months ago
After reading a few more issues in the original llm.c repo, it seems that generating the initial model weights is what we need.
https://github.com/karpathy/llm.c/pull/288/files
Basically, we need to run the script to generate the initial checkpoint file.
I figured it out.
You need to run python train_gpt2.py first. This creates the following files:
gpt2_124M.bin
gpt2_124M_debug_state.bin
gpt2_tokenizer.bin
I found out that the tokenizer file is created by https://github.com/albertpurnama/llm.go/blob/ffb034ebbd7792f1f1a9ba6766e5a940bc9084e8/train_gpt2.py#L346-L352
Hey, I didn't see this until now, but I think this presents a nice opportunity to remove Python from the repo entirely. It would be nice to download directly from Hugging Face and not need any of the llm.c binary files.
Hi,
Thanks for doing all the initial work to port functionality from llm.c to Go. I'm not familiar with model training in general, but I have built a couple of web servers in Go before and I'd love to contribute to the llm.go project.
I'm encountering an issue where
make train
causes an error. It's mainly because gpt2_124M.bin is not available. Even when I try to create the model binary with
touch ./gpt2_124M.bin
I encounter more errors about the file header.
How can I resolve this?
Thanks in advance!