Training GPT first time, Encountering Error Opening Model file and error reading model header

albertpurnama commented 2 months ago

Hi,

Thanks for doing all the initial work to port functionality from llm.c to Go. I'm not familiar with model training in general, but I have built a couple of web servers in Go before and I'd love to contribute to the llm.go project.

I'm encountering an issue where make train fails with the following error:

$ make train     
go run ./cmd/traingpt2
2024/05/04 11:33:40 Error opening model file: open ./gpt2_124M.bin: no such file or directory
exit status 1
make: *** [train] Error 1

This happens because gpt2_124M.bin does not exist. When I create an empty placeholder with touch ./gpt2_124M.bin, I get a different error about the file header instead:

go run ./cmd/traingpt2
2024/05/04 11:36:45 error reading model header: EOF
exit status 1
make: *** [train] Error 1

How can I resolve this?

Thanks in advance!

albertpurnama commented 2 months ago

After reading a couple more issues in the original llm.c repo, it seems that generating the initial model weights is what we need.

https://github.com/karpathy/llm.c/pull/288/files

Basically, we need to run the script that generates the initial checkpoint file.

albertpurnama commented 2 months ago

I figured it out.

What you need to do is run python train_gpt2.py first. This creates the following files:

gpt2_124M.bin
gpt2_124M_debug_state.bin
gpt2_tokenizer.bin

I found that the tokenizer file is created by https://github.com/albertpurnama/llm.go/blob/ffb034ebbd7792f1f1a9ba6766e5a940bc9084e8/train_gpt2.py#L346-L352

joshcarp commented 2 months ago

Hey, I didn't see this until now, but I think this presents a nice opportunity to remove Python from the repo completely. It would be nice to download directly from Hugging Face and not need any of the llm.c binary files.
