tairov / llama2.mojo

Inference Llama 2 in one file of pure 🔥
https://www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
MIT License

remove overwrite of vocab_scores #25

Closed. mikowals closed this 12 months ago

mikowals commented 12 months ago

`tok.vocab_scores = buf.data.offset(buf.offset).bitcast[DType.float32]()` appears to be overwriting the scores with junk data. By that point `buf.offset` should be at the end of the buffer, since all of the data has already been read and each read advances the offset.
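To make the failure mode concrete, here is a minimal Python stand-in (not the actual Mojo code): every read advances the offset, so once the last field has been consumed the offset equals the buffer length, and a view taken there points past all valid data.

```python
import struct

# Toy buffer: an int32 followed by a float32, read the way the tokenizer
# loader reads its fields -- every read advances `offset`.
buf = struct.pack("<if", 7, 0.5)
offset = 0
(max_len,) = struct.unpack_from("<i", buf, offset); offset += 4
(score,)   = struct.unpack_from("<f", buf, offset); offset += 4

assert offset == len(buf)  # buffer fully consumed
junk = buf[offset:]        # b'' -- the analogue of bitcasting at
                           # buf.data.offset(buf.offset): nothing valid left
```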

The `tokenizer.bin` file format can be found here.
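For anyone following along, here is a hedged Python sketch of that layout, assuming the format from Karpathy's llama2.c (an int32 `max_token_length`, then one float32 score, one int32 length, and that many token bytes per entry; `vocab_size` comes from the model config, not from this file):

```python
import struct

def load_tokenizer(path: str, vocab_size: int):
    """Read tokenizer.bin as llama2.c lays it out."""
    vocab, scores = [], []
    with open(path, "rb") as f:
        (max_token_length,) = struct.unpack("<i", f.read(4))
        for _ in range(vocab_size):
            (score,) = struct.unpack("<f", f.read(4))    # vocab_scores[i]
            (length,) = struct.unpack("<i", f.read(4))   # token byte length
            vocab.append(f.read(length))                 # vocab[i]
            scores.append(score)
    return vocab, scores, max_token_length
```

Once that loop finishes, the buffer offset sits at end-of-file, which is exactly why re-pointing `vocab_scores` there afterwards clobbers the scores.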

tairov commented 12 months ago

Hi @mikowals, yes, the scores are pointing to the wrong place in memory. But I can't understand how the tokens were still working properly then :)

Also, the next line (the one that advances the offset) is now redundant. Could you please remove it as well?

tairov commented 12 months ago

@mikowals, thank you!

mikowals commented 12 months ago

I removed the offset-incrementing line as requested.

I don't think the scores are working as well as they could; most of them just go unused. Karpathy has some comments in his repo about getting good results using only 4096 tokens.
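For context on why junk scores degrade quality without breaking tokenization: in the greedy BPE merge (a rough Python sketch modeled on llama2.c's `bpe_encode`; the names here are illustrative), the scores only decide which candidate merge wins, so arbitrary float values still produce a legal, just worse, encoding.

```python
def bpe_encode(tokens, vocab, scores):
    """Greedy BPE: repeatedly merge the adjacent pair whose merged
    token has the highest score, until no pair can be merged."""
    lookup = {tok: i for i, tok in enumerate(vocab)}
    while True:
        best_score, best_id, best_pos = float("-inf"), -1, -1
        for i in range(len(tokens) - 1):
            merged = vocab[tokens[i]] + vocab[tokens[i + 1]]
            j = lookup.get(merged)
            if j is not None and scores[j] > best_score:
                best_score, best_id, best_pos = scores[j], j, i
        if best_id == -1:
            return tokens  # no adjacent pair merges into a known token
        tokens = tokens[:best_pos] + [best_id] + tokens[best_pos + 2:]
```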

I found this problem because I was sorting the vocab and scores to do binary search and began getting `signal SIGSEGV: invalid address (fault address: 0x0)`. So I think there were problems; they were just too subtle to detect without looping over and reading all the scores.
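For reference, that optimization looks roughly like this (a hedged Python sketch; `build_sorted_vocab` and `lookup` are hypothetical names, and the reordering of `scores` alongside the vocab, which is what forced every score to be read and exposed the bad pointer, is omitted for brevity): sort `(token, id)` pairs once, then binary-search each lookup instead of scanning the whole vocab.

```python
import bisect

def build_sorted_vocab(vocab):
    # One-time sort of (token_bytes, token_id) pairs.
    return sorted((tok, i) for i, tok in enumerate(vocab))

def lookup(sorted_vocab, token):
    # Binary search; real ids are non-negative, so (token, -1)
    # sorts before every actual entry for that token.
    i = bisect.bisect_left(sorted_vocab, (token, -1))
    if i < len(sorted_vocab) and sorted_vocab[i][0] == token:
        return sorted_vocab[i][1]
    return -1  # not found
```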

tairov commented 12 months ago

That's a good catch!