likejazz / llama3.np

llama3.np is a pure NumPy implementation for Llama 3 model.
MIT License
958 stars, 73 forks

Omitted s' #3

Open WaitingOak opened 4 months ago

WaitingOak commented 4 months ago

I've only done a simple test, but if you remove line 65 of tokenizer.py entirely, the 's' at the end of "was" and "his" no longer gets stripped. This is the line: text = text.strip("<s>").strip("</s>")
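A likely explanation, sketched below in plain Python: str.strip() treats its argument as a set of characters rather than a literal substring, so strip("<s>") removes any leading or trailing '<', 's', or '>', which includes the final 's' of words like "was" and "his". The function names strip_chars and strip_markers are hypothetical and not part of llama3.np, and the alternative shown assumes Python 3.9+ for removeprefix/removesuffix. It is one possible approach, not necessarily the fix the repository should adopt.

```python
def strip_chars(text: str) -> str:
    # Same call pattern as the line quoted in the issue.
    # strip() removes any of the given characters from both ends,
    # so a trailing 's' that belongs to a word is removed too.
    return text.strip("<s>").strip("</s>")


def strip_markers(text: str, bos: str = "<s>", eos: str = "</s>") -> str:
    # Hypothetical alternative: remove the markers only when they
    # appear as whole substrings at the start or end (Python 3.9+).
    return text.removeprefix(bos).removesuffix(eos)


print(strip_chars("<s>his cat was</s>"))    # his cat wa
print(strip_chars("his"))                   # hi
print(strip_markers("<s>his cat was</s>"))  # his cat was
print(strip_markers("his"))                 # his
```

Deleting the line outright, as tested above, also avoids the problem, but it would leave any literal <s> or </s> markers in the output if the decoded text contains them.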