likejazz / llama3.np

llama3.np is a pure NumPy implementation of the Llama 3 model.
MIT License

A few suggestions #1

Closed · 99991 closed this 4 months ago

99991 commented 4 months ago

You can improve the performance of the tokenizer by using Python's heapq.heappush and heapq.heappop functions. For example, see the tokenizer from my simplified TinyLlama implementation: https://github.com/99991/SimpleTinyLlama/blob/9af6f7df6e12d8478a90d3cd5c8e8c1a95fce0fe/tokenizer.py#L96
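For reference, a heap-based greedy BPE merge could look like the minimal sketch below. The names `bpe_merge` and `ranks` (a dict mapping a merged pair to its merge priority, lower merges first) are illustrative, not taken from either repository:

```python
import heapq

def bpe_merge(tokens, ranks):
    # Greedy BPE: always merge the adjacent pair with the lowest rank first.
    # A heap makes each merge O(log n) instead of rescanning every pair.
    tokens = list(tokens)
    heap = []

    def push(i, j):
        pair = tokens[i] + tokens[j]
        if pair in ranks:
            heapq.heappush(heap, (ranks[pair], i, j, tokens[i], tokens[j]))

    for i in range(len(tokens) - 1):
        push(i, i + 1)

    while heap:
        rank, i, j, a, b = heapq.heappop(heap)
        if tokens[i] != a or tokens[j] != b:
            continue  # stale entry: one side was already merged away
        tokens[i] = a + b
        tokens[j] = None  # tombstone the absorbed token
        # Pair the merged token with its nearest live neighbors.
        left = i - 1
        while left >= 0 and tokens[left] is None:
            left -= 1
        if left >= 0:
            push(left, i)
        right = j + 1
        while right < len(tokens) and tokens[right] is None:
            right += 1
        if right < len(tokens):
            push(i, right)

    return [t for t in tokens if t is not None]

ranks = {"he": 0, "ll": 1, "hell": 2}
print(bpe_merge(list("hello"), ranks))  # ['hell', 'o']
```

Stale heap entries are skipped lazily when popped rather than removed eagerly, which keeps the bookkeeping simple while avoiding a full rescan of all pairs after every merge.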

Instead of silencing warnings, you could clip the magnitude of the inputs to the sigmoid function as in my NumPy CLIP implementation: https://github.com/99991/NumPyCLIP/blob/main/numpyclip.py#L113-L116
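As a rough illustration of that idea (the clipping bound of 30 is an arbitrary safe choice here, not necessarily the value NumPyCLIP uses):

```python
import numpy as np

def sigmoid(x):
    # Clip so np.exp cannot overflow; sigmoid already saturates to
    # 0 or 1 (to within float precision) well before |x| reaches 30.
    x = np.clip(x, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-x))
```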

I'd also recommend storing the weights somewhere else (for example, as a release or on HuggingFace) and downloading them on demand, because GitHub has a really low bandwidth quota.
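A download-on-demand hook could be as simple as the sketch below; the URL and filename are placeholders, not real release assets:

```python
import os
import urllib.request

# Placeholder values: point these at an actual release asset or
# HuggingFace file once the weights are hosted there.
MODEL_URL = "https://example.com/path/to/model.npz"
MODEL_PATH = "model.npz"

def ensure_weights():
    # Fetch the weights on first run instead of committing them to git,
    # so the repository stays small and bandwidth quota is not an issue.
    if not os.path.exists(MODEL_PATH):
        urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)
```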

I very much appreciate your shape annotations. They make understanding much easier. Great work!

likejazz commented 4 months ago

Thank you for your suggestions!

  1. This implementation is not focused on performance, but your suggestion of a heap structure is appreciated. I'll be sure to incorporate it in the future.
  2. I used set_printoptions() to disable scientific notation and make the output more readable. I only used it for debugging values, and since I no longer need it, I removed it (see the snippet after this list).
  3. I uploaded the model directly to GitHub for convenience. The file size was 85MB, so it wasn't too big. But in general, your suggestion is correct.
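For context, suppressing scientific notation in NumPy output is a one-liner; this is a generic illustration, not the exact line that was removed:

```python
import numpy as np

# Print plain decimals instead of scientific notation while debugging.
np.set_printoptions(suppress=True)
print(np.array([1e-8, 2.5]))  # [0.00000001 2.5       ]
```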

99991 commented 4 months ago
  1. Oh, you are right. I was mistaken and thought the code said something different. Never mind!
  2. I had a similarly large project in the past and GitHub would frequently block downloads whenever the quota of just 5 GB was exceeded. But I agree, it is more convenient this way.