99991 closed 4 months ago
Thank you for your suggestions!
The suggestion to use a `heap` structure is appreciated. I'll be sure to incorporate it in the future.
I used `set_printoptions()` to disable scientific notation and make the output more readable. It was only there to debug the values, and since I don't use it anymore, I removed it.
You can improve the performance of the tokenizer by using Python's `heappush` and `heappop` functions. For example, see this tokenizer for my simplified TinyLlama implementation: https://github.com/99991/SimpleTinyLlama/blob/9af6f7df6e12d8478a90d3cd5c8e8c1a95fce0fe/tokenizer.py#L96

Instead of silencing warnings, you could clip the magnitude of the inputs to the sigmoid function as in my NumPy CLIP implementation: https://github.com/99991/NumPyCLIP/blob/main/numpyclip.py#L113-L116
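
A rough sketch of both ideas, for illustration only. This is not the code from the linked repositories: `bpe_merge`, `ranks`, `stable_sigmoid` and the clipping bound `limit=50.0` are hypothetical names and values, and the merge rule (lowest rank first, leftmost on ties) is an assumption that may differ from what this tokenizer needs. The heap avoids rescanning the whole sequence for the best pair after every merge, and clipping keeps `np.exp` from overflowing so no warning has to be silenced.

```python
import heapq
import numpy as np

def bpe_merge(symbols, ranks):
    """Merge adjacent symbol pairs, lowest rank first, using a heap.

    `ranks` maps (left, right) pairs to a merge priority (assumed format).
    """
    symbols = list(symbols)
    n = len(symbols)
    prev = list(range(-1, n - 1))  # left neighbour index, -1 if none
    nxt = list(range(1, n + 1))    # right neighbour index, n if none
    alive = [True] * n
    heap = []

    def push(i):
        # Queue the pair starting at position i if it is a known merge.
        j = nxt[i]
        if j < n:
            pair = (symbols[i], symbols[j])
            if pair in ranks:
                heapq.heappush(heap, (ranks[pair], i, pair))

    for i in range(n - 1):
        push(i)

    while heap:
        _, i, pair = heapq.heappop(heap)
        j = nxt[i]
        # Lazy deletion: skip entries invalidated by earlier merges.
        if not alive[i] or j >= n or (symbols[i], symbols[j]) != pair:
            continue
        symbols[i] = pair[0] + pair[1]  # merge j into i
        alive[j] = False
        nxt[i] = nxt[j]
        if nxt[j] < n:
            prev[nxt[j]] = i
        if prev[i] >= 0:
            push(prev[i])  # new pair with the left neighbour
        push(i)            # new pair with the right neighbour

    return [s for s, keep in zip(symbols, alive) if keep]

def stable_sigmoid(x, limit=50.0):
    """Sigmoid with clipped inputs so np.exp cannot overflow."""
    x = np.clip(x, -limit, limit)
    return 1.0 / (1.0 + np.exp(-x))
```

Each merge costs O(log n) heap work instead of a full pass over the sequence. Stale heap entries are simply skipped when popped, which is simpler than removing them eagerly.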
I'd also recommend storing the weights somewhere else (for example as a release or on HuggingFace) and downloading them on demand, because GitHub has a really low bandwidth quota.
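
Downloading on demand could look roughly like the sketch below, which uses only the standard library. The helper name `ensure_weights`, the `.part` suffix and the URL in the usage comment are hypothetical; the real location depends on where the weights end up being hosted.

```python
import os
import urllib.request

def ensure_weights(url, path):
    """Download `url` to `path` unless the file is already cached locally."""
    if os.path.exists(path):
        return path
    os.makedirs(os.path.dirname(os.path.abspath(path)), exist_ok=True)
    # Download to a temporary name first so an interrupted download
    # never leaves a truncated file at the final path.
    tmp_path = path + ".part"
    urllib.request.urlretrieve(url, tmp_path)
    os.replace(tmp_path, path)
    return path

# Usage (hypothetical URL, replace with the real release asset):
# ensure_weights("https://example.com/weights.bin", "weights/weights.bin")
```

The weights are fetched once on first use and read from disk afterwards, so the repository itself stays small.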
I very much appreciate your shape annotations. They make understanding much easier. Great work!