ReaLLMASIC / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.
MIT License

golden gen for decoder #194

Closed Buck008 closed 1 month ago

Buck008 commented 2 months ago

I've used NumPy and referenced code from nanoGPT to write the decoder forward dataflow based on my understanding. Currently, all values are floating point. (All weights are randomly generated, and I haven't implemented dropout because I don't believe it will be executed in hardware.) I would greatly appreciate it if someone could check whether my understanding is correct. (Please excuse my limited Python skills.) If this is right, I will write the integer version, which will be the golden brick generator.
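For reference, a forward dataflow like the one described above can be sketched in NumPy roughly as follows. This is a minimal single-head, single-block sketch assuming a GPT-2-style pre-norm decoder (LayerNorm, causal attention, GELU MLP); the function names and the `params` dictionary layout are my own illustration, not the actual code from this issue or the repo:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gelu(x):
    # tanh approximation of GELU, as used in GPT-2-style MLPs
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def layernorm(x, g, b, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return g * (x - mu) / np.sqrt(var + eps) + b

def causal_self_attention(x, Wq, Wk, Wv, Wo):
    # x: (T, C); single head for clarity, no dropout (matching the hardware model)
    T, C = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    att = (q @ k.T) / np.sqrt(C)
    # mask out future positions so token i only attends to j <= i
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    att = np.where(future, -np.inf, att)
    att = softmax(att, axis=-1)
    return (att @ v) @ Wo

def decoder_block(x, p):
    # pre-norm residual block: x + attn(ln(x)), then x + mlp(ln(x))
    x = x + causal_self_attention(layernorm(x, p["g1"], p["b1"]),
                                  p["Wq"], p["Wk"], p["Wv"], p["Wo"])
    h = layernorm(x, p["g2"], p["b2"])
    x = x + gelu(h @ p["W1"]) @ p["W2"]
    return x

# usage with randomly generated weights, as in the floating-point golden model
rng = np.random.default_rng(0)
T, C, H = 4, 8, 32
params = {k: rng.standard_normal((C, C)) * 0.1 for k in ["Wq", "Wk", "Wv", "Wo"]}
params.update(g1=np.ones(C), b1=np.zeros(C), g2=np.ones(C), b2=np.zeros(C),
              W1=rng.standard_normal((C, H)) * 0.1,
              W2=rng.standard_normal((H, C)) * 0.1)
x = rng.standard_normal((T, C))
y = decoder_block(x, params)
```

One easy sanity check on such a golden model is causality: perturbing a later token must not change the outputs at earlier positions.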

Buck008 commented 2 months ago

Sure, I will do all of the above. I am not yet familiar with the embedding in nanoGPT, so I may need time to look into it. Also, the embedding layer is not currently supported in hardware.

Buck008 commented 1 month ago

Sorry, I don't understand how the embedding layer works in the network, so I just created the two matrices. And for the output layer, do you mean a LayerNorm or RMSNorm? For hardware, we are only going to support attention and the MLP. Sorry for the late update; too many things happened this week. : )

gkielian commented 1 month ago

> Sure, I will do all of the above. I am not yet familiar with the embedding in nanoGPT, so I may need time to look into it. Also, the embedding layer is not currently supported in hardware.

Good to know, I think we're closing in on a solution for the embedding layer for hardware too