warner-benjamin closed 1 week ago
Add `get_number_parameters`, which by default follows Karpathy's nanoGPT: it counts the token embeddings but excludes non-trainable parameters, absolute positional embeddings, and the MLM head.
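A minimal sketch of the counting rule described above, not the code from this PR. The function name `count_parameters`, its arguments, and the name substrings `position_embeddings` and `mlm_head` are assumptions made for illustration; the real module names depend on the model. It relies only on `named_parameters()`, `requires_grad`, and `numel()`, so it should also accept a `torch.nn.Module`, but the stand-in classes below keep the example runnable without PyTorch.

```python
from dataclasses import dataclass
from typing import Iterable


def count_parameters(
    model,
    exclude_substrings: Iterable[str] = ("position_embeddings", "mlm_head"),  # hypothetical names
    trainable_only: bool = True,
) -> int:
    """Sum parameter sizes, skipping frozen parameters and excluded modules."""
    excluded = tuple(exclude_substrings)
    total = 0
    for name, param in model.named_parameters():
        # Skip parameters that are not updated during training.
        if trainable_only and not param.requires_grad:
            continue
        # Skip parameters whose qualified name matches an excluded module.
        if any(substring in name for substring in excluded):
            continue
        total += param.numel()
    return total


@dataclass
class FakeParam:
    """Stand-in for a tensor parameter: only a size and a trainable flag."""
    size: int
    requires_grad: bool = True

    def numel(self) -> int:
        return self.size


class FakeModel:
    """Stand-in model with made-up parameter names and sizes."""

    def named_parameters(self):
        yield "embeddings.token_embeddings.weight", FakeParam(1000)
        yield "embeddings.position_embeddings.weight", FakeParam(200)
        yield "encoder.layer0.weight", FakeParam(50)
        yield "encoder.layer0.frozen_weight", FakeParam(7, requires_grad=False)
        yield "mlm_head.decoder.bias", FakeParam(30)


if __name__ == "__main__":
    # Default rule: token embeddings and trainable encoder weights only.
    print(count_parameters(FakeModel()))
    # Count everything, including frozen and excluded parameters.
    print(count_parameters(FakeModel(), exclude_substrings=(), trainable_only=False))
```

Matching on name substrings is only one way to express the exclusions; a real implementation could equally filter by module type.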