Gemma 7B model configuration

Hello,

I am inquiring about the model configuration outlined in your technical report.

Table 1 of the Gemma technical report lists 'd_model' as 3072 for the 7B model.

I understand 'd_model' to be the hidden size, which I expected to equal 'Num heads' × 'Head size'. For the 7B model, however, that product is 4096, while 'd_model' is listed as 3072. Could you clarify what 'd_model' means and what the correct hidden size is for the Gemma 7B model?
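
To make the mismatch concrete, here is the arithmetic as a small sketch. The numbers are how I read the 7B column of Table 1 (16 heads, head size 256), and the variable names are my own labels, not identifiers from the Gemma codebase.

```python
# Values as I read them from Table 1 (7B column) of the Gemma technical report.
# Variable names are illustrative only, not taken from the Gemma code.
d_model = 3072    # listed as 'd_model'
num_heads = 16    # listed as 'Num heads'
head_size = 256   # listed as 'Head size'

# Total width of the attention heads, which I expected to match d_model.
attn_width = num_heads * head_size

print(attn_width)              # 4096
print(attn_width == d_model)   # False
```

If the two values are intentionally different, it would help to know which one corresponds to the hidden size.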

Thank you.
