skeskinen / bert.cpp

ggml implementation of BERT
MIT License

About the calculation of overhead. #19

Open znsoftm opened 1 year ago

znsoftm commented 1 year ago

https://github.com/ggerganov/ggml/issues/356

znsoftm commented 1 year ago

For the BERT model, its overhead is calculated as:

model_mem_req += (5 + 16 * n_layer) * 256; // object overhead

Can anyone explain the meaning? Is 5 the number of extra tensors, does 16 mean each layer has 16 tensors, and what is the 256 for?

Is it the size of the ggml_tensor struct? Its actual size is 208 bytes, so is 256 a rounded-up value?
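
For reference, a minimal sketch of what that line seems to be budgeting, assuming the usual BERT layout (5 model-wide tensors such as the embedding tables and the embedding LayerNorm, 16 tensors per encoder layer) and treating 256 as a per-tensor bookkeeping allowance. The names here are illustrative, not the actual bert.cpp code:

```c
// Hypothetical sketch of the overhead estimate; not the exact bert.cpp code.
// Assumption: 5 tensors belong to the model as a whole (three embedding tables
// plus the embedding LayerNorm weight and bias) and each encoder layer owns 16
// tensors (attention Q/K/V/output weights and biases, two LayerNorms with
// biases, and the two feed-forward matrices with biases).
#include <stdio.h>
#include <stddef.h>

#define OBJECT_OVERHEAD_PER_TENSOR 256  // per-tensor bookkeeping budget, in bytes

size_t estimate_object_overhead(int n_layer) {
    const int n_model_tensors     = 5;
    const int n_tensors_per_layer = 16;
    return (size_t)(n_model_tensors + n_tensors_per_layer * n_layer)
           * OBJECT_OVERHEAD_PER_TENSOR;
}

int main(void) {
    // e.g. bert-base has 12 encoder layers
    printf("object overhead for 12 layers: %zu bytes\n", estimate_object_overhead(12));
    return 0;
}
```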

skeskinen commented 1 year ago

My memory is a little hazy on this subject. Like you said, 5 should be the extra model-wide tensors not tied to any layer. I think I tried a smaller number than 256 for the size, but it crashed with OOM. Probably the real size of C structs is always rounded up to the next power of 2?
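
One way to check where the gap between 208 and 256 comes from, assuming a ggml revision new enough to expose ggml_tensor_overhead() (as far as I know it returns the ggml_object header size plus the tensor struct size):

```c
// Quick check of the per-tensor bookkeeping cost in a ggml context.
// ggml_tensor_overhead() should cover the object header plus the tensor struct;
// the context also pads each allocation for alignment, so the hard-coded
// constant needs some slack on top of a bare sizeof().
#include <stdio.h>
#include "ggml.h"

int main(void) {
    printf("sizeof(struct ggml_tensor) = %zu bytes\n", sizeof(struct ggml_tensor));
    printf("ggml_tensor_overhead()     = %zu bytes\n", ggml_tensor_overhead());
    return 0;
}
```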

znsoftm commented 1 year ago

Thanks for your answer :)

znsoftm commented 1 year ago

I have tested with the latest ggml; the 256 should be changed to 512. I do not understand why :(
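
If the constant keeps drifting as struct ggml_tensor grows between ggml versions, one option is to derive it at runtime rather than hard-coding 256 or 512. A sketch, again assuming ggml_tensor_overhead() is available; the function name bert_object_overhead is made up for illustration:

```c
// Sketch: size the context from ggml's own reported per-tensor overhead instead
// of a hard-coded constant, so the estimate tracks struct-size changes.
#include <stddef.h>
#include "ggml.h"

size_t bert_object_overhead(int n_layer) {
    const size_t n_tensors = 5 + 16 * (size_t)n_layer;  // model-wide + per-layer tensors
    return n_tensors * ggml_tensor_overhead();
}
```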