Open znsoftm opened 1 year ago
For the BERT model, the overhead is calculated as:
model_mem_req += (5 + 16 * n_layer) * 256; // object overhead
Can anyone explain what these numbers mean? I take it that 5 is the number of extra tensors and 16 is the number of tensors per layer, but what is 256 for?
Is it sizeof(struct ggml_tensor)? The actual size is 208 bytes, so is 256 just a rounded-up size?
My memory is a little hazy on this subject. Like you said, 5 should be the extra model-wide tensors that are not tied to any layer. I think I tried a number smaller than 256 for the size, but it crashed with an out-of-memory error. Maybe the real size of a C struct is always rounded up to the next power of 2?
Thanks for your answer :)
I have tested the latest ggml, and the 256 has to be changed to 512. I do not understand why :(
https://github.com/ggerganov/ggml/issues/356