Open hpcpony opened 5 months ago
Hi @hpcpony, sorry for the confusion. The model_scratch is a kind of kernel workspace that is used during the model eval process. The values are rough estimates. As for how to set them when adding a new model, my experience is to look for a reference model with similar parameters. If you use our Python API, these memory sizes are automatically enlarged when a larger bs or ctx_size is encountered. cc @Zhenzhong1 and @a32543254.
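To make the advice above a bit more concrete, here is a minimal sketch of the kind of per-model scratch table being described. It assumes a llama.cpp-style layout; the struct name, field names, and map below are illustrative only, not the actual definitions in the repo (the 512/512/1024 MB numbers are the ones quoted in the question below).

```cpp
// Illustrative sketch only: a per-model memory-requirement table.
// The type and entry names are assumptions, not the repo's real definitions.
#include <cstddef>
#include <map>
#include <string>

constexpr size_t MB = 1024 * 1024;

struct model_scratch_req {
  size_t scratch0;  // kernel workspace used during model eval
  size_t scratch1;  // second workspace buffer
  size_t eval;      // buffer for the eval graph itself
};

// Values are rough estimates. When adding a new model, a practical starting
// point is to copy the entry of a reference model with a similar parameter
// count, then enlarge if eval runs out of workspace.
static const std::map<std::string, model_scratch_req> mem_req_sketch = {
    {"gptneox-20b", {512 * MB, 512 * MB, 1024 * MB}},
    // hypothetical new model of comparable size: start from the values above
    {"my-new-20b-model", {512 * MB, 512 * MB, 1024 * MB}},
};
```

Note that the Python API path mentioned above would still grow these buffers automatically for larger bs/ctx_size; the table values only need to be a sensible baseline.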
In the example for adding to gptneox_mem_req I see that n_layers comes from num_hidden_layers in the config.json file, but where do the 512, 512, and 1024 come from? Maybe a comment in the document would help.
I was looking to extend the existing bloom capability to handle https://huggingface.co/bigscience/bloom, but it's not obvious to me how to choose the right scratch sizes from the config.json.
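For the bloom case specifically, following the advice above might look roughly like the sketch below, reusing the illustrative model_scratch_req type and MB constant from the earlier snippet. The entry name and the scratch sizes are placeholder guesses, not tested values.

```cpp
// Hypothetical continuation of the sketch above. n_layers would come from the
// model's config.json (n_layer is 70 for bigscience/bloom); the scratch sizes
// here are simply scaled up from a smaller reference entry and would need to
// be enlarged if eval overflows the workspace.
static const std::map<std::string, model_scratch_req> bloom_mem_req_sketch = {
    // placeholder entry for the 176B checkpoint: start from the largest
    // existing bloom entry and grow the values until eval no longer fails
    {"bloom-176b", {2048 * MB, 2048 * MB, 4096 * MB}},
};
```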