pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

memory issue during export_llama? #3480

Closed antmikinka closed 1 week ago

antmikinka commented 2 weeks ago

I was following the Llama pages for this repo. I have an 8 GB MacBook, so I don't know if that's the issue. My RAM usage did not skyrocket and it never said "ran out of RAM", so I don't think it's a RAM issue.

script to reproduce:

python -m examples.models.llama2.export_llama -kv --coreml -c stories110M.pt -p params.json

Yes, I built and ran the Core ML frameworks and dependencies on 2.0 rc5.

Running MIL default pipeline: 100%|██████████████████████████████████████████████| 78/78 [00:07<00:00, 10.37 passes/s]
Running MIL backend_mlprogram pipeline: 100%|████████████████████████████████████| 12/12 [00:00<00:00, 54.97 passes/s]
/opt/anaconda3/envs/executorch/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py:1316: UserWarning: Mutation on a buffer in the model is detected. ExecuTorch assumes buffers that are mutated in the graph have a meaningless initial state, only the shape and dtype will be serialized.
  warnings.warn(
INFO:root:Required memory for activation in bytes: [0, 19002368]
INFO:root:Saved exported program to ./coreml_llama2.pte
cccclai commented 2 weeks ago

The log seems expected - is there any log that looks confusing?

antmikinka commented 2 weeks ago

@cccclai The only thing that was confusing was the line "Required memory for activation in bytes: [0, 19002368]". I wasn't sure whether the ./coreml_llama2.pte file was complete or not.

cccclai commented 2 weeks ago

Oh, that export completed. "Required memory for activation in bytes: [0, 19002368]" means that, in addition to the model's weights, we need 19,002,368 extra bytes for the activations when we run the model on device.
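To put that number in perspective, here is a minimal sketch (not part of ExecuTorch's tooling) that converts the activation figure from the log into mebibytes. I'm assuming the bracketed list reports one value per planned memory arena, with the nonzero entry being the activation buffer:

```python
# Convert the activation-memory figure from the export log into MiB.
# Assumption: the log's [0, 19002368] list is per memory arena, and the
# second entry is the activation buffer size in bytes.
activation_bytes = 19002368

mib = activation_bytes / (1024 ** 2)
print(f"{activation_bytes} bytes ≈ {mib:.1f} MiB")  # ≈ 18.1 MiB
```

So on top of the weights, the runtime needs roughly 18 MiB of working memory for activations, which is easily within reach of an 8 GB machine.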

DawerG commented 2 weeks ago

@antmikinka Is the issue resolved? If not, can you please summarize what else is needed? Thanks.

larryliu0820 commented 1 week ago

I think @antmikinka was able to finish exporting; if not, please file another issue. Closing.