pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch
https://pytorch.org/executorch/

[Segmentation fault] python3 torchchat.py export stories15M --dtype fp32 --quantize '{"embedding": {"bitwidth": 4, "groupsize":32}, "linear:a8w4dq": {"groupsize" : 256}}' --output-pte-path stories15M.pte #3588

Closed mikekgfb closed 4 weeks ago

mikekgfb commented 1 month ago

https://github.com/pytorch/torchchat/actions/runs/9047866134/job/24860312456?pr=751

This is a launch blocker for torchchat because it causes a failure for users who follow the example commands in our docs.

  + python3 torchchat.py export stories15M --dtype fp32 --quantize '{"embedding": {"bitwidth": 4, "groupsize":32}, "linear:a8w4dq": {"groupsize" : 256}}' --output-pte-path stories15M.pte
  /opt/homebrew/Caskroom/miniconda/base/envs/test-quantization-mps-macos/lib/python3.10/site-packages/executorch/exir/emit/_emitter.py:1474: UserWarning: Mutation on a buffer in the model is detected. ExecuTorch assumes buffers that are mutated in the graph have a meaningless initial state, only the shape and dtype will be serialized.
    warnings.warn(
  Using device=cpu
  Loading model...
  Time to load model: 0.01 seconds
  Quantizing the model with: {'embedding': {'bitwidth': 4, 'groupsize': 32}, 'linear:a8w4dq': {'groupsize': 256}}
  Time to quantize model: 7.83 seconds
  Exporting model using ExecuTorch to /Users/ec2-user/runner/_work/torchchat/torchchat/pytorch/torchchat/stories15M.pte
  The methods are:  {'forward'}
  + python3 generate.py stories15M --pte-path stories15M.pte --prompt 'Hello my name is'
  [program.cpp:130] InternalConsistency verification requested but not available
  [method.cpp:939] Overriding output data pointer allocated by memory plan is not allowed.
  ./run-quantization.sh: line 27: 18269 Segmentation fault: 11  python3 generate.py stories15M --pte-path stories15M.pte --prompt "Hello my name is"
  Error: Process completed with exit code 1.
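For reference, the failing sequence from the CI log above reduces to the two commands below (a minimal repro sketch; it drops the run-quantization.sh wrapper and CI paths, and assumes the stories15M checkpoint is already downloaded):

  # Export stories15M with 4-bit embeddings and a8w4dq linear quantization to a .pte file.
  python3 torchchat.py export stories15M --dtype fp32 \
    --quantize '{"embedding": {"bitwidth": 4, "groupsize": 32}, "linear:a8w4dq": {"groupsize": 256}}' \
    --output-pte-path stories15M.pte

  # Run the exported program; this step segfaults (signal 11).
  python3 generate.py stories15M --pte-path stories15M.pte --prompt "Hello my name is"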
mikekgfb commented 1 month ago

Also https://github.com/pytorch/torchchat/actions/runs/9054732211/job/24874908070?pr=768

mergennachin commented 1 month ago

Thanks for reporting.

@mikekgfb I tried reproducing locally, but can't so far. Is it consistently reproducible for you, or did it happen randomly?

mikekgfb commented 1 month ago

Consistently reproducible both in CI and locally.

mcr229 commented 1 month ago

I wonder if this is caused by the CI flow exporting to the same file name, so that multiple jobs writing the same .pte file collide and the file ends up corrupted, which then segfaults when the model is run.

Do you mind sharing the model artifact that causes the segfault? That would help jumpstart debugging this.
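If the collision theory is right, one quick way to rule it out would be to give each CI job its own output path. A sketch, assuming a GitHub Actions environment (GITHUB_RUN_ID and GITHUB_JOB are standard Actions variables; the PTE_OUT name is made up here):

  # Use a per-job .pte path so concurrent jobs cannot clobber each other's artifact.
  PTE_OUT="stories15M-${GITHUB_RUN_ID:-local}-${GITHUB_JOB:-manual}.pte"

  python3 torchchat.py export stories15M --dtype fp32 \
    --quantize '{"embedding": {"bitwidth": 4, "groupsize": 32}, "linear:a8w4dq": {"groupsize": 256}}' \
    --output-pte-path "$PTE_OUT"

  python3 generate.py stories15M --pte-path "$PTE_OUT" --prompt "Hello my name is"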

mikekgfb commented 4 weeks ago

  I wonder if this is caused by the CI flow exporting to the same file name, so that multiple jobs writing the same .pte file collide and the file ends up corrupted, which then segfaults when the model is run.

  Do you mind sharing the model artifact that causes the segfault? That would help jumpstart debugging this.

I don't think we use multithreading? That said, this works now.