Needs optimization for MPS

AetherCortex / Llama-X

Open Academic Research on Improving LLaMA to SOTA LLM

Apache License 2.0


Open yxKryptonite opened 1 year ago

yxKryptonite commented 1 year ago

Reproduction info:

Inference is very slow on a macOS machine using the MPS backend, and this needs further optimization.
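To put a number on "very slow", a small timing harness along these lines might help. This is only a sketch, not the reporter's actual reproduction: `pick_device` and `tokens_per_second` are hypothetical helper names, and `generate_fn` stands in for whatever call runs the model's generation loop. The device check assumes a PyTorch build that exposes `torch.backends.mps`, and falls back to `"cpu"` otherwise.

```python
import time


def pick_device():
    """Return "mps" if PyTorch reports the Metal backend as available, else "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is no MPS backend to use
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    return "mps" if mps is not None and mps.is_available() else "cpu"


def tokens_per_second(generate_fn, n_tokens, repeats=3):
    """Call generate_fn(n_tokens) `repeats` times and return the best throughput.

    generate_fn is a hypothetical callable supplied by the caller; it should
    block until all n_tokens have actually been produced.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        generate_fn(n_tokens)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero elapsed time on very coarse clocks.
    return n_tokens / max(best, 1e-9)
```

Running the same `generate_fn` once with the model on `"cpu"` and once on `"mps"` would show whether MPS is slower than expected or just slower than CUDA. Because GPU work is queued asynchronously, `generate_fn` should move its output back to the CPU (or synchronize the device) before returning, otherwise the measured time may be too short.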