Gadersd / llama2-burn

Llama2 LLM ported to Rust burn
MIT License

Hardware requirements? #1

Open 9876691 opened 1 year ago

9876691 commented 1 year ago

Could we list the hardware requirements or the hardware it's been tested on at the moment?

I can see it's using the WGPU backend; do you know roughly how much VRAM is needed?

Thanks.
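For a rough sense of scale (a back-of-the-envelope sketch, not a documented requirement of this repo): weight memory is roughly parameter count times bytes per element, before counting activations and the KV cache. The 7B parameter count and dtype sizes below are illustrative assumptions, not measurements from llama2-burn.

```rust
// Back-of-the-envelope VRAM estimate for the weights alone.
// Parameter count and dtype sizes are illustrative assumptions.
fn approx_weight_bytes(n_params: u64, bytes_per_param: u64) -> u64 {
    n_params * bytes_per_param
}

fn main() {
    let n_params: u64 = 7_000_000_000; // hypothetical 7B-parameter model
    let f32_gb = approx_weight_bytes(n_params, 4) as f64 / 1e9;
    let f16_gb = approx_weight_bytes(n_params, 2) as f64 / 1e9;
    println!("~{f32_gb:.0} GB at f32, ~{f16_gb:.0} GB at f16 (weights only)");
}
```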

overheat commented 1 year ago

I have the same question. Also, does it work on Apple Silicon (M1/M2)?

nickgarfield commented 1 year ago

I'm currently getting `No CPU device found` on M2 when trying to run the convert binary. The issue seems related to the latest commit (c512786); reverting to 427911f worked for me.

Update: Never mind, I ran into a buffer-size issue:

thread 'main' panicked at 'wgpu error: Validation Error

Caused by:
    In Device::create_buffer
      note: label = `Buffer Src`
    Buffer size 524288000 is greater than the maximum buffer size (268435456)

', /Users/nickgarfield/.cargo/registry/src/index.crates.io-6f17d22bba15001f/wgpu-0.17.0/src/backend/direct.rs:3056:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Gadersd commented 1 year ago

> I'm currently getting `No CPU device found` on M2 when trying to run the convert binary.

I'll check with the burn team about this.

Once the burn caching functions are public, I'll try benchmarking it.

Gadersd commented 1 year ago

burn-wgpu doesn't use the full device memory, so overflows happen with some large models, but hopefully I'll have it working within a few days.
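For context on the panic above: wgpu's default `Limits` cap `max_buffer_size` at 268435456 bytes (256 MiB), which is why the 524288000-byte buffer is rejected. The sketch below shows how an application that creates its own wgpu device could request the adapter's actual limits instead of the defaults; burn-wgpu manages device creation internally, so this is illustrative only, not a fix within this repo.

```rust
// Illustrative only: requesting the adapter's real limits instead of
// wgpu's conservative defaults. burn-wgpu sets up its device itself,
// so this is not how llama2-burn is configured.
async fn create_device() -> (wgpu::Device, wgpu::Queue) {
    let instance = wgpu::Instance::default();
    let adapter = instance
        .request_adapter(&wgpu::RequestAdapterOptions::default())
        .await
        .expect("no compatible adapter found");

    // Start from the defaults but lift the buffer caps to whatever the
    // adapter actually supports (typically well above 256 MiB on a GPU).
    let supported = adapter.limits();
    let limits = wgpu::Limits {
        max_buffer_size: supported.max_buffer_size,
        max_storage_buffer_binding_size: supported.max_storage_buffer_binding_size,
        ..wgpu::Limits::default()
    };

    adapter
        .request_device(
            &wgpu::DeviceDescriptor {
                label: Some("Device with raised buffer limits"),
                features: wgpu::Features::empty(),
                limits,
            },
            None,
        )
        .await
        .expect("failed to create device")
}
```

In a plain binary this could be driven with something like `pollster::block_on(create_device())`; whether the adapter's reported limits are large enough for a given model still depends on the hardware.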