-
### Description of bug
We migrated our Jenkins server to an r7i.2xlarge instance (8 vCPU, 64 GiB, DDR5 memory). One of our Jenkins jobs includes building spades from source. Here is an excerpt of th…
-
install error:
```
Cloning https://github.com/huggingface/accelerate.git (to revision test-clear-memory-cpu-offload) to /home/volker/pinokio/cache/TMPDIR/pip-req-build-as5l700j
Running comman…
-
I am a new user of LLNL/LEAP and have run into a speed issue. I use it for cone-beam CT reconstruction from 2048 projections with image size 3072 * 3072; the output volume size is 3072 * 3072 * 307…
-
**Describe the bug**
AWQ model export for CPU is not supported.
**To Reproduce**
python3 builder.py -i awq_model_dir -o output_folder -p int4 -e cpu -c cache_dir
**Screenshots**
![屏幕截图 2024-08…
-
(python3-venv) aarch64_sh ~> cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1 --model=dlrm_v2-99 --implementation=reference --framework=pytorch --category=datacenter…
-
CI: https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/4081#01918cdc-edef-40fc-9a36-c1ec173e5a63
Platform: multiple
Logs:
```
ERROR: Traceback (most recent call last):
Er…
-
### 🐛 Describe the bug
When compiling flex_attention, any backward operation crashes with the error `BypassFxGraphCache: Can't cache HigherOrderOperators.`.
Without compile it's fine, but slow. I …
-
In opensora/serve/gradio_web_server.py, the following is referenced:
`text_encoder = MT5EncoderModel.from_pretrained("/storage/ongoing/new/Open-Sora-Plan/cache_dir/mt5-xxl", cache_dir=args.cache_dir,
…
-
Hello, I don't know whether I should post this issue here or on the LiteX repository. After running the following command, I get this terminal output:
```
python3 -m litex_boards.targets.radiona_ulx3s --…
-
### What is the issue?
I offloaded 47 out of 127 layers of Llama 3.1 405b q2 on an M3 Max with 64GB of RAM.
When I run inference, memory usage shows only about 8GB, while the cached memory…