-
I've written a function to pack a 2D int4 tensor into an `int32` tensor. I want to use `torch.compile` to speed it up, but the compilation process takes over 200 seconds.
It's not a bug, but I wonder …
-
### System Info
- `transformers` version: 4.44.2
- Platform: Linux-3.10.0-1160.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.10.14
- Huggingface_hub version: 0.24.3
- Safetensors version: …
-
If one simply runs the following for the first time on a fresh server:
```sh
python3 -m venv --upgrade-deps venv
source venv/bin/activate
pip cache remove llama_cpp_python
pip install instructlab -C cmake.…
```
-
### Your current environment
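On a machine with two A800 GPUs (80 GB of memory each), I start two qwen-14B models, one model per GPU. The first model starts normally, but not the second. The vllm version is 0.3.3.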
### 🐛 Describe the bug
WARNING 03-29 18:28:18 tokenizer.py:64] Using a slow tokeni…
-
### 🐛 Describe the bug
If Torch 2.1.0 is used as a dependency with [Bazel](https://bazel.build/) and [rules_python](https://rules-python.readthedocs.io/), `_preload_cuda_deps` fails with `OSError: …
-
Hi, I do not use your scripts, but we are seeing users in the Looking Glass Discord who are having latency-related issues due to how your script assigns CPUs to the VM.
The issue is that yo…
-
### Your current environment
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubunt…
-
We currently try to open /sys/devices/system/cpu/cpuX/cache/indexY/shared_cpu_map for every PU and Y between 0 and 9. That's usually 6 useless syscalls per PU since most CPUs have 4 caches per PU. Tha…
-
Running python main_debug.py
produces: Using cache found in C:\Users\13939/.cache\torch\hub\ultralytics_yolov5_master
YOLOv5 2023-11-14 Python-3.8.18 torch-2.1.0+cpu CPU
Fusing layers...
YOLOv5l6 summary: 47…
-
nerfstudio 1.1.2, gsplat 1.0.0
ns-train splatfacto ... --downscale-factor 1...
The dataset has 1116 4K (3840x2160) images.
Traceback (most recent call last):
File "/home/ubuntu/.local/bin/…