-
When run example.py, hit
RuntimeError: CUDA error: no kernel image is available for execution on the device (at /home/ubuntu/nunchaku/src/kernels/awq/gemv_awq.cu:311)
I'm on Lambda A100 GPU insta…
-
### 🐛 Describe the bug
I'm working on a MPS implementation of torchvision.ops.deform_conv2d and running into problems when testing the op with
```
pytest test/test_ops.py::TestDeformConv -vv -s …
-
Hello! Thank you for your work!
I have conducted a version of fine tuning training on databricks-dolly-15k according to your script Settings. When I tried to evaluate mmlu on opencompass, I found tha…
-
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
-
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
-
作者您好,我尝试使用单张4090 24GB在s3dis上进行训练,使用的是semseg-pt-v3m1-1-rpe.py,因为显存问题无法正常训练,所以我设置batch_size = 1以及GridSample中验证和测试的grid_size设置为0.04。
当我训练到
[2024-11-19 17:03:06,782 INFO misc.py line 118 6912] Train…
-
### Python -VV
```shell
Python 3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0]
```
### Pip Freeze
conda env export
```shell
name: test
channels:
- defaults
dependencies:
-…
-
How to use multi-gpu quantization
when i operate :
export MODEL_NAME_OR_PATH="/llama2_70B"
export OUTPUT_DIR="/llama2_70B_quantize"
export QUANTIZATION_SCHEME="fp8"
export DEVICE="cuda:0,1,2,3,4"…
-
I'm currently working on a CUDA project, and want clang-format to break after CUDA-specific attributes like `__device__`, `__global__`, etc. I attempted the following configuration, but it doesn't see…
-
### System Info
- GPU: NVIDIA H100 80G
- TensorRT-LLM branch main
- TensorRT-LLM commit: 535c9cc6730f5ac999e4b1cb621402b58138f819
### Who can help?
@byshiue @Superjomn
### Information
- [x] The…