-
## Description
When I try to build pytorch_quantization from source following readme.md, a build error occurs in tensor_quant_gpu.cu. I think it may be a version problem.
## Environment
**T…
-
**Which service is this feature request for?**
Red Hat OpenShift Service on AWS
https://aws.amazon.com/about-aws/whats-new/2024/04/general-availability-amazon-ec2-g6-instances/
https://aws.amazon.…
-
**What is your question?**
I am currently running [PySpark data processing jobs](https://github.com/KubedAI/spark-rapids-on-kubernetes/blob/main/examples/benchmarks/benchmark/spark-rapids-benchmarks.…
-
## Environment
Running in a Docker instance with a GitHub checkout of streaming "3a6a5490678a2efa028ed96ba9b8813fba8687eb"
- OS: [Ubuntu 20.04]
- Hardware (GPU, or instance type): [A100]
## To repro…
-
GPU instance types tend to require additional software to take advantage of the GPUs. Much like the GPU AMIs available for AL2, we want them for AL2022.
-
I am aiming to develop non-gaming 3D visualization applications and industrial simulation tools, leveraging WPF and other UI frameworks. Currently, the open-source software options are quite limited (…
wdc63 updated
12 months ago
-
**Is your feature request related to a problem? Please describe.**
CUDA Heterogeneous Memory Management (HMM) enables GPU code to access all data allocated by a process.
That is, users no longer …
-
### 🚀 The feature, motivation and pitch
It is common for people to want to deploy multiple vLLM instances on a single machine because the machine has several GPUs (commonly 8 GPUs). …
-
### Your current environment
python: 3.8
cuda: 11.8
vllm: 0.5.5+cu118
### Model Input Dumps
_No response_
### 🐛 Describe the bug
My LLM is Qwen2 1.5B, so I want to initialize multiple wor…
-
Hi, I have a glb file utilizing the EXT_mesh_gpu_instancing and EXT_instance_features extensions
See
https://github.com/KhronosGroup/glTF/blob/main/extensions/2.0/Vendor/EXT_mesh_gpu_instancing/R…