cuda-runtime-api Search Results

1000+ results for cuda-runtime-api

pytorch/pytorch #32167

Support in-place pinning of memory

## 🚀 Feature Provide a `pin_memory_` method on tensors (note the trailing underscore) which operates in-place. ## Motivation Pinning memory using the current `pin_memory` method creates a copy of…

lw updated 3 months ago
4
SciSharp/LLamaSharp #785

[BUG]: Cannot load the backend on MACOS

### Description When running examples on my MAC notebook, it got the correct path of the native library but failed to load it. Here's the output of `uname -a` of my notebook. ``` Darwin U-0R7T…

AsakusaRinne updated 4 months ago
2
microsoft/onnxruntime #19502

Not all CUDA operators support bfloat16 that should

### Describe the issue Several operators that should support bfloat16 do not do so with the CUDA execution provider. This was noticed with `ReduceMean`, but here is the complete list: `Abs`, `ArgM…

borg323 updated 7 months ago
5
pmh47/dirt #118

Unable to compile

When I run `cmake ../crsc` Heres the output ; ``` CMake Error at /home/zen/anaconda3/envs/tf/share/cmake-3.26/Modules/CMakeDetermineCompilerId.cmake:751 (message): Compiling the CUDA compil…

BukuBukuChagma updated 7 months ago
1
sail-sg/zero-bubble-pipeline-parallelism #18

[QUESTION] May I ask what tool was used to plot Figure 6 in …

**Your question** How can I profile bubble time in pipeline parallelism?

starstream updated 2 months ago
3
facebookresearch/faiss #656

cudaStream_t arguments in Python client

The Python SWIG client exposes the setDefaultStream() function on the GPU resources object but it does not seem to provide any type conversion options to pass in a Python equivalent to the C++ cudastr…

cjnolet updated 4 months ago
4
microsoft/onnxruntime #20297

[Performance] ScatterND / GridSample operators are on CPU in…

### Describe the issue We exported the Huggingface transformer model [OneFormer](https://huggingface.co/docs/transformers/model_doc/oneformer) into onnx. Opset 20 failed with the error: ``` O…

tikr7 updated 5 months ago
4
exo-explore/exo #155

Having trouble running on ubuntu linux with 4090 (cuda 12.2)

Tried tensorflow and torch with tinygrad still getting this error with llama 8b 3.1 and llama 8b as well. Apparently this is an opencl compile error for bfloat16 data type Sorry, I am not a kernel …

ctisme updated 2 weeks ago
6
vllm-project/vllm #8852

[Installation]: Meet bugs when installing from source

### Your current environment ```text Collecting environment information... PyTorch version: 2.4.0 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A OS: Ub…

htlou updated 1 month ago
2
InternLM/lmdeploy #2624

[Bug] GPU-CPU transfer time is too long during the image encoding stage when running inference with the InternVL model

### Checklist - [x] 1. I have searched related issues but cannot get the expected help. - [x] 2. The bug has not been fixed in the latest version. - [x] 3. Please note that if the bug-related iss…

Dimensionzw updated 3 weeks ago
2
