-
**Description**
Implementing BLS in the Python backend to send an in-flight inference request to another model using c_python_backend_utils.InferenceRequest() and passing in a list of c_python_backend_uti…
-
### Version
nvidia-dali-cuda120==1.40.0, jax==0.4.31
### Describe the bug.
We recently discovered a problem: when we used DALI, our training curves were mysteriously worse. We were able to fix …
-
Hello, a memory leak was detected when executing this code. The code was run on Python 3.10, triton-client 2.41.1, and torch 2.1.2.
```python
import torch
import tritonclient.utils.cuda_shared_memory…
```
-
A small example:
```
julia> v = rand(3, 2)
3×2 Matrix{Float64}:
0.071368 0.486031
0.00750569 0.53865
0.416978 0.316323
julia> sub_v = @views v[2:3,:] # which is a StridedArray
2×…
```
-
**Description**
`infer_request.exec()` runs slowly
**Triton Information**
nvcr.io/nvidia/tritonserver:24.05-py3
**To Reproduce**
```shell
/usr/src/tensorrt/bin/trtexec --onnx=test_static.onnx -…
```
-
Following up on #57 where we figured out the correct stream exchange and synchronization semantics and a Python interface for doing so, we need to do the same for a C interface.
TLDR from #57:
- F…
-
We allow a separate device field to be specified even when initializing Tripy Tensor with a DLPack object. Since Tripy Tensor must not perform any implicit copies, we assert if the device does not con…
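
The device check described above can be sketched roughly as follows. This is not Tripy's code: `resolve_device` and `FakeProducer` are hypothetical names invented for illustration, and the sketch raises a `ValueError` rather than using an `assert` statement. The only parts taken from DLPack itself are the `__dlpack_device__()` protocol method, which returns a `(device_type, device_id)` pair, and the enum values `kDLCPU = 1` and `kDLCUDA = 2`.

```python
from enum import IntEnum
from typing import Optional, Tuple


class DLDeviceType(IntEnum):
    # Values as defined in the DLPack header (kDLCPU, kDLCUDA).
    CPU = 1
    CUDA = 2


def resolve_device(obj, requested: Optional[Tuple[int, int]] = None) -> Tuple[int, int]:
    """Return the device of a DLPack producer, rejecting a mismatched request.

    Hypothetical helper: a mismatch would require a copy, so it is an error.
    """
    dev_type, dev_id = obj.__dlpack_device__()
    actual = (int(dev_type), int(dev_id))
    if requested is not None:
        wanted = (int(requested[0]), int(requested[1]))
        if wanted != actual:
            raise ValueError(
                f"requested device {wanted} does not match DLPack device {actual}; "
                "refusing to copy implicitly"
            )
    return actual


class FakeProducer:
    """Stand-in for any object that implements `__dlpack_device__` (hypothetical)."""

    def __init__(self, dev_type: int, dev_id: int):
        self._device = (dev_type, dev_id)

    def __dlpack_device__(self):
        return self._device
```

With this sketch, `resolve_device(FakeProducer(DLDeviceType.CPU, 0))` returns the producer's own device, while passing `requested=(DLDeviceType.CUDA, 0)` for the same object raises instead of silently copying.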
-
There are two tasks:
- [x] Update to DLPack v0.8 (the last compatible version before API/ABI break): #7307
- [ ] Update to DLPack v1.0 (WIP: https://github.com/dmlc/dlpack/pull/113)
I think we s…
-
System: WSL2 Ubuntu 22.04, on top of Windows 11
CPU: 1270P
GPU: integrated (`[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Graphics [0x46a6] 1.3 [1.3.26032]`)
TensorFlow: 2.12.0
Jax:…
-
### Discussed in https://github.com/zarr-developers/zarr-python/discussions/2197
Originally posted by **ilan-gold** September 17, 2024
Hello all,
I have been looking into the core code here…