-
### Description
If I receive an out-of-memory exception from CuPy, I am unable to clear the memory pool using the `free_all_blocks()` method. The code below demonstrates this issue by performin…
-
Scikit-learn and SciPy have been incrementally adding support for the [Array API standard](https://data-apis.org/array-api/latest/). The Array API standard enables libraries to write code in a standar…
-
We need to record the performance of each version.
-
Hi guys. I have built a quantitative investment algorithm in Python (mostly CuPy). It is quite long and complex, so even though it is fully vectorized, I need to use for loops to avo…
-
For performance, `_AbstractReductionKernel` should use cuTENSOR by default if `cupy.cuda.cutensor_enabled` is `True`.
-
Currently, we list functions that support the array API on https://scipy.github.io/devdocs/dev/api-dev/array_api.html#currently-supported-functionality . Initially, while support was very limited, thi…
-
Related to https://github.com/ut-parla/parla-experimental/issues/62. Probably the same cause.
```
from parla import Parla, spawn, TaskSpace
from parla.common.globals import get_current_context
from…
```
-
**Describe the bug**
`cucim.skimage.transform.PiecewiseAffineTransform` seems to be several times slower than the scikit-image equivalent.
**Steps/Code to reproduce bug**
When running the code b…
-
Andrej Karpathy has just ~upstaged me~ released llm.c which contains some highly optimised CUDA kernels. If we include these into tricycle, we can probably get a significant performance boost for oper…
-
Based on the [discussion starting here](https://github.com/AnacondaRecipes/cudatoolkit-feedstock/pull/5#issuecomment-603415108), it appears `cudatoolkit` is not getting picked up. This could easily be f…