-
**Describe the bug**
The arguments for the function `scipy.sparse.linalg.cg` changed. `tol` was deprecated and got replaced by `rtol` in version 1.14.0 [[1](https://docs.scipy.org/doc/scipy-1.13.1/re…
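One way to stay compatible with both the old and the new keyword is to check which name the installed solver accepts before calling it. The following is a minimal sketch, not the project's actual fix: the helpers `tolerance_kwarg` and `call_with_tolerance` are hypothetical names introduced here, and the sketch assumes the solver's signature can be read with `inspect.signature`.

```python
import inspect


def tolerance_kwarg(solver):
    """Return the tolerance keyword accepted by `solver` (hypothetical helper)."""
    params = inspect.signature(solver).parameters
    # Prefer the newer `rtol` name when it exists; fall back to the older `tol`.
    return "rtol" if "rtol" in params else "tol"


def call_with_tolerance(solver, A, b, tolerance, **kwargs):
    """Call `solver(A, b, ...)`, passing `tolerance` under the accepted keyword."""
    kwargs[tolerance_kwarg(solver)] = tolerance
    return solver(A, b, **kwargs)
```

With SciPy installed, the call would look like `call_with_tolerance(scipy.sparse.linalg.cg, A, b, 1e-8)`. Inspecting the signature avoids parsing version strings, at the cost of relying on the solver exposing an introspectable signature.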
-
### Your current environment
First of all: fantastic project :-) Thank you for everything.
I would like to fix this bug. But I just do not have the capacity now. So I just thought I would try to m…
-
I have some code that launches multiple kernels and distributes them across multiple queues, each for a different CUDA device. When only one GPU is used, we get the following dependency graph:
![dep_gra…
-
Hello,
Similarly to #3, I've tried reproducing the `demo.py` benchmark on an H100 and an A6000 and I'm also seeing no speedup on these platforms at lower precisions.
It was mentioned this is du…
-
I was just thinking about this idea, so writing it down for future research.
We should be able to fairly easily generate model-specific Metal code that has hardcoded kernels for every single node in …
-
Hi!
I am getting a bunch of linker errors that appear to be caused by circular imports. I am using TFLM_ESP32 v2.0 and EloquentTinyML 3.0.1.
Thanks in advance.
Linking everything together...
/home/alvaro…
-
### Motivation.
At a high level, we at Neural Magic are writing a custom compiler for Torch Dynamo to define a system within vLLM where we can write graph transformations. The main goal is a separa…
-
## Current issue
- [ ] Infeasible to merge multiple views
- [ ] Cannot support multiple views (e.g. [A, B] -> View -> Shape cannot be inferred)
- [ ] Parallelize Attention QKV Projection
…
-
```bash
# env CMAKE_BUILD_PARALLEL_LEVEL="" pip install . -v
```
Includes in the output:
```
/Users/user/Documents/AI/mlx/mlx/mlx/backend/accelerate/matmul.cpp:109:9: warning: 'BNNSLayerParame…
-
In PyTorch, `torch.compile` brings many benefits, and TransformerEngine also improves performance through strategies such as Transformer fusion optimization, so do…