-
Win7 SP1, GTX 1080, VS2013: when testing CaffeNet, CUDA 8.0 + cuDNN 7.1 consumes more GPU memory and host memory than CUDA 8.0 + cuDNN 5.1. Why? Thanks.
-
Hi everyone!
@yusuketomoto, first of all, thank you for the amazing tool for style transfer! I've experimented with it a lot and have obtained nice results. But one thing always appears that I'd like to g…
-
In the README you mention the intention to compare Temporal Fusion Transformers to SeriesNet. What did you find in terms of training time and accuracy?
While I am familiar with Wa…
-
**Describe the bug**
Method mlx.core.conv_general is significantly slower than its PyTorch analog; the slowdown varies from 10x to 150x.
**To Reproduce**
Just run the attached code.
Include code snipp…
-
I noticed that doing a simple 2D convolution using JAX's scipy backend is significantly slower than using SciPy itself:
```python
import numpy as np
import jax.scipy.signal
import scipy.signal
…
```
-
So, I've been toying with local conditioning lately and while I can see quite easily how it would be implemented for slow generation, I can't really wrap my head around how it would work with fast gen…
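One way to picture how the two fit together: fast generation only changes where each dilated layer's past input comes from (a per-layer queue instead of re-convolving the whole history), while the conditioning path is unchanged; at step t the conditioning frame h_t, already upsampled to audio rate, is projected by the layer's 1x1 conditioning weights and added to the pre-activation, exactly as in slow generation. A heavily simplified NumPy sketch under those assumptions (all names are illustrative, not the repo's API; gating and residual connections omitted):

```python
import numpy as np
from collections import deque

class FastLayerStep:
    """One dilated causal layer in incremental (fast) generation.

    The queue holds the last `dilation` inputs, so queue[0] is the input
    from `dilation` steps ago -- what the dilated conv would have reached.
    Local conditioning enters as the current frame h_t via a 1x1 projection.
    """
    def __init__(self, channels, cond_dim, dilation, rng):
        self.queue = deque([np.zeros(channels)] * dilation, maxlen=dilation)
        self.w_prev = rng.standard_normal((channels, channels)) * 0.1
        self.w_curr = rng.standard_normal((channels, channels)) * 0.1
        self.w_cond = rng.standard_normal((channels, cond_dim)) * 0.1

    def step(self, x_t, h_t):
        x_prev = self.queue[0]          # input from `dilation` steps ago
        self.queue.append(x_t)          # maxlen pops the oldest entry
        pre = self.w_prev @ x_prev + self.w_curr @ x_t + self.w_cond @ h_t
        return np.tanh(pre)             # gated activation omitted for brevity

rng = np.random.default_rng(0)
layer = FastLayerStep(channels=8, cond_dim=3, dilation=4, rng=rng)
x = np.zeros(8)
for t in range(10):
    h_t = rng.standard_normal(3)        # upsampled conditioning at step t
    x = layer.step(x, h_t)
print(x.shape)  # (8,)
```

The point of the sketch: nothing about the conditioning changes between slow and fast generation, only the bookkeeping for past activations does.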
-
Hi,
I have trained three SSD models, using vgg16_reduced, mobilenet_512, and mobilenet_608. After that, I am running inference on a video, using batches of a single frame, and comparing the speed of t…
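The kind of comparison described can be sketched with a minimal, framework-agnostic timing harness; the model callable below is a stand-in for the three SSD predictors, and for GPU inference one would also need to synchronize the device before reading the clock:

```python
import time
import numpy as np

def time_model(fn, frame, warmup=5, iters=50):
    """Crude per-frame latency: warm up first, then average over many calls.

    Warm-up matters because the first calls often pay one-off costs
    (kernel compilation, memory allocation) that skew a single-shot timing.
    """
    for _ in range(warmup):
        fn(frame)
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(frame)
    return (time.perf_counter() - t0) / iters

# Stand-in "model": any callable taking one frame; real code would wrap
# the vgg16_reduced / mobilenet_512 / mobilenet_608 predictors here.
frame = np.zeros((512, 512, 3), dtype=np.uint8)

def fake_model(f):
    return f.mean()

latency = time_model(fake_model, frame)
print(f"{latency * 1e3:.3f} ms/frame")
```

With single-frame batches, per-call overhead (data transfer, framework dispatch) can dominate, so relative speeds measured this way may not match throughput at larger batch sizes.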
-
I am trying to use group convolution to speed up the operation. This is my cfg file; it can train, but when I test there is no output. I hope to get everyone's help.
I added a group to each set of convol…
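One thing worth checking with grouped layers: with groups=g, each filter convolves only C_in/g input channels, so every grouped layer's weight count shrinks, and a cfg/weights mismatch can let a model load and run while detecting nothing. A minimal NumPy/SciPy sketch of what a grouped convolution computes (shapes and names are illustrative, not darknet's):

```python
import numpy as np
from scipy.signal import correlate2d

def grouped_conv2d(x, w, groups):
    """Grouped convolution, CHW input, OIHW weights, 'valid' padding.

    The C_in input channels are split into `groups` chunks, and each chunk
    of C_out/groups filters sees only its own chunk -- so each filter has
    C_in/groups input channels instead of C_in.
    """
    c_out, c_in_g, kh, kw = w.shape
    c_in = x.shape[0]
    assert c_in % groups == 0 and c_out % groups == 0
    assert c_in_g == c_in // groups
    out = []
    for g in range(groups):
        xg = x[g * c_in_g:(g + 1) * c_in_g]                  # group's inputs
        wg = w[g * (c_out // groups):(g + 1) * (c_out // groups)]
        for f in wg:                                          # one map per filter
            out.append(sum(correlate2d(xc, fc, mode="valid")
                           for xc, fc in zip(xg, f)))
    return np.stack(out)

x = np.random.randn(4, 8, 8)        # 4 input channels
w = np.random.randn(4, 2, 3, 3)     # 4 filters, each over 4/2 = 2 channels
y = grouped_conv2d(x, w, groups=2)
print(y.shape)  # (4, 6, 6)
```

If the filter count in the cfg is not divisible by groups, or the weights file was written for the ungrouped layout, the layer shapes silently stop matching this arithmetic.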
-
# SqueezeNet New Version
Institute: UC Berkeley
URL: https://arxiv.org/pdf/1803.10615.pdf
Code: https://github.com/amirgholami/SqueezeNext
https://github.com/luuuyi/SqueezeNext.PyTo…
-
How about using [this](https://en.m.wikipedia.org/wiki/Multiplication_algorithm#Complex_multiplication_algorithm) for naive convolution, reducing 4 convs down to 3?
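The identity in question: (a+bi)(c+di) naively takes 4 real multiplications, but 3 suffice, and because convolution is bilinear the same rewrite carries over when each "multiplication" is a convolution. A small NumPy check of the equivalence (1-D, illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.standard_normal(64), rng.standard_normal(64)   # input re/im parts
c, d = rng.standard_normal(9), rng.standard_normal(9)     # kernel re/im parts

# Naive: 4 real convolutions.
#   real = a*c - b*d,  imag = a*d + b*c   (* = convolution)
real4 = np.convolve(a, c) - np.convolve(b, d)
imag4 = np.convolve(a, d) + np.convolve(b, c)

# 3-convolution rewrite from the linked article:
#   k1 = (a+b)*c,  k2 = a*(d-c),  k3 = b*(c+d)
#   real = k1 - k3,  imag = k1 + k2
k1 = np.convolve(a + b, c)
k2 = np.convolve(a, d - c)
k3 = np.convolve(b, c + d)
real3, imag3 = k1 - k3, k1 + k2

assert np.allclose(real4, real3) and np.allclose(imag4, imag3)
```

The trade-off is the usual Karatsuba one: the extra additions on signals/kernels are cheap (O(n)) relative to a convolution, but the rewrite can be slightly less numerically stable when the operands differ greatly in magnitude.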