feevos closed this issue 4 years ago
Some additional information: the error seems to depend on how many times the initial input is passed through the conv layers. It is not directly related to iterating over the HybridSequential container.
The following works regardless of the length of kernel_sizes:
class Demo(HybridBlock):
    def __init__(self, kernel_sizes = [3,3,3,3],**kwards):
        super().__init__(**kwards)
        with self.name_scope():
            self.net = gluon.nn.HybridSequential()
            for k in kernel_sizes:
                tnet = gluon.nn.HybridSequential()
                for _ in range(3):
                    tnet.add(gluon.nn.Conv2D(32,kernel_size=k,padding=1))
                self.net.add(tnet)

    def hybrid_forward(self, F, input):
        x = input
        for conv in self.net:
            #x = x + conv(input)
            x = x + conv(x) ## <===== CHANGE HERE
        return x
Runs fine:
nfilters=32
F = 256
net = Demo(kernel_sizes=[3]*100)
net.initialize()
net.hybridize()
xx = nd.random.uniform(shape=[7,nfilters,F,F])
out = net(xx)
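To make the difference between the two forward passes concrete, here is a minimal pure-Python sketch (no MXNet required) that counts how many operations read the original input in each variant. The names `Node`, `conv` and `input_fanout` are hypothetical helpers made up for this illustration. This is not how MXNet traces a graph, and it does not explain the cause of the error. It only shows that with `x = x + conv(input)` the number of consumers of the input grows with the number of layers, while with `x = x + conv(x)` it stays constant.

```python
# Pure-Python illustration only: a toy tracer, not MXNet's real graph machinery.
from collections import Counter


class Node:
    """A symbolic value that records which named tensors each op reads."""

    def __init__(self, name, log):
        self.name = name
        self.log = log  # shared Counter: tensor name -> number of consuming ops

    def _op(self, op_name, *inputs):
        # Register one read for every input of this op, then create the output.
        for node in inputs:
            self.log[node.name] += 1
        return Node(f"{op_name}_{sum(self.log.values())}", self.log)

    def __add__(self, other):
        return self._op("add", self, other)


def conv(x):
    """Hypothetical stand-in for a conv layer: a single op that reads `x`."""
    return x._op("conv", x)


def input_fanout(n_layers, feed_original_input):
    """Return how many ops consume the original input tensor."""
    log = Counter()
    inp = Node("input", log)
    x = inp
    for _ in range(n_layers):
        # True  -> x = x + conv(input)   (every layer reads the input)
        # False -> x = x + conv(x)       (only the first layer reads the input)
        x = x + conv(inp if feed_original_input else x)
    return log["input"]


if __name__ == "__main__":
    for n in (4, 7, 100):
        print(n, input_fanout(n, True), input_fanout(n, False))
```

In this toy model the first variant reads the input once per layer plus once for the first addition, so the count scales with the length of `kernel_sizes`, which is consistent with the observation that the failure depends on the container length.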
Workaround that solves the problem (at some computational cost, I guess...):
class Demo(HybridBlock):
    def __init__(self, kernel_sizes = [3,3,3,3],**kwards):
        super().__init__(**kwards)
        with self.name_scope():
            self.net = gluon.nn.HybridSequential()
            for k in kernel_sizes:
                tnet = gluon.nn.HybridSequential()
                for _ in range(3):
                    tnet.add(gluon.nn.Conv2D(32,kernel_size=k,padding=1))
                self.net.add(tnet)

    def hybrid_forward(self, F, input):
        x = input
        for conv in self.net:
            x = F.identity(x) # <====== CHANGE HERE
            x = x + conv(input)
        return x
@zachgk assign @szha
Dear all,
using the new version of MXNet (2.0) solves this problem:
In [1]: import mxnet as mx
   ...: from mxnet import nd, gluon
   ...: from mxnet.gluon import HybridBlock
   ...: from mxnet import np, npx
   ...: npx.set_np()
   ...: class Demo(HybridBlock):
   ...:     def __init__(self, kernel_sizes = [3]*17,**kwards):
   ...:         super().__init__(**kwards)
   ...:
   ...:
   ...:         self.net = gluon.nn.HybridSequential()
   ...:         for k in kernel_sizes:
   ...:             self.net.add(gluon.nn.Conv2D(32,kernel_size=k,padding=1))
   ...:
   ...:     def forward(self,input):
   ...:         x = input
   ...:         for conv in self.net:
   ...:             x = x + conv(input)
   ...:
   ...:         return x
   ...:
   ...: # This reproduces the error.
   ...: nfilters=32
   ...: F = 256
   ...:
   ...: net = Demo(kernel_sizes=[3]*7) # <=== CHANGE HERE, for length of list < 7 this script runs fine.
   ...: net.initialize()
   ...: net.hybridize()
   ...: xx = np.random.rand(7,nfilters,F,F)
   ...: out = net(xx)
/usr/local/lib/python3.7/site-packages/joblib/_multiprocessing_helpers.py:45: UserWarning: [Errno 28] No space left on device. joblib will operate in serial mode
warnings.warn('%s. joblib will operate in serial mode' % (e,))
[18:14:54] ../src/storage/storage.cc:199: Using Pooled (Naive) StorageManager for CPU
In [2]: out = net(xx)
In [3]: out.shape
Out[3]: (7, 32, 256, 256)
Description
Dear all, there is a bug when iterating over a HybridSequential treated as a container. This bug depends on the length of the container. If the length is small, the error does not appear. See minimal example below.
Error Message
MXNetError Traceback (most recent call last)