Closed chrishkchris closed 4 years ago
One highlight: why does the reshaped tensor have a different shape? See test_reshape_gpu.
The other test failures appear to be due to very small numerical errors (on the order of 1e-5), which can be fixed by reducing the number of significant digits used in the comparison.
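A minimal sketch of the tolerance fix, using made-up values rather than the actual outputs of test_operation.py. It assumes the tests compare arrays with `np.testing.assert_array_almost_equal`, which checks agreement to a number of decimal places; the `matches` helper is hypothetical and only wraps that call.

```python
import numpy as np

# Hypothetical values: an expected result and a computed result that
# differs by about 1e-5, as float32 rounding on another device might produce.
expected = np.array([0.5, 1.25, 2.0], dtype=np.float32)
actual = expected + np.float32(1e-5)


def matches(a, b, decimal):
    """Return True if a and b agree to `decimal` decimal places."""
    try:
        np.testing.assert_array_almost_equal(a, b, decimal=decimal)
        return True
    except AssertionError:
        return False


# A strict comparison rejects the 1e-5 difference ...
strict = matches(actual, expected, decimal=7)
# ... while a looser one accepts it.
relaxed = matches(actual, expected, decimal=4)
```

Loosening the tolerance hides rounding noise only; a real bug such as a wrong shape would still fail the comparison.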
Should we make the Travis build fail when the Python unit tests raise errors?
In my opinion, this is a very good feature. However, I am not sure whether the machine Travis uses to run the tests has a GPU. On the other hand, test_operation.py is still important because it lets developers check whether the system has any problems after their commits.
If I am correct, the reshape failure is due to an error in backward:
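One way this could work, sketched with the standard `unittest` module only: skip GPU tests when no GPU is present, and turn the test result into a process exit code that a CI service treats as pass or fail. `GPU_AVAILABLE`, `ExampleOperationTest`, `run_suite` and `exit_code` are hypothetical names for illustration, not part of the SINGA test suite.

```python
import unittest

# Assumption: a stand-in flag. A real suite would ask the framework
# whether a GPU device can be created.
GPU_AVAILABLE = False


class ExampleOperationTest(unittest.TestCase):
    def test_cpu(self):
        self.assertEqual(sum([1, 2, 3]), 6)

    @unittest.skipIf(not GPU_AVAILABLE, "no GPU on this machine")
    def test_gpu(self):
        self.fail("runs only where a GPU is present")


def run_suite(test_case):
    """Run every test in test_case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    return unittest.TextTestRunner(verbosity=0).run(suite)


def exit_code(result):
    """Map a test result to a shell exit code: 0 on success, 1 otherwise."""
    return 0 if result.wasSuccessful() else 1


result = run_suite(ExampleOperationTest)
code = exit_code(result)
# A CI entry point would finish with sys.exit(code), so that any failing
# test makes the build step fail.
```

Skipped tests do not count as failures, so a machine without a GPU would still produce a passing build.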
class Reshape(Operation):
    def __init__(self, shape):
        super(Reshape, self).__init__()
        if isinstance(shape, tensor.Tensor):
            self.shape = np.asarray(tensor.to_numpy(shape).astype(np.int32)).tolist()
        else:
            self.shape = list(shape)

    def forward(self, x):
        _shape = x.shape()
        shape = self.shape
        # handle the shape with 0
        shape = [_shape[i] if i < len(_shape) and shape[i] == 0 else shape[i] for i in range(len(shape))]
        # handle the shape with -1
        hidden_shape = int(np.prod(_shape) // np.abs(np.prod(shape)))
        self.cache = [s if s != -1 else hidden_shape for s in shape]
        return singa.Reshape(x, self.cache)

    def backward(self, dy):
        return singa.Reshape(dy, self.cache)
I think the class should be changed to:
class Reshape(Operation):
    def __init__(self, shape):
        super(Reshape, self).__init__()
        if isinstance(shape, tensor.Tensor):
            self.shape = np.asarray(tensor.to_numpy(shape).astype(np.int32)).tolist()
        else:
            self.shape = list(shape)

    def forward(self, x):
        # remember the input shape so that backward can restore it
        self._shape = x.shape()
        shape = self.shape
        # handle the shape with 0
        shape = [self._shape[i] if i < len(self._shape) and shape[i] == 0 else shape[i] for i in range(len(shape))]
        # handle the shape with -1
        hidden_shape = int(np.prod(self._shape) // np.abs(np.prod(shape)))
        self.cache = [s if s != -1 else hidden_shape for s in shape]
        return singa.Reshape(x, self.cache)

    def backward(self, dy):
        # the gradient must have the shape of the input, not of the output
        return singa.Reshape(dy, self._shape)
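The idea can be checked without SINGA using a NumPy stand-in. This is only a sketch: `NumpyReshape` and `resolve_shape` are hypothetical names that mirror the logic above, and NumPy arrays replace SINGA tensors.

```python
import numpy as np


def resolve_shape(in_shape, target):
    """Resolve 0 (copy the input dimension) and -1 (infer the dimension)."""
    shape = [in_shape[i] if i < len(in_shape) and s == 0 else s
             for i, s in enumerate(target)]
    inferred = int(np.prod(in_shape) // abs(np.prod(shape)))
    return [inferred if s == -1 else s for s in shape]


class NumpyReshape:
    """NumPy stand-in for the operation; this is not the SINGA API."""

    def __init__(self, shape):
        self.shape = list(shape)

    def forward(self, x):
        self.in_shape = x.shape  # kept for backward
        self.out_shape = resolve_shape(x.shape, self.shape)
        return x.reshape(self.out_shape)

    def backward(self, dy):
        # Reshape the gradient back to the input's shape.
        return dy.reshape(self.in_shape)


x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
op = NumpyReshape([0, -1])
y = op.forward(x)                  # forward output has the target shape
dx = op.backward(np.ones_like(y))  # gradient has the input's shape again
```

Reshaping `dy` to the cached output shape instead would leave `dx` with the shape of `y`, which matches the shape mismatch reported for test_reshape_gpu.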
To resolve the problem completely, I opened a hotfix in PR #579.
The problem is now completely resolved.
Today, when I ran singa/test/python/test_operation.py, I got these errors: