Try to train a model with the following parameters:
./perceiver train -i train.shuf.csv --test-file validation.csv -j 32 -s 21 -k 2 -g 1 -e 1 -o imdb-minimal-2l.model -n 1
At some random point in training the following error will occur:
goroutine 4813637 [running]:
github.com/nlpodyssey/spago/mat.(*Dense[...]).Rows()
:1 +0x9
github.com/nlpodyssey/spago/mat.SameDims[...]({0x1584880?, 0x0}, {0x1584880, 0xc00b16d950})
/Users/julianoviana/Development/spago/mat/matrix.go:231 +0x38
github.com/nlpodyssey/spago/mat.(*Dense[...]).AddInPlace(0x0, {0x1584880, 0xc00b16d950})
/Users/julianoviana/Development/spago/mat/dense.go:514 +0x5f
github.com/nlpodyssey/spago/ag.(*Operator[...]).AccGrad(0xc0588a3b00, {0x1584880, 0xc00b16d950})
/Users/julianoviana/Development/spago/ag/operator.go:165 +0x122
github.com/nlpodyssey/spago/ag/fn.(*Mul[...]).Backward.func2()
/Users/julianoviana/Development/spago/ag/fn/mul.go:64 +0x18d
created by github.com/nlpodyssey/spago/ag/fn.(*Mul[...]).Backward
/Users/julianoviana/Development/spago/ag/fn/mul.go:58 +0x2a5
Debugging the problem I have found that in operator.go it is possible to observe o.grad!=nil and at the same time reflect.ValueOf(o.grad).IsNil()==true. That means somewhere a nil pointer is being cast to mat.Matrix[t] and stored in o.grad. Since mat.Matrix is an interface, the ==nil test will return false but any method call will panic as mat.Dense does not consider the possibility of a nil pointer value.
The following seems to fix it, albeit with the use of reflection:
```
diff --git a/ag/operator.go b/ag/operator.go
index d0167911..47c39738 100644
--- a/ag/operator.go
+++ b/ag/operator.go
@@ -157,11 +157,12 @@ func (o *Operator[T]) AccGrad(grad mat.Matrix[T]) {
o.cond.L.Lock()
defer o.cond.L.Unlock()
- if o.grad == nil {
+ if o.grad == nil || reflect.ValueOf(o.grad).IsNil() {
o.cond.L.Unlock()
o.grad = o.Value().ZerosLike()
o.cond.L.Lock()
}
+
o.grad.AddInPlace(grad)
if o.inBackward && atomic.AddInt64(&o.pendingGrads, -1) == 0 {
```
Not sure this is an acceptable fix - but couldn't find any place where this weird grad is being set either...
At some random point in training the following error will occur: goroutine 4813637 [running]: github.com/nlpodyssey/spago/mat.(*Dense[...]).Rows()