HuwCampbell / grenade

Deep Learning in Haskell
BSD 2-Clause "Simplified" License

index out of bounds -- in Data.Vector.Generic #48

Closed: sebeaumont closed this 4 years ago

sebeaumont commented 6 years ago

I just tried the shakespeare example on a 100k Shakespeare sample and got the following:

seb@psi(0) [grenade](2646) 14:06:55> shakespeare ~/Data/misc/shakespeare.txt
TRAINING STEP WITH SIZE: 50
shakespeare: ./Data/Vector/Generic.hs:245 ((!)): index out of bounds (38,37)
CallStack (from HasCallStack):
  error, called at ./Data/Vector/Internal/Check.hs:87:5 in vector-0.12.0.1-3FWV4ejAWV0FsmvNvoLaed:Data.Vector.Internal.Check

What did I do wrong?

sebeaumont commented 6 years ago

Wait, I thought I was a couple of commits behind. No, I'm on the head of master.

sebeaumont commented 6 years ago

Can someone try to reproduce this? I'm on ghc-8.2.2 with resolver: lts-10.2.

HuwCampbell commented 6 years ago

G'day,

I bashed out the shakespeare example and included some not-so-safe code for decoding the results back into characters, which is probably what's biting you here.

The problem happens when the number of unique characters in the input isn't the same as the length of the vector.
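
To illustrate the mismatch, here is a minimal sketch, assuming a fixed one-hot width and a character table built from the input text. It is not the code in grenade's OneHot module: `Vocab`, `buildVocab` and `decode` are hypothetical names, and it uses `Data.Map` from containers rather than vectors. The `(38,37)` in the trace above is consistent with index 38 being read from a 37-element vector, which is the kind of lookup the checked `decode` below turns into a `Nothing`.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set        as S

-- | Hypothetical vocabulary: a character-to-index table and its inverse.
data Vocab = Vocab
  { toIndex :: M.Map Char Int
  , toChar  :: M.Map Int Char
  }

-- | Build a vocabulary for a one-hot vector with @width@ slots.
-- Fails up front if the text has more unique characters than slots.
buildVocab :: Int -> String -> Either String Vocab
buildVocab width text
  | size > width =
      Left ("text has " ++ show size ++ " unique characters but the vector has only "
            ++ show width ++ " slots")
  | otherwise =
      Right (Vocab (M.fromList (zip uniq [0 ..])) (M.fromList (zip [0 ..] uniq)))
  where
    uniq = S.toAscList (S.fromList text)  -- sorted, duplicates removed
    size = length uniq

-- | Decode a sampled slot back into a character.
-- Returns Nothing for a slot with no character assigned,
-- instead of crashing the way an unchecked index would.
decode :: Vocab -> Int -> Maybe Char
decode vocab i = M.lookup i (toChar vocab)
```

With a width of 5 and the text "abca", only slots 0 to 2 have characters, so `decode` returns `Nothing` for slots 3 and 4. That is the "fewer unique characters than slots" case; the `Left` branch covers the opposite one.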

I trained it with https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt so you might want to try that version of Shakespeare's texts.

The offending terrible code is in https://github.com/HuwCampbell/grenade/blob/master/src/Grenade/Utils/OneHot.hs

If you're interested, it should be quite possible to abstract the network shape to a parameter i, then use a withSomeNat pattern to make this safe for any input.
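
As a rough sketch of that idea, assuming `someNatVal` from GHC.TypeLits is an acceptable way to get the pattern: the number of unique characters is counted at runtime, lifted to a type-level natural, and handed to a continuation that is generic in the width. `withVocabSize` and `describeWidth` are hypothetical names, and `describeWidth` only stands in for the code that would build and train a network shaped by `n`.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.List    (nub)
import Data.Proxy   (Proxy)
import GHC.TypeLits (KnownNat, SomeNat (..), natVal, someNatVal)

-- | Hypothetical stand-in for code that is generic in the one-hot width n.
describeWidth :: KnownNat n => Proxy n -> String
describeWidth p = "one-hot width fixed at the type level: " ++ show (natVal p)

-- | Count the unique characters in the text, lift that count to a
-- type-level natural, and run the continuation with it.
-- someNatVal only returns Nothing for negative inputs, which a length
-- can never be, so in practice this always yields Just.
withVocabSize :: String -> (forall n. KnownNat n => Proxy n -> r) -> Maybe r
withVocabSize text k =
  case someNatVal (fromIntegral (length (nub text))) of
    Just (SomeNat p) -> Just (k p)
    Nothing          -> Nothing
```

For example, `withVocabSize "hello" describeWidth` runs `describeWidth` with `n` equal to 4, the number of unique characters in "hello". The vector length then always follows the input, so the two cannot drift apart.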

HuwCampbell commented 6 years ago

Bit of a kludge here (a whole 30 seconds of work), but it should prevent the runtime error and allow people to adjust the size of the network to fit.

https://github.com/HuwCampbell/grenade/pull/49

sebeaumont commented 6 years ago

Thanks Huw, that gives me a leg up for generalising it to any input. I did hit a problem in OneHot with some unrelated input, so I'll give it a bash.