sdobber closed this issue 2 years ago
The raw data loads differently on x64 and arm64, and the arm64 result is wrong.
I had a similar problem. It has nothing to do with GPUs, but it does fail more loudly on them.
prepare_data creates an uninitialized array with similar and then doesn't fill all of it, so uninitialized memory gets read as floats, some of which are NaNs.
To make it really obvious, I replaced the similar call with zeros(Float32, ... ):
function prepare_data(data, poollength, datalength, horizon; normalise=true)
    extendedlength = datalength + poollength
    extendedlength > size(data, 1) && throw(ArgumentError("datalength $(datalength) larger than available data $(size(data, 1) - poollength)"))
    (normalise == true) && (data = Flux.normalise(data, dims=1))
    features = zeros(Float32, size(data, 2), poollength, 1, datalength) # CHANGED THIS
    for i = 0:poollength - 1
        for j = poollength:datalength
            # \/ this j starts at poollength => 1:(poollength-1) will always be uninit data
            features[:,i + 1,1,j] = data[j - i,:]
        end
    end
    labels = circshift(data[1:datalength,1], -horizon)
    return features, labels
end
and then
prepare_data(ones(Float32, 510, 3), 10, 500, 7, normalise=false)
> 3×10×1×500 Array{Float32, 4}:
[:, :, 1, 1] =
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
[:, :, 1, 2] =
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
[:, :, 1, 3] =
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
...
[:, :, 1, 498] =
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
[:, :, 1, 499] =
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
[:, :, 1, 500] =
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
There should be no zeros (aka uninit memory), but there are.
Edit: added comments to make the bug really obvious.
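To see exactly which slices the loop never writes, one option is to pre-fill the array with NaN32 instead of zeros and list the slices that still contain a NaN afterwards. The sketch below is only a diagnostic: unfilled_slices is a hypothetical helper, not part of the package, and it copies just the fill loop from prepare_data above (no normalisation, no labels).

```julia
# Hypothetical diagnostic helper (not part of the package).
# It repeats the fill loop from prepare_data, but starts from an array
# full of NaN32 so that any slot the loop skips stays visibly NaN.
function unfilled_slices(data, poollength, datalength)
    features = fill(NaN32, size(data, 2), poollength, 1, datalength)
    for i in 0:poollength - 1
        for j in poollength:datalength   # same bounds as the loop above
            features[:, i + 1, 1, j] = data[j - i, :]
        end
    end
    # Collect every slice index j that still contains at least one NaN.
    return [j for j in 1:datalength if any(isnan, view(features, :, :, 1, j))]
end

# Same inputs as the example above; the input contains no NaNs itself,
# so every NaN left over marks a slot the loop never wrote.
missing_slices = unfilled_slices(ones(Float32, 510, 3), 10, 500)
println(missing_slices)   # slices 1 through 9, i.e. 1:(poollength - 1)
```

This only works as a check when the input data has no NaNs of its own, which is why the example uses an array of ones.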
@KingBoomie Thanks a lot for spotting this!
Thanks for the quick fix! This is now the most useful time series analysis package for me. <3
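The fix itself is not shown in this thread. As a rough sketch of one way to fill every slice, and not necessarily what the package does, the loop could run j over 1:datalength and offset the row index so it never drops below 1. The name fill_windows and the window layout (slice j covers rows j to j + poollength - 1, newest row first) are assumptions made for illustration; how the labels should line up with these windows is left open here.

```julia
# Hypothetical sketch, not the package's actual fix.
# Assumption: slice j holds rows j .. j + poollength - 1 of data,
# with the newest row in column 1 (same ordering as the original loop).
function fill_windows(data::AbstractMatrix, poollength::Integer, datalength::Integer)
    datalength + poollength - 1 > size(data, 1) &&
        throw(ArgumentError("not enough rows for $(datalength) windows of length $(poollength)"))
    # similar is safe here because the loop below writes every element.
    features = similar(data, size(data, 2), poollength, 1, datalength)
    for j in 1:datalength, i in 0:poollength - 1
        features[:, i + 1, 1, j] = data[j + poollength - 1 - i, :]
    end
    return features
end

features = fill_windows(ones(Float32, 510, 3), 10, 500)
println(all(==(1.0f0), features))   # every slot was written from the data
```

Because j now starts at 1 and the row offset stays within bounds, no slice is left holding uninitialized memory, so the choice between similar and zeros no longer affects the result.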
input, target = get_data(:exchange_rate, poollength, datalength, horizon) |> gpu
works fine on the CPU, but creates some NaNs at the beginning of the dataset (at least on a Jetson Nano).