Closed marthinkondjeni closed 1 year ago
I don't have access to the dataset you mention, but you can follow along with the code below:
using FluxArchitectures
using DataFrames
using CSVFiles
src = "https://storage.googleapis.com/mledu-datasets/california_housing_train.csv"
data = DataFrame(load(src))
target = :median_house_value
select!(data, Cols(target, :)) # make median_house_value the first column
poollength = 30
datalength = 5000
horizon = 10
features, labels = prepare_data(data, poollength, datalength, horizon; normalise=false)
This creates features and labels in the correct format for using the models.
Note that prepare_data expects the data that is supposed to be predicted in the first column, hence the column reordering with select!. Currently, there is an issue with normalising the data when using this function - see #50. I'm about to fix that.
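As a small illustration of the column reordering described above, here is a minimal sketch on a made-up table. The column names `a`, `b` and `y` and their values are hypothetical and only serve to show what `select!` with `Cols(target, :)` does to the column order.

```julia
using DataFrames

# Made-up table: `y` plays the role of the column to be predicted.
df = DataFrame(a = [1.0, 2.0], b = [3.0, 4.0], y = [5.0, 6.0])

target = :y
# Put the target first; `:` then adds all remaining columns in their original order.
select!(df, Cols(target, :))

column_order = names(df)
println(column_order)
```

After this call the target is the first column and the other columns keep their relative order, which is the layout `prepare_data` is described as expecting.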
Here is a sample dataset on sales of shampoo over a three-year period. The aim is to predict the sales using the TPA-LSTM architecture. Here is the code I used, but I am getting errors when training the model.
using FluxArchitectures
using DataFrames
using CSVFiles
src = "sales-of-shampoo-over-a-three-ye.csv"
data = DataFrame(load(src))
target = "Sales of shampoo over a three year period"
select!(data, Cols(target, :)) # make the sales column the first column
poollength = 30
datalength = 7
horizon = 10
features, labels = prepare_data(data, poollength, datalength, horizon; normalise=false)
inputsize = size(features, 1)
hiddensize = 10
layers = 2
filternum = 32
filtersize = 1
model = TPALSTM(inputsize, hiddensize, poollength, layers, filternum, filtersize)
function loss(x, y)
    Flux.ChainRulesCore.ignore_derivatives() do
        Flux.reset!(model)
    end
    return Flux.mse(model(x), permutedims(y))
end
Flux.train!(loss, Flux.params(model), Iterators.repeated((features, labels), 10),
    Adam(0.02))
But I am getting this error:
MethodError: no method matching *(::Float32, ::String)
Closest candidates are:
*(::Any, ::Any, !Matched::Any, !Matched::Any...) at operators.jl:591
*(::T, !Matched::T) where T<:Union{Float16, Float32, Float64} at float.jl:385
*(!Matched::Union{AbstractChar, AbstractString}, ::Union{AbstractChar, AbstractString}...) at strings/basic.jl:260
...
The month column contains strings, which the models in this repository cannot process. You need to do some feature engineering to convert it to numerical values. You will have to experiment to find what works best, e.g. converting the month to a number from 1 to 12, or giving each day its number of days from the start of the year.
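One way to do such a conversion is sketched below, using only Base Julia. It assumes the month strings look like "1-01" (year index, dash, month number); that layout, the sample values and the helper names `parse_year_month` and `months_since_start` are assumptions for illustration, so adjust them to whatever the CSV actually contains.

```julia
# Hypothetical helper: split a "Y-MM" string such as "1-01" into two integers.
# The "Y-MM" layout is an assumption about the month column.
function parse_year_month(s::AbstractString)
    year_str, month_str = split(s, '-')
    return parse(Int, year_str), parse(Int, month_str)
end

# Number of months elapsed since the first observation ("1-01" maps to 0),
# returned as Float32 so it matches the element type the models work with.
function months_since_start(s::AbstractString)
    year, month = parse_year_month(s)
    return Float32((year - 1) * 12 + (month - 1))
end

months = ["1-01", "1-02", "2-01", "3-12"]  # made-up sample values

# Option 1: month of the year as a number from 1 to 12.
month_of_year = [Float32(parse_year_month(m)[2]) for m in months]

# Option 2: a running month counter over the whole series.
elapsed = months_since_start.(months)
```

With the DataFrame from the question, this could be applied with something like `data.Month = months_since_start.(data.Month)` before calling `prepare_data`, assuming the string column is named `Month`. Which of the two encodings works better is something to test on the actual data.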
Thank you so much @sdobber
I am trying to implement TPALSTM using my own data ([link](https://www.kaggle.com/datasets/shenba/time-series-datasets/discussion?select=sales-of-shampoo-over-a-three-ye.csv)). How can I go about it?