Closed makkarss929 closed 1 month ago
Hey @makkarss929 ,
Thank you for reaching out.
Yes, Darts does look promising based on our quick scan.
An integration idea makes sense as well.
However, at the moment, we do not support time dimensions in our data layer.
Theoretically, a function to "initialize" a table or a collection as a time-series object must be defined. Also, some customizations on the .predict API call need to be done.
How do you plan on integrating with our framework?
It’s simple. We can add a time dimension to our data layer.
In Darts, we need two things in a table or collection to specify a series: time and value. Darts will do the parsing and sorting by time for us.
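As a rough illustration of the time/value idea above, here is a minimal plain-Python sketch of the parse-and-sort step. It is a stand-in, not Darts code: the `records` list, the `to_series` helper, and the field names `time` and `value` are assumptions made for the example. In Darts itself, this step is typically handled when building a `TimeSeries`, for instance via `TimeSeries.from_dataframe` with `time_col` and `value_cols`.

```python
from datetime import date

# Hypothetical records, as they might sit in a table or collection (unordered).
records = [
    {"time": "2024-01-03", "value": 5.0},
    {"time": "2024-01-01", "value": 3.0},
    {"time": "2024-01-02", "value": 4.0},
]


def to_series(rows, time_key="time", value_key="value"):
    """Parse ISO dates and return (date, value) pairs sorted by time."""
    parsed = [(date.fromisoformat(r[time_key]), r[value_key]) for r in rows]
    return sorted(parsed, key=lambda pair: pair[0])


series = to_series(records)
```

The point is only that a time column plus a value column is enough to recover an ordered series, regardless of insertion order.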
Different models take different parameters, for example input_chunk_length and output_chunk_length when initializing models.
.fit() needs epochs, series, and covariates.
.predict() needs n (the number of forecasts) and covariates.
NOTE: covariates are variables that are not themselves being forecast but help with the forecasting, such as day, week, month, temperature, and sales. It can be anything that helps to forecast.
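To make the covariates note concrete, here is a small sketch that derives calendar covariates (day of week, ISO week, month) from dates using only the standard library. The `calendar_covariates` helper and its field names are hypothetical; in a real Darts workflow these features would be turned into a `TimeSeries` before being passed as covariates, and future covariates generally need to extend over the forecast period as well.

```python
from datetime import date, timedelta


def calendar_covariates(start, periods):
    """Return one dict of calendar features per day, starting at `start`."""
    rows = []
    for offset in range(periods):
        day = start + timedelta(days=offset)
        rows.append({
            "time": day.isoformat(),
            "day_of_week": day.weekday(),   # Monday == 0
            "week": day.isocalendar()[1],   # ISO week number
            "month": day.month,
        })
    return rows


covs = calendar_covariates(date(2024, 1, 1), periods=3)
```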
See the LSTM example below:
from darts.models import RNNModel

# train_transformed, val_transformed and covariates are series prepared
# beforehand (not shown here).
my_model = RNNModel(
    model="LSTM",
    hidden_dim=20,
    dropout=0,
    batch_size=16,
    n_epochs=300,
    optimizer_kwargs={"lr": 1e-3},
    model_name="Air_RNN",
    log_tensorboard=True,
    random_state=42,
    training_length=20,
    input_chunk_length=14,
    force_reset=True,
    save_checkpoints=True,
)
my_model.fit(
    train_transformed,
    future_covariates=covariates,
    val_series=val_transformed,
    val_future_covariates=covariates,
    verbose=True,
)
# Forecast the next 26 time steps.
pred_series = my_model.predict(n=26, future_covariates=covariates)
There are lots of models, and Darts has very neat and clear documentation for all of them, like scikit-learn.
Simple models with fewer parameters.
What are your thoughts? @anitaokoh
Hi @makkarss929, it's a great idea to potentially add a time dimension to the Datalayer, but how would you do this concretely?
Currently, when we do predictions, we use single data points. So the documents look like this:
{
    "input_data": [0, 1, 3, 6],
    "_outputs": {"input_data": {"my_model": {"0": <output-of-model>}}}
}
However, with time series, I would think you have multiple inputs relevant to a prediction. How would you handle that?
Hi @blythed, we can do something like this; see the example below:
{
    "input_data": [
        {"time": "2024-01-01", "values": [0, 1, 3, 6]},
        {"time": "2024-01-02", "values": [1, 2, 4, 7]},
        {"time": "2024-01-03", "values": [2, 3, 5, 8]}
    ],
    "_outputs": {
        "my_model": {
            "2024-01-01": <output-of-model-at-2024-01-01>,
            "2024-01-02": <output-of-model-at-2024-01-02>,
            "2024-01-03": <output-of-model-at-2024-01-03>
        }
    }
}
Ok @makkarss929, that's fine, but it doesn't really reflect the real-world scenario, in which new time-series data is probably inserted into new records.
We can do something like this: the user provides input_chunk_length and output_chunk_length, and we modify the document structure accordingly.
{
    "data": [
        {
            "time": "2024-01-01",
            "input_data": [0, 1, 3, 6],
            "_outputs": {
                "my_model": [4, 5]  # <output-of-model-at-2024-01-01>
            }
        },
        {
            "time": "2024-01-02",
            "input_data": [1, 2, 4, 7],
            "_outputs": {
                "my_model": [10, 17]  # <output-of-model-at-2024-01-02>
            }
        },
        {
            "time": "2024-01-03",
            "input_data": [2, 3, 5, 8],
            "_outputs": {
                "my_model": [1, 5]  # <output-of-model-at-2024-01-03>
            }
        }
    ]
}
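One way to read the input_chunk_length / output_chunk_length idea is as a sliding window over time-ordered values. The sketch below is a simplification under stated assumptions: it uses one scalar value per record rather than the `input_data` lists in the document above, and the `make_windows` helper is hypothetical, not part of Darts or SuperDuperDB.

```python
def make_windows(values, input_chunk_length, output_chunk_length):
    """Pair each input window with the values that immediately follow it."""
    windows = []
    span = input_chunk_length + output_chunk_length
    for start in range(len(values) - span + 1):
        inputs = values[start:start + input_chunk_length]
        targets = values[start + input_chunk_length:start + span]
        windows.append((inputs, targets))
    return windows


# Hypothetical: one scalar per record, already sorted by "time".
values = [0, 1, 3, 6, 10, 15]
windows = make_windows(values, input_chunk_length=3, output_chunk_length=2)
```

With six values and a combined span of five, this yields two windows; every newly inserted record would add one more window without touching the earlier ones.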
That won't solve the problem. Imagine you have new data coming in. What do you do with it? Do you add it to an existing document, or put it in a new document? What would happen if you kept putting everything into one document?
Hi @blythed,
we could split the data into separate documents based on a time range criterion.
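A minimal sketch of what splitting by time range could look like, assuming one document per calendar month. The `bucket_by_month` helper, the `bucket` field, and the monthly granularity are all assumptions chosen for illustration; the thread does not settle on a specific criterion.

```python
from collections import defaultdict


def bucket_by_month(records):
    """Group records into one document per calendar month (YYYY-MM)."""
    buckets = defaultdict(list)
    for rec in records:
        # ISO dates start with "YYYY-MM", so the first 7 characters name the month.
        buckets[rec["time"][:7]].append(rec)
    return [
        {"bucket": month, "data": sorted(rows, key=lambda r: r["time"])}
        for month, rows in sorted(buckets.items())
    ]


records = [
    {"time": "2024-02-01", "input_data": [2, 3, 5, 8]},
    {"time": "2024-01-01", "input_data": [0, 1, 3, 6]},
    {"time": "2024-01-02", "input_data": [1, 2, 4, 7]},
]
documents = bucket_by_month(records)
```

New data then lands either in the document for its month or in a fresh one, so no single document grows without bound.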
Contact Details
makkarss929@gmail.com
Feature Description
Darts is a time series forecasting framework. We can integrate it with SuperDuperDB, which currently has no integration with a time series forecasting framework.
Why Darts?
fit() and predict() functions, similar to scikit-learn.
univariate and multivariate time series and models.
Use Case Description
It will help the many companies that consume time series data and build applications on it.
Organization
Weather forecasting, e-commerce, social media, supply and demand. It's used by companies of all sizes.
Who are the stake-holders?
No response