Closed payasparab closed 3 years ago
Hey there!
This one is slightly tricky because the features X do not contain the timestamp. Anyway, at training time I would go for the solution below, but it might require some work.
Assuming that you are using a network that can do a forward pass on tensors with an arbitrary n_assets, one can implement a custom dataloader. Specifically, this dataloader would choose a different n_assets for each batch based on the timestamps.
Once implemented, it would yield X and y with shapes

X: (batch_size, n_channels, lookback, n_assets_1), y: (batch_size, n_channels, horizon, n_assets_1)
X: (batch_size, n_channels, lookback, n_assets_2), y: (batch_size, n_channels, horizon, n_assets_2)
X: (batch_size, n_channels, lookback, n_assets_3), y: (batch_size, n_channels, horizon, n_assets_3)
...

where n_assets_i < n_assets_overall. This way the loss computation will not be affected.
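To make the idea concrete, here is a minimal numpy sketch of such a batching scheme (deepdow itself works with torch tensors, so treat this purely as an illustration; `iter_batches`, the NaN-based availability convention, and all sizes are hypothetical, not part of the library):

```python
import numpy as np

# Hypothetical setup: a full panel (n_timesteps, n_channels, n_assets_overall)
# where NaN marks assets not yet investable at a given timestamp.
rng = np.random.default_rng(0)
n_timesteps, n_channels, n_assets_overall = 60, 2, 5
lookback, horizon = 10, 3

panel = rng.normal(size=(n_timesteps, n_channels, n_assets_overall))
panel[:30, :, 3:] = np.nan  # the last two assets only become available at t=30


def iter_batches(panel, lookback, horizon, batch_size=4):
    """Yield (X, y) batches whose asset dimension only contains assets that
    are fully observed over the whole [t - lookback, t + horizon) window."""
    groups = {}  # asset-index tuple -> list of (X_i, y_i) samples
    for t in range(lookback, panel.shape[0] - horizon):
        window = panel[t - lookback:t + horizon]
        ok = ~np.isnan(window).any(axis=(0, 1))  # assets valid on every step
        key = tuple(np.flatnonzero(ok))
        X_i = window[:lookback][..., ok]  # (lookback, n_channels, n_assets_i)
        y_i = window[lookback:][..., ok]  # (horizon, n_channels, n_assets_i)
        groups.setdefault(key, []).append((X_i, y_i))
    for pairs in groups.values():
        for s in range(0, len(pairs), batch_size):
            chunk = pairs[s:s + batch_size]
            # stack and move channels first: (b, n_channels, lookback, n_assets_i)
            X = np.stack([x for x, _ in chunk]).transpose(0, 2, 1, 3)
            y = np.stack([v for _, v in chunk]).transpose(0, 2, 1, 3)
            yield X, y
```

Each batch therefore has a fixed asset set internally, while different batches can have a different n_assets; a torch version would do the same grouping inside a custom `Dataset`/`collate_fn`.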
Note that at inference time (for new timestamps) one would basically employ the same strategy. The unavailable assets would implicitly have w=0.
I am going to close this issue because of inactivity. Feel free to reopen!
Hey Jan,
Thanks for answering my previous question, I will make a pull request for the minor changes discussed.
I was also wondering how this model would be impacted by a changing number of assets over time. Would it still be usable? We want to slowly increase the number of assets, and we have a different number of assets for training and testing, which we think could be useful. Over time the universe of investable assets changes, and we want to use DeepDow to account for that.
We initially thought to zero out the returns/features on the days those assets are not available, but we ran into trouble with the loss functions (np.nan).
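For context on why zero-filling can break the losses: any Sharpe-style loss divides by the standard deviation of returns, and an asset whose returns are zero-filled over the whole horizon has zero variance, so the ratio becomes 0/0. A tiny illustration (`sharpe_like_loss` is a hypothetical stand-in, not deepdow's torch-based loss):

```python
import numpy as np

def sharpe_like_loss(returns):
    """Negative mean/std of a return series; NaN whenever std is zero."""
    return -returns.mean() / returns.std()

# An asset zero-filled over the whole horizon has zero mean AND zero std:
dead_asset = np.zeros(10)
loss = sharpe_like_loss(dead_asset)  # 0/0 -> nan
```

Dropping the unavailable assets per batch, as suggested above, avoids the zero-variance series entirely.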
Appreciate the help and we are looking forward to learning more and contributing to this excellent library.
Thanks, Payas