StatMixedML / LightGBMLSS

An extension of LightGBM to probabilistic modelling
https://statmixedml.github.io/LightGBMLSS/
Apache License 2.0

Out of sample prediction and overall questions #4

Closed: waudinio27 closed this issue 2 years ago

waudinio27 commented 2 years ago

Hello Mr. März!

Would this program be suited to predicting stock market returns with the StudentT distribution shown in the examples? Would plain Close Price data work with LightGBMLSS as well, or does it struggle with data that was not seen in the training data? Do you plan to implement a Cauchy distribution at some point? How would out-of-sample prediction work for some time steps into the future after the train/test phase?

I think your approach of combining statistics with gradient boosting is an amazing idea. I am a huge fan of probabilistic programming.

Best regards

Matthias

StatMixedML commented 2 years ago

@waudinio27 Thanks Matthias for your interest and the kind words :-)

Concerning your questions:

Would this program be suited to predicting stock market returns with the StudentT distribution shown in the examples?

Generally, I am not sure if stock market returns can be predicted with high accuracy at all. So I would leave this to your expertise. Yet, if the returns can be approximated with a StudentT, you can give it a try.
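One rough way to judge whether returns look heavy-tailed enough for a StudentT is to compare their sample excess kurtosis against the value of about zero expected for normal data. The sketch below is only an illustration and not part of LightGBMLSS: the data are synthetic, and the helper names `excess_kurtosis` and `student_t_draw` are made up for this example.

```python
import math
import random
from statistics import fmean


def excess_kurtosis(values):
    """Sample excess kurtosis: roughly 0 for normal data, positive for heavy tails."""
    m = fmean(values)
    second = fmean((v - m) ** 2 for v in values)
    fourth = fmean((v - m) ** 4 for v in values)
    return fourth / second ** 2 - 3.0


def student_t_draw(df, rng):
    """Draw one Student-t variate as normal / sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


rng = random.Random(42)

# Synthetic stand-ins for return series (assumption: real returns would go here).
normal_returns = [rng.gauss(0.0, 1.0) for _ in range(20000)]
heavy_returns = [student_t_draw(5, rng) for _ in range(20000)]

kurt_normal = excess_kurtosis(normal_returns)
kurt_heavy = excess_kurtosis(heavy_returns)
print(f"normal: {kurt_normal:.2f}, heavy-tailed: {kurt_heavy:.2f}")
```

A clearly positive excess kurtosis on real returns would be one hint that a StudentT is a more reasonable starting point than a Gaussian; it does not say anything about whether the returns are predictable.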

Would plain Close Price data work with LightGBMLSS as well, or does it struggle with data that was not seen in the training data?

This is an important question, since it highlights a key drawback of tree-based models in general: they cannot forecast beyond the range of the target seen in the training data. This holds for all tree-based models, because their forecasts are terminal-node means. Hence, for a dataset with a strong trend, basic tree-based models would provide flat forecasts that vary only with the seasonality, as shown in the following plot:

[image: plot of flat tree-based forecasts on a series with a strong trend]
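The flat-forecast behaviour can be reproduced with a toy example. The sketch below fits a hand-written one-split regression tree (a "stump") to a purely trending series; it is a simplified stand-in for a real boosted model, and `fit_stump` is a hypothetical helper written only for this illustration.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree whose two leaves predict their mean target."""
    best = None
    for split in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < split]
        right = [y for x, y in zip(xs, ys) if x >= split]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    _, split, left_mean, right_mean = best
    return lambda x: left_mean if x < split else right_mean


# Toy series with a pure upward trend: y = 2 * t for t = 0..9.
train_x = list(range(10))
train_y = [2.0 * t for t in train_x]
predict = fit_stump(train_x, train_y)

# Every future time step falls into the same right-hand leaf, so the
# forecast stays at that leaf's mean instead of following the trend.
future_preds = [predict(t) for t in (10, 20, 100)]
print(future_preds)
```

Adding more splits or more trees refines the fit inside the training range, but the prediction for any input beyond the last split is still a constant.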

However, LightGBM has an option, linear_tree=True, that fits a linear model at each leaf. This allows you to forecast data with a strong trend, for example, which is not possible with the default implementation.

Do you plan to implement a Cauchy distribution as well at one point?

Currently, this is not on my list. I suggest you try the StudentT first and see how it goes.

How would out-of-sample prediction work for some time steps into the future after the train/test phase?

You can either train the model for one-step-ahead forecasts and then roll the model forward along the forecasting steps, which makes sense if you have lagged features, rolling means, etc., or you can directly forecast the entire horizon.
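The two strategies can be sketched in plain Python. This is only an illustration under stated assumptions: `naive_trend_model` is a made-up stand-in for a trained one-step-ahead model, and `recursive_forecast` and `make_direct_training_rows` are hypothetical helpers, not LightGBMLSS functions.

```python
def recursive_forecast(history, horizon, n_lags, one_step_model):
    """Roll a one-step-ahead model forward, feeding each forecast back in as a lag."""
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_hat = one_step_model(window)
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]  # drop the oldest lag, append the forecast
    return forecasts


def make_direct_training_rows(series, n_lags, horizon):
    """Build (lags, targets) rows where the targets cover the whole horizon at once."""
    rows = []
    for end in range(n_lags, len(series) - horizon + 1):
        lags = list(series[end - n_lags:end])
        targets = list(series[end:end + horizon])
        rows.append((lags, targets))
    return rows


def naive_trend_model(lags):
    """Stand-in for a trained model: continue the last observed step."""
    return lags[-1] + (lags[-1] - lags[-2])


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Strategy 1: one-step-ahead model rolled forward over three steps.
rolled = recursive_forecast(series, horizon=3, n_lags=2, one_step_model=naive_trend_model)

# Strategy 2: training rows for models that predict the full horizon directly.
direct_rows = make_direct_training_rows(series, n_lags=2, horizon=3)

print(rolled)
print(direct_rows)
```

In the rolled-forward variant, forecast errors feed into later steps through the lag window. The direct variant avoids that but needs one model (or one output) per forecast step.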

waudinio27 commented 2 years ago

Hello Mr. März!

Thanks a lot for the detailed answers. You should consider publishing your work on Medium at some point to bring it to a bigger audience. I will try to implement a time series with linear_tree and come back to you with questions if I get stuck, if this is okay with you.

Once again, this is a great program. Maybe at some point you could also add a notebook to the examples that shows how to forecast.

And yes, you are right. It is possible that stock market prices are not predictable. I was thinking today that maybe one could use your program to build something that just follows the trend.

Greetings

StatMixedML commented 2 years ago

@waudinio27 Yes sure, let me know if you need some assistance.

StatMixedML commented 2 years ago

@waudinio27 Can I close the issue?

waudinio27 commented 2 years ago

Yes.