OpenBB-finance / OpenBB

Investment Research for Everyone, Everywhere.
https://openbb.co

[FR] Enable averaging around multiple predictions #240

Closed Felixkruemel closed 3 years ago

Felixkruemel commented 3 years ago

What's the problem of not having this feature?
Predictive models like LSTM vary a lot between runs. A command that runs the model, e.g., 10 times and averages the results would probably give better results overall.

Describe the solution you would like
Add a way to specify multiple runs of a single prediction model.

DidierRLopes commented 3 years ago

This is a great idea! This way we can even provide the standard deviation around the average prediction, something along these lines:

[Image: screenshot, 2021-03-20 at 14:55:04]

Felixkruemel commented 3 years ago

> This is a great idea! This way we can even provide the standard deviation around the average prediction, something along these lines:

Exactly! Even for short-term predictions that would work much better, I guess. And since LSTM is pretty fast, it should be doable in a reasonable amount of time.

Additionally, it should not require much new code: it should be no more than a simple loop and an averaging function.
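
A minimal sketch of that "loop plus average" idea, in Python with NumPy. This is not the project's code: `predict_once` is a hypothetical stand-in for one train-and-forecast run of a stochastic model such as an LSTM, and the price history is made up for illustration.

```python
import numpy as np


def predict_once(history, n_days, rng):
    """Hypothetical stand-in for one training + forecast run.

    A real model (e.g. an LSTM) would be trained here; this toy version
    extrapolates the mean daily change and adds random noise to mimic
    run-to-run variation.
    """
    drift = np.mean(np.diff(history))
    noise = rng.normal(0.0, 1.0, size=n_days)
    return history[-1] + drift * np.arange(1, n_days + 1) + noise


def averaged_prediction(history, n_days=5, n_loops=10, seed=0):
    """Run the model n_loops times; return per-day mean and standard deviation."""
    rng = np.random.default_rng(seed)
    runs = np.stack([predict_once(history, n_days, rng) for _ in range(n_loops)])
    return runs.mean(axis=0), runs.std(axis=0)


history = np.array([100.0, 101.0, 102.5, 103.0, 104.0])  # made-up prices
mean, std = averaged_prediction(history)
print("mean:", mean)
print("std: ", std)
```

The standard deviation across runs comes for free once the runs are stacked into one array, which is what makes plotting a band around the average prediction straightforward.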

DidierRLopes commented 3 years ago

You're right, should have thought about it.

I'm a bit busy this weekend, but I'll try to get to it during the week and let you know in here when done. :)

Felixkruemel commented 3 years ago

@DidierRLopes While I'm thinking about it, could we somehow combine the sentiment analysis with a prediction algorithm? Maybe there is even one available somewhere.

Better data quality is far more important than just a better model.

DidierRLopes commented 3 years ago

That's a good idea, although more of a long term goal, I'd say.

Can you try to dig up something that is worth giving a go, so I can have a look at it?

Felixkruemel commented 3 years ago

@DidierRLopes Have a look here: https://www.thetechplatform.com/post/sentiment-analysis-for-stock-price-prediction-in-python The chart at the bottom pretty much sums it up: positive sentiment tends to be followed by a continued rise in the stock.

Can we feed both kinds of data into the LSTM? That is, feed it not only historical chart data but also historical sentiment data. You can have a look here: https://github.com/EmielStoelinga/CCMLWI I searched all of GitHub but couldn't find anything else, so this may be something that hasn't been done in open source before. So let's do it ;)
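
One way to picture "feeding both" is as a data-shaping step: price and sentiment become two features per time step, and the series is cut into sliding windows of shape (samples, time steps, features), which is the layout recurrent layers commonly expect. The sketch below is only an illustration under that assumption; `make_windows` is a hypothetical helper, and the numbers are made up.

```python
import numpy as np


def make_windows(prices, sentiment, n_steps):
    """Build sliding windows with two features per time step.

    Returns X with shape (samples, n_steps, 2) and y with the price that
    follows each window. Hypothetical helper, not part of any library.
    """
    prices = np.asarray(prices, dtype=float)
    sentiment = np.asarray(sentiment, dtype=float)
    if prices.shape != sentiment.shape:
        raise ValueError("prices and sentiment must be aligned day by day")
    features = np.column_stack([prices, sentiment])  # shape (T, 2)
    X = np.stack(
        [features[i : i + n_steps] for i in range(len(features) - n_steps)]
    )
    y = prices[n_steps:]  # next-day price for each window
    return X, y


prices = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]  # made-up closes
sentiment = [0.1, 0.3, -0.2, 0.0, 0.4, 0.5]          # made-up daily scores
X, y = make_windows(prices, sentiment, n_steps=3)
print(X.shape, y.shape)
```

In practice both columns would likely need scaling before training, since prices and sentiment scores live on very different ranges.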

I should probably open a new issue for that, right?

DidierRLopes commented 3 years ago

Nice!!

Instead of opening a new issue, feel free to add these 2 ideas to the ROADMAP as a PR! The 1st one in Short-term, the 2nd one in Long-term.

I've personally not worked with MISO (multiple-input, single-output) NNs, so I'm quite keen on giving it a go. However, I feel like there are some other smaller tasks that have higher priority. But I'll definitely get around to this one.

Regarding that 1st link, I'm pretty sure our ba/sentiment command does what that post shows; it just doesn't output it centred with the stock price. From there it would just be a matter of learning how to feed it to a NN :)

Also, yesterday and today I've been focusing on the comparison analysis menu, to compare several stocks. Today I merged the comparison between financials (income, balance, cashflow). And I just finished developing the same comparison for sentiment (using data provided by FinBrain Technologies):

[Image: screenshot, 2021-03-20 at 15:57:06]

and their correlation:

[Image: screenshot, 2021-03-20 at 15:57:01]

Won't have time to create the PR today I think, but will do tomorrow :)

Felixkruemel commented 3 years ago

> Instead of opening a new issue, feel free to add these 2 ideas to the ROADMAP as a PR! The 1st one in Short-term, the 2nd one in Long-term.

Just opened a PR :)

DidierRLopes commented 3 years ago

Just opened a PR for this. I think it looks really good.

I used the median values and, for the confidence interval, the 10% and 90% quantiles.

Let me know if you don't agree. My reasoning is that if we iterate 10 times and there are some outliers, the median is more robust to them, whereas the average would be pulled by an outlier and shift the prediction incorrectly.

As for the quantiles, it made sense that for every 10 loops, the lowest and highest values are not considered.
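
A small sketch of this reasoning with NumPy, using made-up numbers: nine runs that agree and one outlier. It is only an illustration of median versus mean and of the 10%/90% quantile band; `summarize_runs` is a hypothetical helper, not the code from the PR.

```python
import numpy as np


def summarize_runs(runs):
    """Per-day median plus 10% and 90% quantiles across repeated runs.

    `runs` has shape (n_loops, n_days). Hypothetical helper for illustration.
    """
    runs = np.asarray(runs, dtype=float)
    median = np.median(runs, axis=0)
    lower = np.quantile(runs, 0.10, axis=0)
    upper = np.quantile(runs, 0.90, axis=0)
    return median, lower, upper


# Nine runs that agree, plus one outlier run (made-up values).
runs = np.array([[10.0, 11.0]] * 9 + [[50.0, 60.0]])

median, lower, upper = summarize_runs(runs)
mean = runs.mean(axis=0)

print("mean:  ", mean)    # pulled towards the outlier
print("median:", median)  # stays with the majority of runs
print("band:  ", lower, upper)
```

Note that `np.quantile` interpolates linearly between neighbouring values by default, so with 10 runs the band edges sit close to, but not exactly on, the second-lowest and second-highest values.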

We can still discuss this. I had an option to select the confidence interval, but I thought it wasn't worth adding, so I removed it.

Felixkruemel commented 3 years ago

@DidierRLopes This seems really nice! I think the choice of quantiles is pretty robust.

There may be other opinions, but I'm with you there. The result looks very nice!

DidierRLopes commented 3 years ago

Closing this since #252 implements this.

Example of backtesting + loops:

[Image: ex1]

[Image: ex2]

Clearly didn't try to tweak the model :)

PS: I'll create a README file on how to tune NN models