braverock / quantstrat

Implement Triple Barrier Method from "Advances in Financial Machine Learning" - De Prado 2018 #105

Open jaymon0703 opened 5 years ago

jaymon0703 commented 5 years ago

De Prado (https://www.amazon.com/Advances-Financial-Machine-Learning-Marcos/dp/1119482089) introduces the Triple Barrier Method for labeling observations in a potential ML model for time series. The 3 barriers are 2 horizontal barriers (representing profit-taking and stop-loss prices) and one vertical barrier (representing an expiration timespan, i.e. x number of bars). If the upper horizontal barrier is touched first, the observation gets a label of +1. If the lower horizontal barrier is touched first, the observation gets a label of -1. If the vertical barrier is touched first, De Prado suggests 2 options: give the observation either the sign of the return or a zero. Your choice should depend on the problem you are trying to solve.
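As a rough illustration of the labeling rule described above, here is a minimal Python sketch. It is not De Prado's code and not part of quantstrat. The function name `triple_barrier_label`, its parameters, and the choice of fixed fractional barriers around the entry price are all assumptions made for the example (the book scales the barriers by a volatility estimate instead).

```python
from typing import Sequence


def triple_barrier_label(prices: Sequence[float], start: int, pt: float, sl: float,
                         horizon: int, sign_on_timeout: bool = False) -> int:
    """Label the observation at index `start` (hypothetical helper).

    pt / sl are fractional distances from the entry price (an assumption),
    horizon is the number of bars until the vertical barrier.
    """
    entry = prices[start]
    upper = entry * (1.0 + pt)  # profit-taking barrier
    lower = entry * (1.0 - sl)  # stop-loss barrier
    end = min(start + horizon, len(prices) - 1)  # vertical barrier, capped at the data end

    for i in range(start + 1, end + 1):
        if prices[i] >= upper:
            return 1   # upper barrier touched first
        if prices[i] <= lower:
            return -1  # lower barrier touched first

    # Vertical barrier reached first: either zero or the sign of the return.
    if not sign_on_timeout:
        return 0
    ret = prices[end] - entry
    return (ret > 0) - (ret < 0)


prices = [100, 101, 103, 99, 98, 100.5]
label_up = triple_barrier_label(prices, 0, pt=0.02, sl=0.02, horizon=5)
label_down = triple_barrier_label(prices, 2, pt=0.02, sl=0.02, horizon=3)
```

The `sign_on_timeout` flag switches between the two options for the vertical barrier, so the same function covers both choices.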

De Prado says the output from the function should be a dataframe containing the timestamps at which any barrier was touched. De Prado also mentions 8 possible configurations of the "barrier triplet", in which each barrier is enabled or disabled to reflect different objectives and constraints. For our replication of the method, we should consider all configurations, some of which overlap with the trading strategy posed by Olsen et al. in https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2951348.
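One possible way to expose both the touch timestamps and the 8 on/off configurations is to make each barrier optional. The sketch below is only an assumption about how that could look: `barrier_touches` and `first_touch` are hypothetical names, barriers are fixed fractional returns, and timestamps are plain ISO date strings rather than a dataframe index.

```python
from typing import Dict, Optional, Sequence, Tuple


def barrier_touches(times: Sequence[str], prices: Sequence[float], start: int,
                    pt: Optional[float] = None, sl: Optional[float] = None,
                    horizon: Optional[int] = None) -> Dict[str, Optional[str]]:
    """Return the first touch time per barrier; None disables a barrier."""
    entry = prices[start]
    last = len(prices) - 1
    end = last if horizon is None else min(start + horizon, last)
    # The vertical barrier only counts if it falls inside the available data.
    vertical = None
    if horizon is not None and start + horizon <= last:
        vertical = times[start + horizon]
    touches: Dict[str, Optional[str]] = {"pt": None, "sl": None, "vertical": vertical}

    for i in range(start + 1, end + 1):
        ret = prices[i] / entry - 1.0
        if pt is not None and touches["pt"] is None and ret >= pt:
            touches["pt"] = times[i]
        if sl is not None and touches["sl"] is None and ret <= -sl:
            touches["sl"] = times[i]
    return touches


def first_touch(touches: Dict[str, Optional[str]]) -> Optional[Tuple[str, str]]:
    """Earliest (barrier, time) pair, or None if nothing was touched."""
    hit = [(name, t) for name, t in touches.items() if t is not None]
    return min(hit, key=lambda pair: pair[1]) if hit else None


times = ["2020-01-01", "2020-01-02", "2020-01-03",
         "2020-01-04", "2020-01-05", "2020-01-06"]
prices = [100, 101, 103, 99, 98, 100.5]
all_on = barrier_touches(times, prices, 0, pt=0.02, sl=0.015, horizon=4)
only_vertical = barrier_touches(times, prices, 0, horizon=2)
```

Passing `None` for any of `pt`, `sl`, or `horizon` disables that barrier, which gives the 2 x 2 x 2 = 8 configurations. The label can then be derived from whichever barrier `first_touch` reports.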

There could be a few applications of this method in quantstrat, but our initial focus will be on extending it to signal analysis for any periodicity.

TODO: Add more info?

braverock commented 3 years ago

There is R code for the triple barrier method in

http://www.mlfactor.com/Data.html#the-triple-barrier-method

It is all tidy, and focused on labeling for supervised learning. It should be a good start.

cl7805 commented 3 years ago

Hi, marvellous explanation above. The example given in the book is intraday. I'm doing swing trading and I'm stuck on the distinction between Timedelta and span. What does span=100 mean? Second, I get a KeyError on Timestamp when using Yahoo Finance data. I assume it means the data is missing; does that mean I should discard the observation?

jaymon0703 commented 3 years ago

Could you please share a link to the R code? I cannot see it anywhere, nor the terms you mention. I am assuming you are referring to http://www.mlfactor.com/Data.html#the-triple-barrier-method

cl7805 commented 3 years ago

https://towardsdatascience.com/financial-machine-learning-part-1-labels-7eeed050f32e

cl7805 commented 3 years ago

No, it's Python.