braverock / quantstrat

288 stars 115 forks source link

Error "Walk.forward()": Error in dividing dataset for "days" period? #90

Open N0talent opened 6 years ago

N0talent commented 6 years ago

when i run the demo "/demo/luxor.8.walk.forward.R" i get following Error: Error in[.xts(symbol.data, testing.start.v) : subscript out of bounds

I looked into "walk.forward.R" and narrowed the "problem" down. I seemd to me as if walk.forward() is deviding the preiods wrong. In my case the Ending point of the is nrows(mktdata)+1

testing.start.v <- 1+training.end.v I think here is the "problem" in row 141 in walk.forward.R

N0talent commented 6 years ago

[first Analysis] this error come from the way training.end.v is calculated. It works fine if k.traing and k.testing ist >1. for example: walk.forward(.....,k.training = 36, k.testing = 12, ....) i get following Testing and Training intervals:

                 [,1] [,2] [,3] [,4] [,5]
training.start.v    1  232  483  735  987
training.end.v    734  986 1239 1490 1741
testing.start.v   735  987 1240 1491 1742
testing.end.v     986 1239 1490 1741 1865

walk.forward(.....,k.training = 2, k.testing = 1, ....) i get following Testing and Training intervals:

                 [,1] [,2] [,3] [,4] [,5] [,6]
training.start.v    1    1 1489 2833 4319 5759
training.end.v   1488 2832 4318 5758 7246 8014
testing.start.v  1489 2833 4319 5759 7247 8015
testing.end.v    2832 4318 5758 7246 8014 8014

Note: obv example one is created on a different symbol.data[] than example 2.

N0talent commented 6 years ago

[Analysis 2] walk.forward(.....,k.training = 36, k.testing = 12, ....) i get following Testing and Training intervals:

                 [,1] [,2] [,3] [,4] [,5]
training.start.v    1  232  483  735  987
training.end.v    734  986 1239 1490 1741
testing.start.v   735  987 1240 1491 1742
testing.end.v     986 1239 1490 1741 1865

the dates behind the calculated Index are:

 training.start training.end testing.start testing.end
1     2007-01-03   2009-11-30    2009-12-01  2010-11-30
2     2007-12-03   2010-11-30    2010-12-01  2011-11-30
3     2008-12-01   2011-11-30    2011-12-01  2012-11-30
4     2009-12-01   2012-11-30    2012-12-03  2013-11-29
5     2010-12-01   2013-11-29    2013-12-02  2014-05-30

k.training are 36 months and k.testing are 12. All the Calculated periods here are 1 month short.

I would like tp propose following sollution: replace Current Code (row 127-132):

  training.end.v   <- ep[c(k.training,k.training+cumsum(rep(k.testing,as.integer((length(ep)-k.training)/k.testing))))]
  if( is.na(last(training.end.v)) ) {
    training.end.v <- training.end.v[-length(training.end.v)]
  }

  training.start.v <- c(1,1+ep[cumsum(rep(k.testing,as.integer((length(ep)-k.training)/k.testing)))])

with following (new) code:

  # construct the subsets to use for training/testing
  #define first Training interval
  first.training.end<-1+k.training    
  #Calculate how many complete training periods are in the Dataset. 
  len<-length(ep[-length(ep)])    #Removed last period, because ep[nrow(ep)]-ep[nrow(ep)-1] isnt always a full period (e.g.period = 'months'= 01.01-01.15)
  trainingperiods.total<-as.integer((len-first.training.end)/k.testing)
  training.steps<-rep(k.testing,trainingperiods.total)

  #construct index for ep
  ii<-first.training.end+cumsum(training.steps)
  i<-c(first.training.end,ii)
  training.end.v   <- ep[i]

  #cunstruct Training starting points by subtracting k.training from calculated training endpoints
  first.training.start<-1
  i<-ii-k.training 
  training.start.v <- c(first.training.start,ep[i])

using this Code, the dates behind the calculated Index are:

  training.start training.end testing.start testing.end
1     2007-01-03   2009-12-31    2010-01-04  2010-12-31
2     2007-12-31   2010-12-31    2011-01-03  2011-12-30
3     2008-12-31   2011-12-30    2012-01-03  2012-12-31
4     2009-12-31   2012-12-31    2013-01-02  2013-12-31
5     2010-12-31   2013-12-31    2014-01-02  2014-05-30
ssh352 commented 5 years ago

same problem here

braverock commented 5 years ago

@ssh352 you need to give us more context, and an example, beyond 'same problem here'.

Which version of the code are you running? Where is the reproducible example.

It is not clear to me that this issue is real, and the proposed patch has conflicts with the trunk, and whitespace changes which make it very difficult to evaluate the patch.

braverock commented 5 years ago

I'll provide a little more context.

There are always going to be combinations of training and testing where the last training set could end on the last observation of the data, or there are not enough observations for a complete training or testing set.

I do not believe that starting from -1 as an index solves this problem in a general way.

I added changes in March 2019 to catch and correct for three more of these edge cases, but I won't be terribly surprised if I have still missed one or more edge cases.

So if you are experiencing this problem, we need a parsimonious reproducible example, and a parsimonious patch or diagnosis to fix the problem. Otherwise, I can only fix things that I can demonstrate and test against.

braverock commented 5 years ago

As a note, I am pretty sure other recent changes covered a variety of edge cases that I mentioned above, including the edge case mentioned by the original poster. I looked closely at the proposed changes, and added one that handles the endpoint of the training periods, which, as far as I can tell, is the only unresolved issue.

Please test and report. Thanks!

ssh352 commented 5 years ago

@braverock sorry for not providing more details.

I sourced luxor.1 to luxor.8 in demo directory one by one. Two of them have errors: the first one is when I run luxor.4.paramset.timespan.R

Screen Shot 2019-03-28 at 9 29 37 AM

the second one is when I run luxor.8.walk.forward.R

Screen Shot 2019-03-28 at 9 31 52 AM

my sessioninfo

Screen Shot 2019-03-28 at 9 32 54 AM
braverock commented 5 years ago

@ssh352 Thanks for the additional detail! That is really helpful.

Could you try updating quantstrat to the current trunk please?

I think I have fixed both issues that you show in your screen shots, so validating that would be a great service.

ssh352 commented 5 years ago

@braverock So I updated quantstrat using devtools. I ran luxor.1 to 8 one by one.

at luxor.3.paramset.sma.R, there is no error, but the result is suspicious, stats shouldn't be NULL

Screen Shot 2019-03-28 at 10 14 13 AM

at luxor.4.paramset.timespan.R, stats shouldn't be NULL

Screen Shot 2019-03-28 at 10 14 55 AM

I got the following errors at for the luxor.6 scripts.

Screen Shot 2019-03-28 at 9 45 03 AM Screen Shot 2019-03-28 at 9 43 51 AM

at luxor.8 there is a new error

Screen Shot 2019-03-28 at 9 46 03 AM
jaymon0703 commented 5 years ago

Re-opening this as i am getting dubious results for Luxor as well, and we have identified issues with WFA when running for other non-demo portfolios. Depending on progress for those portfolios and whether the fix resolves Luxor remains to be seen. Nevertheless, we are actively looking into WFA.

ssh352 commented 5 years ago

@jaymon0703 Guy Yollin has used quantstrat to demonstrate WFA, so it must have worked in previous versions before. Some new code must have caused regression. Will it help to add more unit and regression tests to quantstrat to catch such regression? Thank you!

jaymon0703 commented 5 years ago

There is likely regression, although macdWFA runs to completion it still has a test portfolio that is polluted with training window transactions. @braverock is working on making the test portfolio unique which should resolve that and hopefully other issues. Tests would be great and pull requests are welcomed. Running the demos with Jenkins was floated as an option.

jaymon0703 commented 5 years ago

hi @ssh352 i am able to get results for all the luxor demos (incl the luxor.6 scripts) on version 0.15.6 except for the last one luxor.8.walk.forward which returns no results for the objective function. This is likely a symptom of too short a backtest. We will be adding more length to the GBPUSD dataset in due course.

jaymon0703 commented 5 years ago

hi @ssh352 can you please re-test using version 0.16.1? if this is not an issue anymore, then we should close it. we have done a fair bit with walk.forward over the last few months...mainly for ensuring the test portfolio is not polluted with train transactions. our testing indicates everything is working as expected.

ssh352 commented 5 years ago

@jaymon0703 Hi sorry must have missed the earlier reply. I sourced luxor.1 to luxor.8 in demo directory one by one. I only got an error on Luxor.8

Screen Shot 2019-06-08 at 11 45 06 PM
jaymon0703 commented 5 years ago

Thanks @ssh352 i get the same error and suspect it has to do with the limited number of observations in the market data. I think for now we can leave the issue open until we increase the size of the market data, then re-test. Im confident all is in order as we have run a number of demos (macdWFA for example) and proprietary walk forward analyses in the last few weeks/months and the output seems sound.

ssh352 commented 5 years ago

@jaymon0703 Hi macdWFA still has error, Please advise.

Screen Shot 2019-06-25 at 1 59 07 PM Screen Shot 2019-06-25 at 1 58 03 PM
jaymon0703 commented 5 years ago

Hmm I will check in a few hours.

On Tue, 25 Jun 2019, 08:18 shenry, notifications@github.com wrote:

@jaymon0703 https://github.com/jaymon0703 Hi macdWFA still has error, Please advise.

[image: Screen Shot 2019-06-25 at 1 59 07 PM] https://user-images.githubusercontent.com/41779116/60073817-05b05500-9754-11e9-8f9d-e264d2755615.png

[image: Screen Shot 2019-06-25 at 1 58 03 PM] https://user-images.githubusercontent.com/41779116/60073820-0648eb80-9754-11e9-99d7-bcbe05b91d3b.png

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/braverock/quantstrat/issues/90?email_source=notifications&email_token=ABBDTPAK6X2E6EKHCKEYFC3P4G2DFA5CNFSM4FHMVCFKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYPESEA#issuecomment-505301264, or mute the thread https://github.com/notifications/unsubscribe-auth/ABBDTPGCZYCQ52YNC5VWI73P4G2DFANCNFSM4FHMVCFA .

ssh352 commented 5 years ago

update: I just updated blotter to newest version, from blotter 0.14.2 to HEAD (0.14.3). it fixed it.

Screen Shot 2019-06-25 at 3 55 22 PM
jaymon0703 commented 5 years ago

Very good...it is a script we test with and which would have passed, so i was surprised to see your error, although im unsure how a blotter update would have resolved it. Nevertheless, glad its working. Still on my todo list is to update the GBPUSD mktdata object to include more observations so we can properly test Luxor.8 and hopefully close this item...