eddelbuettel / rquantlib

R interface to the QuantLib library

unexpected businessDaysBetween() behavior #30

Closed (flodel closed 8 years ago)

flodel commented 8 years ago

Hello Dirk,

The code below computes the number of business days left until the end of the month:

businessDaysLeft <- function(dates, calendar = "UnitedStates/NYSE") {
   eom <- getEndOfMonth(calendar = calendar, dates = dates)
   businessDaysBetween(calendar = calendar, from = dates, to = eom,
                       includeFirst = TRUE, includeLast = TRUE)
}

Let's take the business days of January 2015 as an example (any other month exhibits the same issue) and compute, for each day, the number of business days left within the month:

days <- seq(from = as.Date("2015-01-01"), to = as.Date("2015-01-31"), by = 1)
days <- days[isBusinessDay(calendar = "UnitedStates/NYSE", days)]
businessDaysLeft(days)
 # [1] 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  0

I find the last result (0) unintuitive. One business day before the last business day of the month, the result is 2, which makes sense from a trading point of view: two sessions left. So on the last business day of the month I expect the result to be 1 (one session left). The jump from 2 to 0, when every other step is exactly 1, also points towards an inconsistency.

It seems that the QuantLib code at http://quantlib.sourcearchive.com/documentation/0.8.1/classQuantLib_1_1Calendar_6354ede76e4e889c360ee26fcdca823b.html will return 1 for the last business day of the month, if I understand the code correctly. So maybe it has to do with how the code was ported to R.

Thanks for looking into this.

eddelbuettel commented 8 years ago

At first I thought this was a function of the month we are looking at, but it doesn't seem so. I get similar ... 2 0 outcomes for Sep and Dec 2015 where the last day is not a weekend.

Can you possibly experiment a little with the underlying code and other settings?

eddelbuettel commented 8 years ago

Setting includeLast=FALSE instead of includeLast=TRUE makes a difference:

edd@max:/tmp$ cat flodel.R 
#!/usr/bin/r

library(RQuantLib)

businessDaysLeft <- function(dates, calendar = "UnitedStates/NYSE") {
    eom <- getEndOfMonth(calendar = calendar, dates = dates)
    businessDaysBetween(calendar = calendar, from = dates, to = eom, includeFirst = TRUE, includeLast = FALSE)
}

days <- seq(from = as.Date("2015-01-01"), to = as.Date("2015-01-31"), by = 1)
print(days <- days[isBusinessDay(calendar = "UnitedStates/NYSE", days)])
print(businessDaysLeft(days))
edd@max:/tmp$ ./flodel.R 
 [1] "2015-01-02" "2015-01-05" "2015-01-06" "2015-01-07" "2015-01-08" "2015-01-09" "2015-01-12" "2015-01-13" "2015-01-14" "2015-01-15" "2015-01-16"
[12] "2015-01-20" "2015-01-21" "2015-01-22" "2015-01-23" "2015-01-26" "2015-01-27" "2015-01-28" "2015-01-29" "2015-01-30"
 [1] 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
edd@max:/tmp$ 

rexmacey commented 8 years ago

A good example to start with is Sep 28, 2015 thru Sep 30, 2015. These are the last 3 trading/business days of that month. I'd like an answer of 3,2,1 (or even 2,1,0). I cannot imagine why these dates would return the same value (e.g. 1,0,0) nor why the difference between two values would exceed 1 (e.g., 3,2,0). They are each 1 business day from their neighbor.

includeFirst=FALSE, includeLast=FALSE returns 1,0,0 
includeFirst=FALSE, includeLast=TRUE returns 2,1,0
includeFirst=TRUE, includeLast=FALSE returns 2,1,0 
includeFirst=TRUE, includeLast=TRUE returns 3,2,0 

From the code

tradingDaysLeftInMonth <- function(myDate, calendar = "UnitedStates/NYSE", IF, IL) {
    # Use the calendar argument here rather than a hard-coded calendar name
    EOM <- getEndOfMonth(calendar = calendar, dates = myDate)
    businessDaysBetween(calendar, from = myDate, to = EOM,
                        includeFirst = IF, includeLast = IL)
}

days <- seq(from = as.Date("2015-09-01"), to = as.Date("2015-09-30"), by = 1)
tradingDaysLeftInMonth(days, "UnitedStates/NYSE", FALSE, FALSE)
tradingDaysLeftInMonth(days, "UnitedStates/NYSE", FALSE, TRUE)
tradingDaysLeftInMonth(days, "UnitedStates/NYSE", TRUE, FALSE)
tradingDaysLeftInMonth(days, "UnitedStates/NYSE", TRUE, TRUE)

So it seems that the T/F or F/T are better. But let's look at the rest of the month:

tradingDaysLeftInMonth(days,"UnitedStates/NYSE",FALSE,FALSE)
 [1] 19 18 17 16 16 16 16 15 14 13 12 12 12 11 10  9  8  7  7  7  6  5  4  3  2  2  2  1  0  0
 tradingDaysLeftInMonth(days,"UnitedStates/NYSE",FALSE,TRUE)
 [1] 20 19 18 17 17 17 17 16 15 14 13 13 13 12 11 10  9  8  8  8  7  6  5  4  3  3  3  2  1  0
 tradingDaysLeftInMonth(days,"UnitedStates/NYSE",TRUE,FALSE)
 [1] 20 19 18 17 16 16 16 16 15 14 13 12 12 12 11 10  9  8  7  7  7  6  5  4  3  2  2  2  1  0
 tradingDaysLeftInMonth(days,"UnitedStates/NYSE",TRUE,TRUE)
 [1] 21 20 19 18 17 17 17 17 16 15 14 13 13 13 12 11 10  9  8  8  8  7  6  5  4  3  3  3  2  0

Here are the days of the month. The non-trading days are 05, 06, 07, 12, 13, 19, 20, 26 and 27 (9/7 was Labor Day):

01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

I'd like a result that could be interpreted as "Including today there are X business days left", which would be:

21 20 19 18 17 17 17 17 16 15 14 13 13 13 12 11 10 09 08 08 08 07 06 05 04 03 03 03 02 01

although I'd be fine with the previous result less 1, which could be interpreted as "After today there are X business days left".
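One way to pin down the behavior asked for here is to let the two flags decide whether each endpoint belongs to the counted interval. The sketch below is a standalone illustration, not QuantLib code: `isBusinessDaySep2015` and `countBusinessDays` are hypothetical names, dates are plain day-of-month integers, and the toy calendar only covers September 2015.

```cpp
#include <functional>

// Toy calendar for September 2015 only (an assumption for illustration):
// weekends and Labor Day (Sep 7) are closed.
bool isBusinessDaySep2015(int day) {
    int weekday = (day + 1) % 7;   // 0 = Sunday; Sep 1, 2015 was a Tuesday
    bool weekend = (weekday == 0 || weekday == 6);
    return !weekend && day != 7;
}

// Count business days between from and to (assumes from <= to).
// The flags select [from, to], [from, to), (from, to] or (from, to).
int countBusinessDays(int from, int to, bool includeFirst, bool includeLast,
                      const std::function<bool(int)>& isBusinessDay) {
    int first = includeFirst ? from : from + 1;
    int last  = includeLast  ? to   : to - 1;
    int count = 0;
    for (int d = first; d <= last; ++d) {
        if (isBusinessDay(d)) ++count;
    }
    return count;
}
```

Under this definition the last three trading days of the month (28, 29, 30) give 3, 2, 1 with both flags set and 2, 1, 0 with exactly one flag set, so neighboring business days always differ by one.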

Thank you.

flodel commented 8 years ago

I have written this little C++ program, which also gives "suspicious" results:

#include <ql/quantlib.hpp>
#include <iostream>

using namespace QuantLib;

int main() {

   Calendar cal = UnitedStates(UnitedStates::NYSE);
   Date date(1, Jan, 2015);
   Date eom(30, Jan, 2015);

   std::cout << "DATE IS_BUSINESS TT TF FT FF" << std::endl;

   while (date <= eom) {
      bool is_business = cal.isBusinessDay(date);
      int dTT = cal.businessDaysBetween(date, eom, true, true);
      int dTF = cal.businessDaysBetween(date, eom, true, false);
      int dFT = cal.businessDaysBetween(date, eom, false, true);
      int dFF = cal.businessDaysBetween(date, eom, false, false);
      std::cout << date << " "
                << is_business << " "
                << dTT << " "
                << dTF << " "
                << dFT << " "
                << dFF << std::endl;
      date++;
   }
}

which gives:

DATE IS_BUSINESS TT TF FT FF
January 1st, 2015 0 20 19 20 19
January 2nd, 2015 1 20 19 19 18
January 3rd, 2015 0 19 18 19 18
January 4th, 2015 0 19 18 19 18
January 5th, 2015 1 19 18 18 17
January 6th, 2015 1 18 17 17 16
January 7th, 2015 1 17 16 16 15
January 8th, 2015 1 16 15 15 14
January 9th, 2015 1 15 14 14 13
January 10th, 2015 0 14 13 14 13
January 11th, 2015 0 14 13 14 13
January 12th, 2015 1 14 13 13 12
January 13th, 2015 1 13 12 12 11
January 14th, 2015 1 12 11 11 10
January 15th, 2015 1 11 10 10 9
January 16th, 2015 1 10 9 9 8
January 17th, 2015 0 9 8 9 8
January 18th, 2015 0 9 8 9 8
January 19th, 2015 0 9 8 9 8
January 20th, 2015 1 9 8 8 7
January 21st, 2015 1 8 7 7 6
January 22nd, 2015 1 7 6 6 5
January 23rd, 2015 1 6 5 5 4
January 24th, 2015 0 5 4 5 4
January 25th, 2015 0 5 4 5 4
January 26th, 2015 1 5 4 4 3
January 27th, 2015 1 4 3 3 2
January 28th, 2015 1 3 2 2 1
January 29th, 2015 1 2 1 1 0
January 30th, 2015 1 0 0 0 0

The results are consistent with R:

library(RQuantLib)
start <- as.Date("2015-01-01")
eom   <- as.Date("2015-01-30")
days <- seq(from = start, to = eom, by = 1)
cal <- "UnitedStates/NYSE"
is_business <- isBusinessDay(calendar = cal, dates = days)
dTT <- sapply(days, businessDaysBetween, calendar = cal, to = eom, includeFirst = TRUE,  includeLast = TRUE)
dTF <- sapply(days, businessDaysBetween, calendar = cal, to = eom, includeFirst = TRUE,  includeLast = FALSE)
dFT <- sapply(days, businessDaysBetween, calendar = cal, to = eom, includeFirst = FALSE, includeLast = TRUE)
dFF <- sapply(days, businessDaysBetween, calendar = cal, to = eom, includeFirst = FALSE, includeLast = FALSE)
print(data.frame(
   DATE = days,
   IS_BUSINESS = is_business,
   TT = dTT,
   TF = dTF,
   FT = dFT,
   FF = dFF
))

In conclusion, nothing wrong with RQuantLib. At most something wrong with QuantLib, or something wrong with my expectations. I am still puzzled as I could swear the QuantLib results are not consistent with the code (pseudocode?) at http://quantlib.sourcearchive.com/documentation/0.8.1/classQuantLib_1_1Calendar_6354ede76e4e889c360ee26fcdca823b.html. I'll keep exploring if I can.
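For what it is worth, every row of the table above can be reproduced by one simple rule: count the closed interval, subtract each excluded endpoint that is a business day, and return 0 outright when the two dates coincide. The sketch below encodes that rule. It is a hypothesis inferred from the printed output, not a copy of the QuantLib source. `isBusinessDayJan2015`, `observedBetween` and `sessionsLeft` are hypothetical names, dates are day-of-month integers, and the toy calendar only covers January 2015.

```cpp
// Toy calendar for January 2015 only (an assumption for illustration):
// weekends, New Year's Day (Jan 1) and Martin Luther King Jr. Day (Jan 19) are closed.
bool isBusinessDayJan2015(int day) {
    int weekday = (day + 3) % 7;   // 0 = Sunday; Jan 1, 2015 was a Thursday
    bool weekend = (weekday == 0 || weekday == 6);
    return !weekend && day != 1 && day != 19;
}

// Hypothetical rule inferred from the table (assumes from <= to).
int observedBetween(int from, int to, bool includeFirst, bool includeLast) {
    if (from == to) return 0;      // matches the January 30th row: 0 0 0 0
    int count = 0;
    for (int d = from; d <= to; ++d) {
        if (isBusinessDayJan2015(d)) ++count;
    }
    if (isBusinessDayJan2015(from) && !includeFirst) --count;
    if (isBusinessDayJan2015(to) && !includeLast) --count;
    return count;
}

// Possible workaround for "sessions left including today":
// count [from, to) and add the end date itself if it is a business day.
int sessionsLeft(int from, int to) {
    return observedBetween(from, to, true, false)
         + (isBusinessDayJan2015(to) ? 1 : 0);
}
```

If the rule holds, the TF count never depends on the special case, so adding the end date back by hand yields 2 on January 29th and 1 on January 30th. The same idea should carry over to R by adding `isBusinessDay(calendar, eom)` to the `includeFirst = TRUE, includeLast = FALSE` result.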

Dirk, thanks (and sorry!)

Florent.

eddelbuettel commented 8 years ago

This function is a pretty simple wrapper around a QuantLib function:

// [[Rcpp::export]]
std::vector<double> businessDaysBetween(std::string calendar, 
                                        std::vector<QuantLib::Date> from, 
                                        std::vector<QuantLib::Date> to,
                                        bool includeFirst=true, bool includeLast=false) {
    boost::shared_ptr<QuantLib::Calendar> pcal(getCalendar(calendar));
    int n = from.size();
    std::vector<double> between(n);
    for (int i=0; i<n; i++) {
        between[i] = pcal->businessDaysBetween(from[i], to[i], includeFirst, includeLast);
    }
    return between;
}

Could either of you poke a toe into the C++ level and see what happens there?

eddelbuettel commented 8 years ago

You rock. I had this sitting in the editor here writing a follow-up when you wrote yours:

// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; indent-tabs-mode: nil; -*-

// [[Rcpp::depends(RQuantLib)]]

#include <RQuantLib.h>

// [[Rcpp::export]]
double bdaysBetween(QuantLib::Date from, QuantLib::Date to,  bool includeFirst=true, bool includeLast=false) {
    boost::shared_ptr<QuantLib::Calendar> pcal;
    pcal.reset(new QuantLib::UnitedStates(QuantLib::UnitedStates::NYSE));
    return pcal->businessDaysBetween(from, to, includeFirst, includeLast);
}

/*** R
from <- as.Date("2015-09-28")
to <- as.Date("2015-09-30")
bdaysBetween(from, to, TRUE, TRUE)
bdaysBetween(from, to, TRUE, FALSE)
bdaysBetween(from, to, FALSE, TRUE)
bdaysBetween(from, to, FALSE, FALSE)
*/

which from R gives

R>  sourceCpp("/tmp/flodel.cpp")

R> from <- as.Date("2015-09-28")

R> to <- as.Date("2015-09-30")

R> bdaysBetween(from, to, TRUE, TRUE)
[1] 3

R> bdaysBetween(from, to, TRUE, FALSE)
[1] 2

R> bdaysBetween(from, to, FALSE, TRUE)
[1] 2

R> bdaysBetween(from, to, FALSE, FALSE)
[1] 1
R> 

eddelbuettel commented 8 years ago

But I want to thank you for 'going down to C++ code' to test this. Maybe there is something to be taken away from this after all -- maybe a quick write-up for the Rcpp Gallery on how to test R/C++ interactions (i.e. by avoiding them :) ?