enricoschumann / PMwR

Portfolio Management with R: Backtesting investment and trading strategies, computing profit-and-loss and returns, reporting, and more.
http://enricoschumann.net/PMwR
Other
61 stars 11 forks source link

Wrong result with unit_prices function #1

Closed MislavSag closed 1 year ago

MislavSag commented 1 year ago

I have following NAv and cf object:

library(data.table)
NAV = structure(list(timestamp = structure(c(19312, 19313, 19314, 19317, 
                                             19318, 19319, 19320, 19321, 19324, 19325, 19326, 19327, 19328, 
                                             19331, 19332, 19333, 19334, 19335, 19338, 19339, 19340, 19341, 
                                             19342, 19345, 19346, 19347, 19348, 19349, 19352, 19353, 19354, 
                                             19355, 19356, 19359, 19360, 19361, 19362, 19363, 19366, 19367, 
                                             19368, 19369, 19370, 19373, 19374, 19375, 19376, 19377, 19380, 
                                             19381, 19382, 19383, 19384, 19387, 19388, 19389, 19390, 19391, 
                                             19394, 19395, 19396, 19397, 19398, 19401, 19402, 19403, 19404, 
                                             19405, 19408, 19409, 19410, 19411, 19412, 19415, 19416, 19417, 
                                             19418, 19419, 19422, 19423, 19424, 19425, 19426, 19429, 19430, 
                                             19431, 19432, 19433, 19436, 19437, 19438, 19439, 19440, 19443, 
                                             19444, 19445, 19446, 19447, 19450, 19451, 19452, 19453, 19454, 
                                             19457, 19458, 19459, 19460, 19461, 19464, 19465, 19466, 19467, 
                                             19468, 19471, 19472, 19473, 19474, 19475, 19478, 19479), class = "Date"), 
                     NAV = c(970, 2500, 2500, 2494.56, 2526.42, 2541.54, 2541.54, 
                             2540.28, 2500.68, 2496.25, 2570.6, 2568.44, 2565.26, 2520.26, 
                             2485.35, 2480.98, 2499.11, 2481, 2513.97, 2531.74, 2515.96, 
                             2456.99, 2429.14, 2408.53, 2411.28, 2426.09, 2421.59, 2434.36, 
                             2434.36, 2425.3, 2396.49, 2436.8, 2430.37, 2430.37, 2419.23, 
                             2436.5, 2407.55, 2459.38, 2470.14, 2469.76, 2469.38, 2478.34, 
                             2487.58, 2487.58, 2483.2, 2445.13, 2427.65, 2470.71, 2498.07, 
                             2495.1, 2495.61, 2521.62, 1090107.32, 1086971.73, 1086971.34, 
                             1098903.22, 1111018.21, 1102344.67, 1096241.55, 1096241.55, 
                             1096069.72, 1096069.72, 1101335.625, 1107990.41, 1107990.41, 
                             1116275.52, 1123598.4, 1123598.4, 1123598.4, 1123598.4, 1123598.4, 
                             1124678.98, 1105307.14, 1109141.14, 1104793.89, 1100303.32, 
                             1108848.65, 1130268.42, 1130532.8, 1112747.75, 1114419.18, 
                             1093341.95, 1077124.06, 1074777.33, 1074607.62, 1074435.35, 
                             1074435.35, 1074435.35, 1074435.35, 1074435.35, 1056056.925, 
                             1059896.37, 1059896.37, 1060085.77, 1057218.13, 1060825.9678, 
                             1067748.42, 1067748.42, 1068830.855, 1062896.015, 1063951.985, 
                             1067918.395, 1067918.395, 1068319.775, 1068432.975, 1063912.785, 
                             1077848.515, 1075044.865, 1078374.615, 1078902.625, 1078545.505, 
                             1072487.845, 1073146.795, 1073743.695, 1056519.745, 1051870.815, 
                             1072200.585, 1073637.47, 1073110.76, 1072934.49)), row.names = c(NA, 
                                                                                              -120L), class = c("data.table", "data.frame"))
cf <- structure(list(timestamp = structure(c(19312L, 19313L, 19384L), class = c("IDate", "Date")), 
                     `cashflow ` = c(970, 1530, 1089500)), row.names = c(NA, -3L), class = c("data.table", "data.frame"))

I tried to use unit_prices function on this 2 objects (following example):

  nav_units <- unit_prices(as.data.frame(NAV ),
                         cashflow = as.data.frame(cf),
                         initial.price = 100)
plot(nav_units$price)

buy as you can see on the plot, results are worng, there is a big break that shouldn't exists.

enricoschumann commented 1 year ago

The "jump" is implied by the logic of unit_prices: it assumes that the timestamp of the cashflow is after valuation, i.e. you cannot use it on the timestamp (think of "after the close of the market").

In your case:

 timestamp        NAV
2023-01-26    2521.62
2023-01-27 1090107.32

Your inflow:

 timestamp        NAV
2023-01-27   1089500

The function computes the NAV before the inflow, which is 1090107.32 - 1089500 = 607.32. But then 607.32/2521.62 - 1 = -0.76, i.e. your unit price drops by 76%.

This reflected in the output of unit_prices:

     timestamp        NAV     price    units
## ....
52  2023-01-26    2521.62 100.86480    25.00
53  2023-01-27 1090107.32  24.29280 44873.68
## ....
MislavSag commented 1 year ago

Should I than lag inflows?

The function works for some other strategies.

enricoschumann commented 1 year ago

Yes, since your convention for timestamps/cashflows (i.e. cashflow is available at start of period) is not directly supported. But you could get results by slightly changing your data.

## YOUR DATA
> head(NAV)
    timestamp     NAV
1: 2022-11-16  970.00
2: 2022-11-17 2500.00
3: 2022-11-18 2500.00
4: 2022-11-21 2494.56
5: 2022-11-22 2526.42
6: 2022-11-23 2541.54

> NAV[51:53, ]
    timestamp        NAV
1: 2023-01-25    2495.61
2: 2023-01-26    2521.62
3: 2023-01-27 1090107.32

> cf
    timestamp cashflow 
1: 2022-11-16       970
2: 2022-11-17      1530
3: 2023-01-27   1089500

In the first variation, the cashflow appears always t+1 in the NAV.

## VARIATION 1
> head(NAV)
    timestamp     NAV
1: 2022-11-16    0.00  ## << NAV changed
2: 2022-11-17  970.00  ## << NAV changed
3: 2022-11-18 2500.00
4: 2022-11-21 2494.56
5: 2022-11-22 2526.42
6: 2022-11-23 2541.54

> cf
    timestamp cashflow 
1: 2022-11-16       970
2: 2022-11-17      1530
3: 2023-01-26   1089500  ## << date changed

unit_prices(as.data.frame(NAV),
            as.data.frame(cf),
            initial.price = 100,
            cf.included = FALSE)$price
##          ^^^^^^^^^^^^^^^^^^^  ## setting
## VARIATION 2
> NAV[51:53, ]
    timestamp        NAV
1: 2023-01-25    2495.61
2: 2023-01-26 1092021.62 ## <<2521.62 + 1089500
3: 2023-01-27 1090107.32

> cf
    timestamp cashflow 
1: 2022-11-16       970
2: 2022-11-17      1530
3: 2023-01-26   1089500  ## << date changed

unit_prices(as.data.frame(NAV),
            as.data.frame(cf),
            initial.price = 100)$price

Yet one more variation: if the cashflows have been used on the timestamp on which they show up in the NAV series, set the cashflow timestamps to t - 1 and use cf.included = FALSE. But this may require that you include a timestamp before the first date in the NAV series.

MislavSag commented 1 year ago

I choosed 2 option and it works. Thanks.

enricoschumann commented 1 year ago

I suggest to close this issue. The PMwR manual has a brief section on external cashflows and return calculations: http://enricoschumann.net/R/packages/PMwR/manual/PMwR.html#returns-with-external-cashflows . Feel free to reopen it if unexpected behaviour occurs.