Open matthewgilbert opened 6 years ago
Presently the current holdings are stored as generics in current_exp and the
marked to market (MTM) daily holdings PnL is calculated based on continuous returns
for generics. An arguably more succinct implementation would track instrument
holdings, and marked to market PnL would be based on instrument prices. All stored
PnL and holdings data would be actual instrument data (with metadata
referring to generics). Generic holdings and PnL (e.g. ES1) would be calculated
after the simulation from the stored metadata.
Currently the logic of the simulation is as follows
Calculate daily PnL from continuous returns and generic notional holdings
Store daily PnL
Update current generic notional holdings based on continuous returns
For rebalancing dates
Calculate instrument contract trades based on signal for generics and the current instrument contract holdings
Update notional generic holdings
Update current instruments
Calculate notional generic trades
Store notional trades
Store notional exposures
Doing this on an instrument level, the logic would be modified to
For rebalancing dates
For signals on this date
Map previous day EOD tradeable instrument holdings + cumulative daily trades to current generic notional holdings
Calculate desired generic notional from target portfolio and current generic notional holdings (optional if an optimization is desired)
Calculate instrument trades based on desired generic notional and current instrument contract holdings
Store tradeable instrument trades in TRADES
Calculate EOD holdings PnL from tradeable instrument prices
Store EOD holdings PnL in EOD_HOLDINGS_PNL
Calculate trading PnL for tradeable instruments
Store trading PnL in TRADING_PNL
Update EOD tradeable instrument holdings
Store EOD tradeable instruments in EOD_HOLDINGS
TRADES
timestamp | EOD date | instrument | generic | multiplier | price | # of contracts | weight
2015-01-02T10:30:00 | 2015-01-02 | ESH2015 | ES1 | 50 | 2000.25 | 5 | 1
2015-01-02T10:30:00 | 2015-01-02 | CLH2015 | CL1 | 1000 | 50.52 | 3 | 0.5
2015-01-02T10:30:00 | 2015-01-02 | CLH2015 | CL2 | 1000 | 50.52 | 3 | 0.5
2015-01-02T18:30:00 | 2015-01-03 | ESH2015 | ES1 | 50 | 2002.25 | 5 | 1
EOD_HOLDINGS
timestamp | instrument | generic | multiplier | price | # of contracts | weight
2015-01-02T16:15:00 | ESH2015 | ES1 | 50 | 2000.25 | 5 | 1
2015-01-02T16:15:00 | CLH2015 | CL1 | 1000 | 50.73 | 3 | 0.5
2015-01-02T16:15:00 | CLH2015 | CL2 | 1000 | 50.73 | 3 | 0.5
EOD_HOLDINGS_PNL
timestamp | instrument | generic | weight | PnL
2015-01-02T16:15:00 | ESH2015 | ES1 | 1 | 373.33
2015-01-02T16:15:00 | CLH2015 | CL1 | 0.5 | -143.50
2015-01-02T16:15:00 | CLH2015 | CL2 | 0.5 | -143.50
TRADING_PNL
timestamp | instrument | generic | weight | PnL
2015-01-02T16:15:00 | ESH2015 | ES1 | 1 | -20.50
2015-01-02T16:15:00 | CLH2015 | CL1 | 0.5 | 50.50
2015-01-02T16:15:00 | CLH2015 | CL2 | 0.5 | 50.50
Trading PnL = Net trades from t-1 to t MTM at EOD t prices - Aggregated Cost Basis
where Aggregated Cost Basis is the aggregated cost of buys minus the
aggregated cost of sells for an instrument from the day of trading, e.g.
2015-01-01T20:30:00 | 5 ESH2015 | 2000.25
2015-01-02T11:30:00 |-5 ESH2015 | 2001.25
2015-01-02T14:30:00 | 3 ESH2015 | 2000.75
Settlement price
2015-01-02T16:15:00 | ESH2015 | 2002.00
Aggregated Cost Basis = ((5x2000.25 + 3x2000.75) - 5x2001.25) x 50 = 299,862.50
Net trades at EOD from t MTM at prices t = (5 - 5 + 3) x 2002.00 x 50 = 300,300.00
Trading PnL = 300,300.00 - 299,862.50 = 437.50
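The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the function name trading_pnl and the (quantity, price) tuple layout are assumptions made for illustration, not part of the existing code base.

```python
from typing import List, Tuple


def trading_pnl(fills: List[Tuple[int, float]], settle_price: float, multiplier: float) -> float:
    """Net traded contracts marked at settlement, minus the aggregated cost basis."""
    # Signed cost of the day's fills: buys add cost, sells (negative quantity) reduce it.
    cost_basis = sum(qty * price for qty, price in fills) * multiplier
    # Net contracts traded during the day, marked to the settlement price.
    net_contracts = sum(qty for qty, _ in fills)
    marked_value = net_contracts * settle_price * multiplier
    return marked_value - cost_basis


# The three ESH2015 fills and the 2002.00 settlement price from the example above.
es_fills = [(5, 2000.25), (-5, 2001.25), (3, 2000.75)]
es_trading_pnl = trading_pnl(es_fills, settle_price=2002.00, multiplier=50)
print(es_trading_pnl)  # 437.5
```

Sells are represented as negative quantities so that the cost basis and the net position fall out of the same signed sums.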
For attribution, the weight should be the contract-weighted average weight.
EOD Holdings PnL = Instruments at EOD from t-1 MTM at prices t - Instruments at EOD from t-1 MTM at prices t-1
For attribution, weights are given by the weights from t-1.
EOD Holdings = Instruments at EOD from t-1 + Net instrument trades during t
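A minimal sketch of the two definitions above, assuming holdings, prices and multipliers are kept in plain dictionaries keyed by instrument. The function names and the numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict


def holdings_pnl(
    prev_holdings: Dict[str, int],
    prev_prices: Dict[str, float],
    prices: Dict[str, float],
    multipliers: Dict[str, float],
) -> Dict[str, float]:
    """PnL on positions carried from t-1: value at prices t minus value at prices t-1."""
    return {
        inst: qty * multipliers[inst] * (prices[inst] - prev_prices[inst])
        for inst, qty in prev_holdings.items()
    }


def roll_forward(prev_holdings: Dict[str, int], net_trades: Dict[str, int]) -> Dict[str, int]:
    """EOD holdings at t: EOD holdings at t-1 plus net trades during t."""
    instruments = sorted(set(prev_holdings) | set(net_trades))
    updated = {inst: prev_holdings.get(inst, 0) + net_trades.get(inst, 0) for inst in instruments}
    # Drop instruments whose position has been closed out.
    return {inst: qty for inst, qty in updated.items() if qty != 0}


# Illustrative numbers only, not taken from the tables in this issue.
prev_holdings = {"ESH2015": 5}
pnl = holdings_pnl(prev_holdings, {"ESH2015": 2000.00}, {"ESH2015": 2002.00}, {"ESH2015": 50})
new_holdings = roll_forward(prev_holdings, {"ESH2015": -2, "CLH2015": 3})
print(pnl)           # {'ESH2015': 500.0}
print(new_holdings)  # {'CLH2015': 3, 'ESH2015': 3}
```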
EOD Generic Holdings = EOD Holdings, grouped by timestamp and generic, applying the sum of multiplier x price x # of contracts x weight
Generic PnL = PnL, grouped by timestamp and generic, taking the sum of PnL x weight
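The two group-by aggregations can be sketched in plain Python as follows. The tuple layouts mirror the EOD_HOLDINGS and EOD_HOLDINGS_PNL columns; the function names are hypothetical, and the sketch assumes the second CLH2015 holdings row belongs to CL2, as it does in the other tables.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (timestamp, instrument, generic, multiplier, price, contracts, weight)
eod_holdings = [
    ("2015-01-02T16:15:00", "ESH2015", "ES1", 50, 2000.25, 5, 1.0),
    ("2015-01-02T16:15:00", "CLH2015", "CL1", 1000, 50.73, 3, 0.5),
    ("2015-01-02T16:15:00", "CLH2015", "CL2", 1000, 50.73, 3, 0.5),  # assumed CL2
]

# (timestamp, instrument, generic, weight, pnl)
eod_holdings_pnl = [
    ("2015-01-02T16:15:00", "ESH2015", "ES1", 1.0, 373.33),
    ("2015-01-02T16:15:00", "CLH2015", "CL1", 0.5, -143.50),
    ("2015-01-02T16:15:00", "CLH2015", "CL2", 0.5, -143.50),
]


def generic_holdings(rows: List[tuple]) -> Dict[Tuple[str, str], float]:
    """Sum multiplier x price x contracts x weight per (timestamp, generic)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for ts, _inst, generic, mult, price, contracts, weight in rows:
        totals[(ts, generic)] += mult * price * contracts * weight
    return dict(totals)


def generic_pnl(rows: List[tuple]) -> Dict[Tuple[str, str], float]:
    """Sum PnL x weight per (timestamp, generic)."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for ts, _inst, generic, weight, pnl in rows:
        totals[(ts, generic)] += pnl * weight
    return dict(totals)


notional_by_generic = generic_holdings(eod_holdings)
pnl_by_generic = generic_pnl(eod_holdings_pnl)
for key in sorted(notional_by_generic):
    print(key, round(notional_by_generic[key], 2), round(pnl_by_generic[key], 2))
```

With tabular data already in pandas, the same result would more naturally come from a groupby on timestamp and generic; the dictionary version is used here only to keep the sketch dependency-free.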
There are days where no pricing data is available for a subset of the held
instruments, so we need to use a forward-filled value to MTM the holdings. This
will currently result in an error when attempting to access the pricing data
using get_xprices.
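One way to picture the forward fill is the sketch below, which assumes a per-instrument list of daily prices with None marking missing days. The helper forward_fill is hypothetical and says nothing about how get_xprices behaves; the prices are illustrative.

```python
from typing import List, Optional


def forward_fill(prices: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing price with the most recent available price."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for price in prices:
        if price is not None:
            last = price
        # Leading gaps stay None because there is nothing to carry forward yet.
        filled.append(last)
    return filled


# Illustrative price history with two missing days.
cl_prices = [50.52, None, 50.73, None]
cl_filled = forward_fill(cl_prices)
print(cl_filled)  # [50.52, 50.52, 50.73, 50.73]
```

A leading gap is deliberately left unfilled, since marking a holding to a price that was never observed would hide a data problem rather than fix it.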
Distinguishing between dates, datetimes + TZ and time ranges. For intraday use, instrument weight rolls need to be codified with a datetime, not just a date.
Possible speed issues with the MTM logic mapping to generics every day instead of using continuous returns.
Currently trading is done in one step using mapper.util.calc_trades. Based
on instrument weight allocations, this calculates trades based on differences
between the current number of instrument contracts held and the desired generic
notional holdings. The comparison is done at the instrument level (e.g. desired
notional holdings are mapped to desired tradeable contracts) to allow for rolling.
This does not currently support an optimization, which would likely be done at the
generic level (the optimization in mind is an optimization at the generic
notional level with piecewise linear TCost curves in notional space).
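The instrument-level comparison can be illustrated with the sketch below. It is not the implementation of mapper.util.calc_trades: the function name, argument layout, rounding rule and all numbers are assumptions made for the example.

```python
from typing import Dict


def calc_trades_sketch(
    current_contracts: Dict[str, int],
    desired_notional: Dict[str, float],
    weights: Dict[str, Dict[str, float]],
    prices: Dict[str, float],
    multipliers: Dict[str, float],
) -> Dict[str, int]:
    """Trades per instrument = desired contracts - current contracts.

    weights maps instrument -> generic -> weight (hypothetical layout).
    """
    trades: Dict[str, int] = {}
    # Include held instruments with no weight so they are traded out on a roll.
    for inst in sorted(set(weights) | set(current_contracts)):
        generic_weights = weights.get(inst, {})
        # Notional this instrument should carry across all generics it backs.
        target_notional = sum(w * desired_notional.get(g, 0.0) for g, w in generic_weights.items())
        if target_notional:
            # Rounding to whole contracts with round() is an assumption of this sketch.
            target_contracts = round(target_notional / (prices[inst] * multipliers[inst]))
        else:
            target_contracts = 0
        trade = target_contracts - current_contracts.get(inst, 0)
        if trade != 0:
            trades[inst] = trade
    return trades


# Illustrative roll: TY1 moves entirely from TYH2015 to TYM2015.
roll_trades = calc_trades_sketch(
    current_contracts={"TYH2015": 10},
    desired_notional={"TY1": 1_280_000.0},
    weights={"TYM2015": {"TY1": 1.0}},
    prices={"TYM2015": 128.0},
    multipliers={"TYM2015": 1000},
)
print(roll_trades)  # {'TYH2015': -10, 'TYM2015': 10}
```

Because the comparison happens per instrument, a change in the weights alone produces the closing and opening trades of a roll even when the desired generic notional is unchanged.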
The goal for adding intraday functionality would be to allow rebalancing several times a day. One of the main use cases would be to allow trading prices that differ from the EOD mark to market prices. The goal is not to allow intraday rebalances at something like 5 minute frequencies, as this would require a major rethink in order to be performant at this granularity.
Currently simulation() suffers from a few hacky workarounds which would make it difficult to add intraday functionality.
Issues
The simulation runs over a set of dates and the weights for generics have values for each day. However, on a roll day there are actually two weights for an instrument: the before-trade weight and the after-trade weight. For example, the weights for TY1 are:
This means that up until 2015-02-24 TY1 is entirely allocated to the 2015TYH contract, whereas after 2015-02-24 TY1 is entirely allocated to the 2015TYM contract. The implicit assumption here is that the roll happens at the time associated with the instrument prices for 2015-02-24 (in most examples the settlement price).
PnL is calculated as daily_pnl. To allow intraday functionality this should be disaggregated into mark to market PnL from overnight holdings (holdings_pnl) and daily PnL from trading (trading_pnl).
For simulation() to perform correctly, the implementations of rebalance_dates and instrument_weights in the concrete class have to be consistent in the following sense: rebalance_dates needs to include every transition that happens in the weightings for instruments. If this condition is not satisfied, PnL and holdings will be calculated as if a roll has happened when in fact it has not.
The current implementation is difficult to test since the functionality is quite monolithic and handles trading, updating holdings and calculating PnL all within the scope of the simulation.
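The consistency requirement between rebalance_dates and instrument_weights could be checked up front with something like the sketch below. The function name and the date-keyed weights layout are hypothetical, the dates are illustrative, and the sketch assumes a transition is dated on the first day the new weights apply.

```python
from typing import Dict, List, Set


def missing_rebalance_dates(
    weights_by_date: Dict[str, Dict[str, float]],
    rebalance_dates: Set[str],
) -> List[str]:
    """Return dates where instrument weights change but no rebalance is scheduled."""
    missing: List[str] = []
    previous = None
    # ISO-formatted date strings sort chronologically.
    for date in sorted(weights_by_date):
        current = weights_by_date[date]
        if previous is not None and current != previous and date not in rebalance_dates:
            missing.append(date)
        previous = current
    return missing


# Illustrative weights for one generic: the allocation switches contracts on the last date.
ty1_weights = {
    "2015-02-23": {"2015TYH": 1.0},
    "2015-02-24": {"2015TYH": 1.0},
    "2015-02-25": {"2015TYM": 1.0},
}
gaps = missing_rebalance_dates(ty1_weights, rebalance_dates={"2015-02-27"})
print(gaps)  # ['2015-02-25']
```

Running a check like this before the main loop would turn a silent mis-calculation into an explicit error.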
Possible Solutions
Currently the simulation looks at tradeable_dates and rebal_dates. One solution would be to create mtm_datetimes (a more appropriate name for tradeable_dates) and rebal_datetimes. Including time-aware dates would also force all pricing data to be timestamped, so if timestamps are not available from the source they would have to be added on a best guess basis. The main loop structure would also have to be changed to allow rebal_datetimes to account for the possibility of multiple times per day.
Since all instruments would have the same mark-to-market time of day, this would make using the settlement price problematic since there is asynchronicity between instrument settlement times. In addition, if there were instruments without any overlapping trading hours this would also be problematic.
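One possible shape for a loop driven by mtm_datetimes and rebal_datetimes is a single ordered event timeline, sketched below. The function build_timeline is hypothetical, the datetimes are assumed to share one time zone, and processing a rebalance before a mark at the same timestamp is an assumption of this sketch rather than a decision made in this issue.

```python
from datetime import datetime
from typing import List, Tuple

# At equal timestamps, trade first so the mark reflects post-trade holdings (assumption).
EVENT_PRIORITY = {"rebal": 0, "mtm": 1}


def build_timeline(
    mtm_datetimes: List[datetime],
    rebal_datetimes: List[datetime],
) -> List[Tuple[datetime, str]]:
    """Merge mark-to-market and rebalance times into one chronologically ordered event list."""
    events = [(ts, "rebal") for ts in rebal_datetimes] + [(ts, "mtm") for ts in mtm_datetimes]
    return sorted(events, key=lambda event: (event[0], EVENT_PRIORITY[event[1]]))


mtm_datetimes = [datetime(2015, 1, 2, 16, 15), datetime(2015, 1, 5, 16, 15)]
rebal_datetimes = [
    datetime(2015, 1, 2, 10, 30),
    datetime(2015, 1, 2, 14, 30),  # second rebalance on the same day
    datetime(2015, 1, 5, 16, 15),  # rebalance at the mark-to-market time
]
timeline = build_timeline(mtm_datetimes, rebal_datetimes)
for ts, kind in timeline:
    print(ts.isoformat(), kind)
```

The main loop would then dispatch on the event kind, which naturally allows several rebalances between two marks.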
The redesigned workflow would look like
Another possible change is to run the simulation on actual tradeable instruments and handle the mapping to generics afterward, just for accounting purposes when doing PnL attribution to generics. This could possibly simplify the logic within the loop of the simulation.
As an addendum, the trade() and notional_exposure() APIs are currently a bit messy in the sense that they take in weights. It is more natural to abstract these away from the user call; however, this leads to a less performant implementation, which has ramifications for simulate().