jaredbroad closed this issue 9 years ago
Here are my first thoughts on where to get started:
Step 1
Step 2
Waiting for comments and suggestions...
I like the design using Trade objects. This will simplify creating rolling windows of statistics (e.g. Get all trades between X-Y, generate statistics).
Does this support fractional opens/partial trade closes? E.g. I open at $10 (x10), open more at $12 (x10), close 10 at $15, open 10 more at $14, close all 20 at $13. What is the entry price for first close? second close?
Will you fill up the trade quantities like "stacked cups being filled with water"? (FIFO: the first trade's quantities are the first to be filled with an exit price.) If so, say Trade-2 is only partially filled with exit price $12, and the next exit was at $13: would it be averaged down, or would it be separated into new trade objects?
Currently Lean has an average-based portfolio system (the share price is averaged on each purchase). Gain/loss calculations and dates for US taxes are done with a FIFO system, I believe, but I'm not sure how FIFO accounting is done.
For trade counting/generation I would consider one of these two options:
Option 1 (Single Trade): a trade is closed only when the position size crosses above/below zero, and the open price (avg price) is updated on each fill until the trade is closed.
Quantity = 30
Direction = Long
EntryPrice = AvgBuyPrice = (10×10 + 10×12 + 10×14)/30 = $12
ExitPrice = AvgSellPrice = (10×15 + 20×13)/30 ≈ $13.67
Option 2 (Multiple Trades): a trade is closed on each fill in the opposite direction (using FIFO), and the open price (avg price) is updated on each fill in the same direction.
Trade 1: Quantity = 10, Direction = Long, EntryPrice = AvgBuyPrice = $10, ExitPrice = AvgSellPrice = $15
Trade 2: Quantity = 20, Direction = Long, EntryPrice = AvgBuyPrice = (10×12 + 10×14)/20 = $13, ExitPrice = AvgSellPrice = $13
I would lean towards Option 2 because it is more detailed, and trades generated in this way can always be grouped to obtain the same trades as Option 1 (if a client application needs to):
Trade 1+2 (grouped): Quantity = 10+20 = 30, Direction = Long, EntryPrice = (10×10 + 20×13)/30 = $12, ExitPrice = (10×15 + 20×13)/30 ≈ $13.67
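The arithmetic of the two options can be checked with a short sketch. It is written in Python for brevity (Lean itself is C#), the function names are made up for illustration, and it assumes a long-only sequence that never sells more than is open.

```python
from collections import deque

# Fills from the example: (signed quantity, price); positive = buy, negative = sell.
FILLS = [(10, 10.0), (10, 12.0), (-10, 15.0), (10, 14.0), (-20, 13.0)]


def weighted_average(pairs):
    """Quantity-weighted average price of (quantity, price) pairs."""
    total_quantity = sum(q for q, _ in pairs)
    return sum(q * p for q, p in pairs) / total_quantity


def option_1(fills):
    """Single trade: average all buys and all sells of a flat-to-flat sequence."""
    buys = [(q, p) for q, p in fills if q > 0]
    sells = [(-q, p) for q, p in fills if q < 0]
    return {
        "quantity": sum(q for q, _ in buys),
        "entry": weighted_average(buys),
        "exit": weighted_average(sells),
    }


def option_2(fills):
    """One trade per closing fill; entry is the average of the FIFO lots it consumes."""
    open_lots = deque()  # (quantity, price) of buys not yet closed, oldest first
    trades = []
    for quantity, price in fills:
        if quantity > 0:
            open_lots.append((quantity, price))
            continue
        remaining = -quantity
        consumed = []
        while remaining > 0:
            lot_quantity, lot_price = open_lots.popleft()
            used = min(lot_quantity, remaining)
            consumed.append((used, lot_price))
            remaining -= used
            if lot_quantity > used:
                # Put the unused part of the lot back at the front of the queue.
                open_lots.appendleft((lot_quantity - used, lot_price))
        trades.append({
            "quantity": -quantity,
            "entry": weighted_average(consumed),
            "exit": price,
        })
    return trades
```

Calling `option_1(FILLS)` gives one 30-share trade, while `option_2(FILLS)` gives the 10-share and 20-share trades listed above.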
I don't know about US taxes, but using FIFO to close trades seems a reasonable choice.
Awesome, I lean towards option 2 as well @StefanoRaggi. I recommend doing more research to find out what accounting systems others have used to ensure our statistics results are directly comparable to other platforms. Maybe there's some reference or expert on the subject we can ask?
Any chance a property for a StopLoss (trailing or hard) could be added to the Trade class?
This would allow calculation of RiskAdjustedEquity. Risk-adjusted equity would allow determination of true remaining cash for establishing positions, and thus affect position sizing algorithms.
This may not be critical for stocks where cash outlay is upfront, but 'cashless' trading such as futures (when support is added in the not too distant future I hope) would benefit from this.
For example:
x ticks away from entry price
RiskAdjustedEquity
I would prefer to be able to pass in RiskAdjustedEquity for summary statistics of any algorithm. Given the nature of some strategies I look to implement, and the trade/money/risk management processes I follow, RiskAdjustedEquity gives a "truer" view of my performance, resulting in more realistic equity curves. While some may balk at using this because it may cut down their (over-inflated) view of performance at some points in time, it would also smooth out drawdowns and help set expectations clearly around what equity is in the portfolio.
By its nature, the stop loss on a futures contract acts like a call option, but with no outlay of the premium up front: the premium is "paid" on exit by hitting the stop loss, so being able to risk-adjust equity allows for this "premium" to be deducted from portfolio equity.
Whether this suggestion goes against accounting rules (mark-to-market) is not relevant, instead it should be considered an extension to those statistics that are already available (which already do provide a mark-to-market view of equity).
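One possible reading of this proposal, as a rough Python sketch rather than anything that exists in Lean: deduct from mark-to-market equity the loss each open trade would realise if its stop were hit. The `OpenTrade` fields, the `stop_price` property and the choice to measure the risk from the current market price are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class OpenTrade:
    quantity: int            # positive = long, negative = short
    market_price: float      # latest mark-to-market price
    stop_price: float        # hypothetical StopLoss property on the trade
    multiplier: float = 1.0  # contract multiplier (1 for stocks)


def risk_to_stop(trade):
    """Loss realised if the stop is hit from the current price (never negative)."""
    if trade.quantity > 0:
        loss_per_unit = trade.market_price - trade.stop_price
    else:
        loss_per_unit = trade.stop_price - trade.market_price
    return max(loss_per_unit, 0.0) * abs(trade.quantity) * trade.multiplier


def risk_adjusted_equity(equity, open_trades):
    """Mark-to-market equity minus the open risk of every trade (assumed definition)."""
    return equity - sum(risk_to_stop(t) for t in open_trades)
```

With this definition a stop that has trailed past the current price contributes zero risk, so the adjusted figure never exceeds the mark-to-market equity.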
Hey @adriantorrie, it's out of scope for this change. A StopLoss would be an order type -- these are accounted for here: https://github.com/QuantConnect/Lean/tree/master/Common/Orders
@adriantorrie sorry, I just re-read this after a coffee; I totally misunderstood your question on first read! :) It's a very interesting idea; let's come back to it after this initial refactor as a new issue.
@jaredbroad I have completed Step 1 (two last commits here for review: https://github.com/StefanoRaggi/Lean/commits/statistics)
I ended up creating a new TradeBuilder class that generates Trade objects in real-time during the execution of the algorithm. This was required to calculate MAE and MFE for each trade (as a byproduct, the ClosedTrades property can also be used by the algorithm itself for trading decisions).
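To illustrate why MAE (maximum adverse excursion) and MFE (maximum favorable excursion) need price updates while the trade is open, here is a minimal Python sketch. The class and method names are hypothetical and this is not the TradeBuilder code; it only tracks the worst and best unrealised profit seen for a single open trade.

```python
class OpenTradeExcursion:
    """Tracks the worst (MAE) and best (MFE) unrealised P&L of one open trade."""

    def __init__(self, entry_price, quantity):
        self.entry_price = entry_price
        self.quantity = quantity  # positive = long, negative = short
        self.mae = 0.0            # most negative unrealised P&L seen so far
        self.mfe = 0.0            # most positive unrealised P&L seen so far

    def set_market_price(self, price):
        # Unrealised P&L is positive when the price moves in the trade's favour.
        unrealised = (price - self.entry_price) * self.quantity
        self.mae = min(self.mae, unrealised)
        self.mfe = max(self.mfe, unrealised)
```

Because both values depend on every price seen between entry and exit, they cannot be reconstructed afterwards from the fills alone.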
There are three fill grouping methods available (FillToFill, FlatToFlat, FlatToReduced) and two fill matching methods (FIFO and LIFO). At the moment I set FillToFill and FIFO as default values (we can add two new settings in config.json later). Unit tests for all six combinations are in place.
I would like a quick review before proceeding with Step 2.
Code looks great @StefanoRaggi. I noticed we've added a TradeDirection enum; it seems like a duplicate of the OrderDirection enum. Also, there's a new TradeExecution class which has similar data to OrderEvent. I think the one thing OrderEvent doesn't have is the Time, which I think it should have. One other thing was placing the TradeBuilder on the IAlgorithm interface. I wonder if there's a way to structure it so we don't need that. If not, it's not a huge deal; I just like to keep IAlgorithm as clean as possible, and I view this TradeBuilder as algorithm impl specific, but not required by the engine to run.
Other than that code looks great man. Good code style, doc, and even 2k lines of unit tests!!
Incredible PR @StefanoRaggi, there's a lot to digest here
I'm new to a lot of these concepts - can you describe/link them? (Fill To Fill, Flat To Flat, Flat To Reduced). It would be great to understand their intent better so we can review better.
I asked a quant working at a hedge fund and he suggested leaving the default statistics generation source as portfolio average fill prices (like it is at the moment). He said that for their internal math it's all portfolio average price, but then the accountants use FIFO etc. to minimize the tax bill. Does this trade system support a trade technique like how it is now? (Perhaps that is one of the methods above?)
@mchandschuh, thanks for the review:
[I added the] TradeDirection { Long, Short } enum because OrderDirection { Buy, Sell, Hold } didn't sound good as a Trade property; I always read about long/short trades, not buy/sell trades. :smile: Personally I would keep the new enum, if you don't mind.
[I considered adding] Time in the OrderEvent class, but just didn't want to add it at this point (as I needed it only in one spot), so I went with the new TradeExecution class. No problem: I can add the missing property and set it to algorithm.UtcTime after OrderEvent creation.
[As for the] TradeBuilder instance, I put it on the IAlgorithm interface for a simpler implementation without any refactoring: I need to call TradeBuilder.AddExecution from BrokerageTransactionHandler.HandleOrderEvent and call TradeBuilder.SetMarketPrice from AlgorithmManager.Run. The other reason was the TradeBuilder.ClosedTrades property, which is needed at the end of Engine.Run and could be used inside the algorithm as well. Honestly, I couldn't find a better place for it. Do you have any suggestions?
TradeBuilder.ClosedTrades is a substitute for algorithm.Transactions.TransactionRecord and will be required as an input to the new AlgorithmPerformance.GenerateStatistics (to be developed in Step 2).

@jaredbroad The definitions of the three fill grouping methods are the following:
FillToFill: A Trade is defined by a fill that establishes or increases a position and an offsetting fill that reduces the position size.
Example: Buy 1, Buy 1, Buy 2, Sell 1, Sell 3 would generate three trades.
FlatToFlat: A Trade is defined by a sequence of fills, from a flat position to a non-zero position which may increase or decrease in quantity, and back to a flat position.
Example: Buy 1, Buy 1, Buy 2, Sell 1, Sell 3 would generate one trade.
FlatToReduced: A Trade is defined by a sequence of fills, from a flat position to a non-zero position and an offsetting fill that reduces the position size.
Example: Buy 1, Buy 1, Buy 2, Sell 1, Sell 3 would generate two trades.
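The three trade counts in the examples can be reproduced with a small sketch. It is in Python for brevity and is not the TradeBuilder implementation: `count_trades` is a hypothetical helper that assumes FIFO matching, a long-only sequence starting flat, and no overselling.

```python
from collections import deque


def count_trades(fills, grouping):
    """Count trades for signed fill quantities (positive = buy, negative = sell)."""
    if grouping not in ("FillToFill", "FlatToFlat", "FlatToReduced"):
        raise ValueError("unknown grouping method: " + grouping)
    open_lots = deque()  # quantities of entry fills not yet closed, oldest first
    position = 0
    trades = 0
    for quantity in fills:
        if quantity > 0:
            open_lots.append(quantity)
            position += quantity
            continue
        remaining = -quantity
        position -= remaining
        lots_matched = 0
        while remaining > 0:
            lot = open_lots.popleft()
            used = min(lot, remaining)
            remaining -= used
            lots_matched += 1
            if lot > used:
                open_lots.appendleft(lot - used)
        if grouping == "FillToFill":
            trades += lots_matched   # one trade per (entry fill, exit fill) pair
        elif grouping == "FlatToReduced":
            trades += 1              # one trade per reducing fill
        elif position == 0:
            trades += 1              # FlatToFlat: one trade once the position is flat
    return trades
```

For `[1, 1, 2, -1, -3]` this returns 3, 1 and 2 for FillToFill, FlatToFlat and FlatToReduced respectively, matching the examples.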
Here are a couple of links from where I got some help: https://www.sierrachart.com/index.php?l=doc/doc_TradeActivityLog.php#OrderFillMatchingMethods https://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/quantstrat/sandbox/backtest_musings/strat_dev_process.pdf?root=blotter (pg. 24-25)
At the end of Engine.Run we have a complete list of Trade objects generated according to the chosen fill grouping and matching methods. This list is meant to replace algorithm.Transactions.TransactionRecord and to be used directly as a Trade List View in any client GUI or ResultHandler (sorted, filtered, etc.).
The required fill price averaging (depending on the grouping method chosen) is already done in the TradeBuilder class. Please have a look at the unit tests; there are lots of examples there.
Step 2 will take care of obtaining the same statistics that are available now, plus all the new ones (lots of them :smile:)
Thank you for the extra reading. I like what is there so far and it looks like a pretty incredible foundation to build enhanced statistics -- without changing how we do portfolios at the moment!
One thought -- we have to plan for long running live algorithms tight on memory -- please cap the number of trades to 10,000 or 12 months, whichever is first (in live mode) -- then remove the oldest trades first. For backtests there doesn't need to be any limit. For simplicity we could do this step after the statistics (step 2) is done.
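A minimal sketch of such a cap, in Python for brevity: the function name and the (exit time, trade) tuple layout are assumptions, and "12 months" is approximated as 365 days.

```python
from collections import deque
from datetime import datetime, timedelta

MAX_TRADES = 10_000
MAX_AGE = timedelta(days=365)  # "12 months", approximated


def prune_closed_trades(closed_trades, now, max_trades=MAX_TRADES, max_age=MAX_AGE):
    """Drop the oldest trades until both the count and the age limits hold.

    closed_trades is a deque of (exit_time, trade) tuples ordered oldest first;
    it is modified in place and returned for convenience.
    """
    while closed_trades and (
        len(closed_trades) > max_trades
        or now - closed_trades[0][0] > max_age
    ):
        closed_trades.popleft()
    return closed_trades
```

In live mode this could run each time a trade is closed; in backtests it would simply not be called.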
With your class we'll be able to have rolling statistics -- this will be a powerful new feature!
Agree with @mchandschuh, probably best to add Time to OrderEvent. "will be required as an input to the new AlgorithmPerformance.GenerateStatistics" -- Good idea, let's run with that, and if we see a way later we can "tidy" it up.
Awesome work! Keep pushing forward! :)
I just committed the first part of Step 2 with some metrics that can be calculated from a list of trades only.
There are many other stats to be added, but a quick review at this point could be useful.
This first release of the AlgorithmPerformance only has a constructor with the list of trades as an argument, but this is going to change quickly.
All calculations are performed using incremental formulas so the metrics can be efficiently updated in real time (every time a trade is closed) without requiring multiple loops over the trade list.
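As an illustration of what incremental formulas look like, here is a minimal Python sketch (not the AlgorithmPerformance code). The class and property names are hypothetical, and the standard deviation uses Welford's online update, which is one common choice and not necessarily the one used in the commit.

```python
class RunningTradeStats:
    """Trade statistics updated in O(1) each time a trade is closed."""

    def __init__(self):
        self.count = 0
        self.winners = 0
        self.losers = 0
        self.total_profit_loss = 0.0
        self.largest_loss = 0.0
        self._mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add_trade(self, profit_loss):
        self.count += 1
        self.total_profit_loss += profit_loss
        if profit_loss > 0:
            self.winners += 1
        elif profit_loss < 0:
            self.losers += 1
        self.largest_loss = min(self.largest_loss, profit_loss)
        # Welford's update: no need to revisit earlier trades.
        delta = profit_loss - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (profit_loss - self._mean)

    @property
    def average_profit_loss(self):
        return self._mean

    @property
    def profit_loss_std_dev(self):
        """Sample standard deviation; 0.0 until at least two trades exist."""
        if self.count < 2:
            return 0.0
        return (self._m2 / (self.count - 1)) ** 0.5
```

Each call to `add_trade` touches only a handful of running totals, so the cost per closed trade does not grow with the length of the trade list.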
My idea is that this class could have a dual use: (1) Instantiated by the engine at the end of the algorithm run: to perform calculations on a complete list of trades (or filtered by symbol, period, etc.). (2) Instantiated by the algorithm base class at startup: to add trades one at a time (when closed) and make stats available to the algorithm itself during execution, for trading decisions (examples: stop trading if NumberOfLosingTrades today > 2, or TotalProfitLoss today > X). Another use case could be RealTimeStatistics for the GUI.
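The second use case (stats available to the algorithm during execution) could look roughly like the sketch below. It is Python for brevity, and `DailyLossGuard` with its methods is a hypothetical helper, not part of Lean.

```python
from collections import defaultdict
from datetime import date


class DailyLossGuard:
    """Blocks new entries once too many losing trades have closed on a given day."""

    def __init__(self, max_losing_trades=2):
        self.max_losing_trades = max_losing_trades
        self._losing_trades = defaultdict(int)  # exit date -> losing trade count

    def on_trade_closed(self, exit_date, profit_loss):
        # Called once per closed trade, e.g. from a trade-closed callback.
        if profit_loss < 0:
            self._losing_trades[exit_date] += 1

    def can_trade(self, today):
        # Trading stops once the count exceeds the limit (losing trades today > 2).
        return self._losing_trades.get(today, 0) <= self.max_losing_trades
```

An algorithm would call `can_trade` before submitting an entry order and skip the order when it returns False.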
Thoughts or comments?
@StefanoRaggi I do not mean to butt in here, but I have done some work with creating Trades and I hope I can offer some small insight.
I pondered for quite a while how to match transactions into trades, particularly when buys and sells do not match quantities, such as with partial trades and adding to a position on a nice long run. I decided on a dual-stack recursive mechanism. It fulfils both of your requirements above and gives you a ScheduleD as well. Here is how it works.
My trade creator reads from a file of orders or transaction history. Then it attempts to open a position in the OpenPosition method. OpenPositions is a List<Position>.

    Position openPosition = OpenPositions.FirstOrDefault(p => p.Symbol == trans.Symbol);
    if (openPosition == null)
    {
        openPosition = OpenPosition(trans);
        OpenPositions.Add(openPosition);
    }
    else
    {
        ScottradeTransaction scottradeTransaction = ResolvePosition(openPosition, trans);
        if (openPosition.Buys.Count == 0 && openPosition.Sells.Count == 0)
        {
            OpenPositions.Remove(openPosition);
        }
    }

A Position has a Stack of Buys and a Stack of Sells.

    public class Position
    {
        public int Id { set; get; }
        public string Symbol { get; set; }
        // Element type assumed: ScottradeTransaction
        public Stack<ScottradeTransaction> Buys { get; set; }
        public Stack<ScottradeTransaction> Sells { get; set; }
    }

If there is no position, OpenPosition pushes a buy transaction onto the Buys stack and a sell transaction onto the Sells stack.
In ResolvePosition a buy transaction pops a sell transaction off the Sells stack and matches the two transactions up by quantity, and vice versa. If the quantities match, a trade is created. If not, the transaction with the greater quantity is split into two transactions: one with a matching quantity and one with the leftovers. A Trade is created with the matching transaction, and ResolvePosition is called recursively with the leftover transaction until there are no transactions on the stacks. Any leftovers are then pushed onto the appropriate stack.
This method allows for LIFO matching, which is allowed by the IRS. Replace the Stacks with Queues and you have FIFO matching. I used LIFO to extend positions to longer terms, hopefully to a year or more, for long term capital gain treatment.
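The matching idea can be sketched compactly in Python (for brevity; the code described above is C#). This is an iterative variant with a single container, not the recursive dual-stack code: it relies on the observation that open lots for one symbol are all buys or all sells at any time. `match_fills` and its tuple layout are hypothetical, and commissions and wash sales are ignored.

```python
from collections import deque


def match_fills(fills, method="FIFO"):
    """Match (signed quantity, price) fills for one symbol into trades.

    Returns (trades, open_lots). Each trade is (quantity, entry_price, exit_price),
    with quantity signed by the direction of the lot it closes.
    """
    open_lots = deque()  # unmatched lots, oldest first, all with the same sign
    trades = []
    for quantity, price in fills:
        # Match while the fill opposes the open lots.
        while quantity != 0 and open_lots and (open_lots[0][0] > 0) != (quantity > 0):
            lifo = method == "LIFO"
            lot_quantity, lot_price = open_lots.pop() if lifo else open_lots.popleft()
            matched = min(abs(lot_quantity), abs(quantity))
            sign = 1 if lot_quantity > 0 else -1
            trades.append((sign * matched, lot_price, price))
            leftover = abs(lot_quantity) - matched
            if leftover:
                # Split lot: return the unmatched part to where it came from.
                lot = (sign * leftover, lot_price)
                open_lots.append(lot) if lifo else open_lots.appendleft(lot)
            quantity += sign * matched  # shrink the fill towards zero
        if quantity != 0:
            open_lots.append((quantity, price))  # leftover opens or extends a position
    return trades, list(open_lots)
```

Switching `method` between "FIFO" and "LIFO" only changes which end of the container a lot is taken from, which is the stack-versus-queue swap described above.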
My code does not account for wash sales.
My Trade class is slightly different than yours and is matched against IRS 1040 - ScheduleD format. I used the Proceeds and CostOrBasis approach, as does the IRS. Commissions on split transactions are used in the first split, so that the average cost per share in leftovers is the buy price.
    public class Trade
    {
        public int Id { get; set; }
        public bool IsOpen { get; set; } // not used
        public string Symbol { get; set; }
        public int Quantity { get; set; }
        public string DescriptionOfProperty { get; set; }
        public DateTime DateAcquired { get; set; }
        public DateTime DateSoldOrDisposed { get; set; }
        public decimal Proceeds { get; set; }
        public decimal CostOrBasis { get; set; }
        public string AdjustmentCode { get; set; }
        public decimal AdjustmentAmount { get; set; }
        public decimal GainOrLoss { get { return Proceeds - CostOrBasis + AdjustmentAmount; } }
        public bool ReportedToIrs { get; set; } // Reported to IRS on 1099-B
        public bool ReportedToMe { get; set; } // Reported to me on 1099-B
        public bool LongTermGain { get; set; }
        public int BuyOrderId { get; set; }
        public int SellOrderId { get; set; }
        public string Brokerage { get; set; }
        public decimal CumulativeProfit { get; set; }
    }
I added a couple of fields for reporting convenience such as the BuyOrderId (from the ticket) and CumulativeProfit for summing during the year.
I would be happy to share my code if it would help. I found the stack matching recursive technique to be very quick.
@bizcad Currently I am working on the AlgorithmPerformance class, which takes as input a list of Trade objects generated by the TradeBuilder class, which already has an implementation of three types of transaction grouping and two types of transaction matching, for a total of six combinations.
The main goal of this issue is to replace the Statistics class with AlgorithmPerformance, so the Trade class currently has the minimum information required to complete this task. Of course new fields can/will be added to the Trade class, just not in this first iteration.
As for your implementation of grouping and matching, I think we have conceptually done the same thing, only in a different way. Please have a look at the TradeBuilder tests, just to check if the results are as you would expect.
@StefanoRaggi Thank you for adding the Fees to the OrderEvent. I have been wanting to get that from the order.
Refactor the Statistics class to be instance driven, with a constructor which takes the required information and creates an instance of Statistics.
Statistics object should define all the currently generated statistics as public properties and include description parameters for the English text.
Benchmark should be separated out and implement an IBenchmark; be generated & cached in a local file once per day.
It would require two new algorithm helper methods:
Replace the statistics generation in Engine with a call to algorithm.GetStatistics (and update the Result handlers to take a Statistics object instead of a Dictionary<string,string>).