QuantConnect / Lean

Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
https://lean.io
Apache License 2.0

Add Support for Algorithm Parameter Inputs (Configurable variables) #96

Closed mattycourtney closed 8 years ago

mattycourtney commented 9 years ago

Enable finding the optimal set of parameters to trade an algorithm with

gururise commented 9 years ago

I'm thinking parameter optimization could be easily supported by allowing optional parameters to be passed into the Engine instance when it first starts. Once the backtest is complete, the results are recorded and another instance with new parameters is started. This whole process would occur in a loop under some sort of user control. By putting this process in a loop, the user could then either do some type of simple parameter optimization via brute force, or utilize a GA or some other form of Evolutionary Programming.
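
A minimal sketch of that loop, assuming a hypothetical `runBacktest` delegate that stands in for starting an Engine instance with one parameter value and returning a score. Nothing here is Lean API; the names `BruteForceSketch` and `FindBest` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class BruteForceSketch
{
    // Runs one backtest per candidate value and returns the best (value, score) pair.
    // runBacktest is a hypothetical stand-in for launching an engine instance.
    // With no candidates, the result is (0, negative infinity).
    public static KeyValuePair<int, double> FindBest(
        IEnumerable<int> candidates, Func<int, double> runBacktest)
    {
        var best = new KeyValuePair<int, double>(0, double.NegativeInfinity);
        foreach (var value in candidates)
        {
            var score = runBacktest(value); // one full backtest per parameter value
            if (score > best.Value)
            {
                best = new KeyValuePair<int, double>(value, score);
            }
        }
        return best;
    }

    public static void Main()
    {
        // Toy scoring function that peaks at period 20; a real run would report
        // a statistic such as the ending portfolio value.
        Func<int, double> fakeBacktest = period => 100.0 - Math.Abs(period - 20);
        var best = FindBest(new[] { 5, 10, 20, 40 }, fakeBacktest);
        Console.WriteLine("Best period: " + best.Key + ", score: " + best.Value);
    }
}
```

A GA or other evolutionary search would replace the fixed candidate list with values chosen from earlier scores, but the run-record-repeat shape stays the same.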

mchandschuh commented 9 years ago

I think the concept of parameter optimization is slightly outside of the scope of QuantConnect.Lean.Engine.csproj code. I think what we really need are parameterized inputs for the algorithm and for a job to be able to accept/pass these parameterized inputs. In this way, the infrastructure that runs Lean can perform the parameter optimization search, whether it be the QC cloud infrastructure or a local desktop Lean runner.

I think specifying the parameters in the algorithm code can be accomplished rather cleanly through the usage of C# attributes:

public class MyParameterizedAlgorithm : QCAlgorithm
{
    [IntParameter(Minimum: 3, Maximum: 50)]
    public int SimpleMovingAveragePeriod = 14;

    // indicator instantiated using a parameter
    public SimpleMovingAverage MySMA;

    public override void Initialize()
    {
        //...
        MySMA = SMA("SPY", SimpleMovingAveragePeriod);
    }
}

This would allow the infrastructure to inspect the algorithm to find parameters very easily, as well as display them to a desktop UI if needed.
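
As a rough illustration of that inspection step, the sketch below uses reflection to find tagged fields. The `IntParameterAttribute` class here is a hypothetical stand-in written for this example (with `Minimum` and `Maximum` as settable properties, hence the `Minimum = 3` syntax), and `DemoAlgorithm` is a plain class rather than a `QCAlgorithm`.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical attribute mirroring the proposal; not part of Lean.
[AttributeUsage(AttributeTargets.Field)]
public sealed class IntParameterAttribute : Attribute
{
    public int Minimum { get; set; }
    public int Maximum { get; set; }
}

public class DemoAlgorithm
{
    [IntParameter(Minimum = 3, Maximum = 50)]
    public int SimpleMovingAveragePeriod = 14;

    // No attribute, so the scanner skips this field.
    public int NotAParameter = 1;
}

public static class ParameterScanner
{
    // Returns "name [min, max]" for every public instance field tagged with the attribute.
    public static List<string> Describe(Type algorithmType)
    {
        var found = new List<string>();
        foreach (FieldInfo field in algorithmType.GetFields(BindingFlags.Public | BindingFlags.Instance))
        {
            var attribute = (IntParameterAttribute)Attribute.GetCustomAttribute(
                field, typeof(IntParameterAttribute));
            if (attribute == null) continue;
            found.Add(field.Name + " [" + attribute.Minimum + ", " + attribute.Maximum + "]");
        }
        return found;
    }

    public static void Main()
    {
        foreach (var line in Describe(typeof(DemoAlgorithm)))
        {
            Console.WriteLine(line); // prints "SimpleMovingAveragePeriod [3, 50]"
        }
    }
}
```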

Once parameterized inputs for an algorithm are implemented, then we could implement the optimization feature by running multiple lean instances and setting the algorithm parameters based on the optimization algorithm, be it brute force or some evolutionary algorithm.

Thoughts?

Vincent-Chin commented 9 years ago

Don't know if anyone else is working on this; I'm working on an implementation (including a brute-forcer proof-of-concept). I expect to submit a patch around the end of this week.

jaredbroad commented 9 years ago

Awesome @TangentLogic! Look forward to seeing your design. Are you using C# attributes like demonstrated above?

From our past experience, it's best to talk through the design publicly as early as possible to get the benefit of the "hive mind" :)

Vincent-Chin commented 9 years ago

Yeah, it's mostly similar to mchandschuh's post, where parameters can only be declared on fields. Right now I've built 4 parameter types: IntParameter, DecimalParameter, DateTimeParameter, and BoolParameter.

IntParameter, DecimalParameter, and DateTimeParameter have a Min, Max, and Step. Step is used for incrementing between min and max during a brute-force optimization.

BoolParameter only has a Min and Max. Min is always false, Max is always true.
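
A sketch of how Min, Max, and Step could drive a brute-force enumeration. This is not the code from the patch; `ParameterRanges`, `IntValues`, and `BoolValues` are hypothetical names, and only the int and bool cases are shown.

```csharp
using System;
using System.Collections.Generic;

public static class ParameterRanges
{
    // Values from min to max inclusive, advancing by step (step must be positive).
    // The counter is a long so that value += step cannot overflow near int.MaxValue.
    public static IEnumerable<int> IntValues(int min, int max, int step)
    {
        if (step <= 0) throw new ArgumentOutOfRangeException("step");
        for (long value = min; value <= max; value += step)
        {
            yield return (int)value;
        }
    }

    // A bool parameter has exactly two candidates: Min (false), then Max (true).
    public static IEnumerable<bool> BoolValues()
    {
        yield return false;
        yield return true;
    }

    public static void Main()
    {
        // Every combination of one int parameter and one bool parameter.
        foreach (var period in IntValues(3, 12, 3))
        {
            foreach (var useStops in BoolValues())
            {
                Console.WriteLine("period=" + period + ", useStops=" + useStops);
            }
        }
    }
}
```

With the ranges above, the int parameter takes the values 3, 6, 9, and 12, so the nested loop visits 8 combinations. The count grows multiplicatively with each added parameter, which is why Step matters for keeping a brute-force run tractable.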

bizcad commented 9 years ago

@Vincent-Chin I found some code for parsing command line parameters if that would be any help. http://www.codeproject.com/Articles/3111/C-NET-Command-Line-Arguments-Parser I was able to adapt it to my project and found it very convenient.

Nick

Vincent-Chin commented 9 years ago

By way of update,

The code is still in progress. I've had to replace DecimalParameter with DoubleParameter due to CLR limitations (decimals are invalid attribute arguments because decimal is not a 'core' data type).

I've gotten the parameterization working; I'm currently writing the unit tests and trying to figure out why the brute-forcing logic I've created fails after the first iteration (something's being disposed; I don't fully understand the Engine class yet).

Have to revise my ETA to end of this week/weekend.

jaredbroad commented 9 years ago

Sweet Vincent!!

Shame about the DecimalParameter - this will be a very common parameter request. Perhaps @mchandschuh will have some ideas how to make this one possible? (once you push to your local repo we'll do some initial review / hints)

I'd recommend just submitting a pull request with a single pairing of [Engine]-[Parameter-Set] first. We'll have to make a separate "Engine-Optimizer" project which deploys hundreds of engine instances and passes the parameter sets into the engine instances. This way it will be compatible with our cloud / QuantConnect.com as well.

Please also consider submitting the parameter requirements as an array in the job packet (list of parameter objects?)

jaredbroad commented 9 years ago

Edit: Renamed "Parameter File" to "Parameter Set" since it's really going to be in the algorithm-job class. It's not a real "file", just the values of parameters for a specific backtest run.
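
One way to picture a parameter set being applied to an algorithm for a single run is sketched below. It assumes the set arrives as name/value strings, which is an assumption made for this example and not a description of the job packet; `SampleAlgorithm` and `ParameterSetApplier` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;

// Plain stand-in for an algorithm class with two tunable public fields.
public class SampleAlgorithm
{
    public int SimpleMovingAveragePeriod = 14;
    public decimal StopLossPercent = 0.05m;
}

public static class ParameterSetApplier
{
    // Copies each named value onto the matching public field, converting the
    // string to the field's type. Names with no matching field are ignored.
    public static void Apply(object algorithm, IDictionary<string, string> parameterSet)
    {
        foreach (var pair in parameterSet)
        {
            FieldInfo field = algorithm.GetType().GetField(pair.Key);
            if (field == null) continue;
            object value = Convert.ChangeType(
                pair.Value, field.FieldType, CultureInfo.InvariantCulture);
            field.SetValue(algorithm, value);
        }
    }

    public static void Main()
    {
        var algorithm = new SampleAlgorithm();
        var parameterSet = new Dictionary<string, string>
        {
            { "SimpleMovingAveragePeriod", "30" },
            { "StopLossPercent", "0.02" }
        };
        Apply(algorithm, parameterSet);
        Console.WriteLine(algorithm.SimpleMovingAveragePeriod);
        Console.WriteLine(algorithm.StopLossPercent);
    }
}
```

Because the values are applied to the field itself, the field can stay a decimal even if the range metadata describing it uses another type.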

mchandschuh commented 9 years ago

@Vincent-Chin you're correct that attributes can't embed decimal data via named parameters, but maybe we embed the data as double? Since these are Min and Max values, I think a double would express that just as well. We need the parameter itself to be a decimal, but the min/max/step could be double, and maybe another constructor overload takes the step as an int/int ratio (thinking of 1/10, which is not exactly representable in base-2 floating-point doubles).
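
A small sketch of the ratio idea, under the assumption that min and max arrive as doubles and the step arrives as two ints. `RatioStepSketch` and its `Values` method are hypothetical names; the point is only that the decimal values are computed from the index and the ratio instead of by repeatedly adding a double step.

```csharp
using System;
using System.Collections.Generic;

public static class RatioStepSketch
{
    // Generates min, min + n/d, min + 2n/d, ... up to max (inclusive) as decimals.
    // min and max are doubles, as they would be if supplied as attribute arguments.
    public static List<decimal> Values(double min, double max, int stepNumerator, int stepDenominator)
    {
        if (stepNumerator <= 0 || stepDenominator <= 0) throw new ArgumentOutOfRangeException();
        var low = (decimal)min;
        var high = (decimal)max;
        var values = new List<decimal>();
        for (var i = 0; ; i++)
        {
            // Multiply before dividing so each value is computed directly from
            // the index, with no error carried over from earlier steps.
            var value = low + (decimal)i * stepNumerator / stepDenominator;
            if (value > high) break;
            values.Add(value);
        }
        return values;
    }

    public static void Main()
    {
        // Adding a double step of 0.1 three times does not land exactly on 0.3.
        double drift = 0;
        for (var i = 0; i < 3; i++) drift += 0.1;
        Console.WriteLine("double sum equals 0.3: " + (drift == 0.3));

        // The ratio 1/10 in decimal arithmetic does.
        var values = Values(0.0, 0.5, 1, 10);
        Console.WriteLine("decimal value equals 0.3: " + (values[3] == 0.3m));
    }
}
```

Ratios such as 1/3 are still not exact in decimal, so this helps with base-10 steps specifically.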

@jaredbroad - agreed, first step is to get the parameterization work complete and into master so we can all test it out, second step would be to start coming up with the algorithms used to perform the optimization... these algorithms would be defining what values should be back tested.

@Vincent-Chin - if possible, please push your local code to GitHub so we can start to take a peek! I'm getting filled with anticipation waiting to see the new code! :D

Vincent-Chin commented 9 years ago

If you guys don't care about me completing the unit tests / integrating smoothly with the rest of the project, I can push what I have tonight, after I get home.

mchandschuh commented 9 years ago

@Vincent-Chin - Please do continue work on the unit tests and such, I'm just anxious to see some code, so if you could push what you have this evening that would be wonderful. Thanks!

Vincent-Chin commented 9 years ago

Unfortunately I'm not very experienced with Git and have not been able to set up a Pull Request properly; (the fork I attempted to create was some 53 commits behind the master, for some reason.)

So here's a patch: https://dl.dropboxusercontent.com/u/9232427/0001-Draft-of-Parameterization.-Works-on-parameter-defini.patch

It's using a modified config.json that attempts to do a brute force optimization. Fails (hangs) on the second iteration, but it is extracting and assigning different parameter permutations properly.

Vincent-Chin commented 9 years ago

Note that there's a bug in lines 453-456 of the patch; the values should be casting to (decimal). Instead they're casting to (int).

mchandschuh commented 9 years ago

You can push to your fork's master using the following:

$ git push origin master

Also, here's a link to the contributor's guide

chrisdk2015 commented 8 years ago

I tried to do a genetic optimization using the lean engine but it hangs on second run like Vincent-Chin is saying.

I looked around, and somewhere an order gets a status response error, which hangs the engine.

In the terminal it just says Isolator.ExecuteWithTimeLimit a few times.

Setting the engine instance to null and making a new instance doesn't help.

Maybe there is something I am missing?

You can test it out from here

https://github.com/chrisdk2015/LeanOptimization

I just used Config.Get and Set for passing the variables.

mchandschuh commented 8 years ago

Without actually looking into it, I would recommend completely reinitializing everything from scratch. This means creating new instances of the system and algorithm handlers for each Lean engine. To do this, you'll need to avoid FromConfiguration, because it uses the Composer, which caches the results. Instead you can 'new up' the handlers directly or, as I would do, launch each Lean engine instance within its own AppDomain.

chrisdk2015 commented 8 years ago

Ok, I have something that works now that I just committed.

https://github.com/chrisdk2015/LeanOptimization

More hacky than clean but it works.

However, on my machine (an old laptop with an Intel mobile CPU U7300 @ 1.30 GHz running Ubuntu 14.04), what slows the optimization down is the step between exiting the algorithm manager, waiting for the threads to exit, and the next engine being up and running.

There's an approximately 40-second delay there, as you can see from the log:

20151022 09:30:34 Trace:: Engine.Run(): Exiting Algorithm Manager
20151022 09:30:35 ERROR:: StatisticsBuilder.EnsureSameLength(): Padded Performance
20151022 09:30:42 Trace:: FileSystemDataFeed.Exit(): Exit triggered.
20151022 09:30:42 Trace:: BrokerageTransactionHandler.Run(): Ending Thread...
20151022 09:30:42 Trace:: Waiting for threads to exit...
20151022 09:31:12 Trace:: Engine.Main(): Analysis Completed and Results Posted.
20151022 09:31:12 Trace:: FileSystemDataFeed.Exit(): Exit triggered.
Running algorithm with value: 33
20151022 09:31:13 Trace:: Config.Get(): Configuration key not found. Key: plugin-directory - Using default value: 

Maybe you haven't noticed this on faster machines.

I also noticed this on single backtests using the ordinary engine creation logic.

bizcad commented 8 years ago

@chrisdk2015

Good work.

While my machine is not the fastest around it is a 64 bit Win10 and is faster than yours. I cloned your program and after some fiddling to add Accord, got it to compile and run. As I am sure you know, it hung up for 30 seconds or so in Engine.cs at

                //Wait for the threads to complete:
                var ts = Stopwatch.StartNew();
                while ((_algorithmHandlers.Results.IsActive 
                    || (_algorithmHandlers.Transactions != null && _algorithmHandlers.Transactions.IsActive) 
                    || (_algorithmHandlers.DataFeed != null && _algorithmHandlers.DataFeed.IsActive)
                    || (_algorithmHandlers.RealTime != null && _algorithmHandlers.RealTime.IsActive))
                    && ts.ElapsedMilliseconds < 30*1000)
                {
                    Thread.Sleep(100);
                    Log.Trace("Waiting for threads to exit...");
                }

After some poking around, I discovered that you are using the DesktopResultsHandler. I figured it was so you could avoid all that text scrolling by in the console window and just show the genes as they are processed. I think there must be a problem with the DesktopResultsHandler, maybe because you have no desktop.

I changed things around to the ConsoleResultsHandler by going back to the LeanEngineAlgorithmHandlers that they originally used.

            LeanEngineAlgorithmHandlers leanEngineAlgorithmHandlers;
            try
            {
                leanEngineAlgorithmHandlers = LeanEngineAlgorithmHandlers.FromConfiguration(Composer.Instance);
                _resultshandler = leanEngineAlgorithmHandlers.Results;
            }
            catch (CompositionException compositionException)
            {
                Log.Error("Engine.Main(): Failed to load library: " + compositionException);
                throw;
            }

It annoyingly blurs by in the console window, but at least it does not stall. I also added the Composer for the CompositeLogHandler, so everything goes to the log.txt file. I also added to the end of EMATest.cs

        public override void OnEndOfAlgorithm()
        {
            Log(string.Format("\nAlgorithm Name: {0}\n Symbol: {1}\n Ending Portfolio Value: {2} \n Start Time: {3}\n End Time: {4}", this.GetType().Name, symbol, Portfolio.TotalPortfolioValue, starTime, DateTime.Now));
            #region logging
            #endregion
        }

I initialize starTime in the Initialize() function.

I pushed the changes to a repo at https://github.com/bizcad/LeanOpto.git

As an added bonus I have written a log file parser that turns the STATISTICS:: into a csv file. Clone it here: https://github.com/bizcad/ParseQCLogFile.git

Nick Stein nicholasstein@cox.net

chrisdk2015 commented 8 years ago

Thanks Nick,

I have created a new issue so as not to pollute this one:

https://github.com/QuantConnect/Lean/issues/190

I added your suggestions to the code, and my PC now runs at 100% CPU, which means it works as it is supposed to, without delays.

Also multiple variables are supported now.

Note: the operator I have for crossing over has not been fully tested.

jaredbroad commented 8 years ago

Implemented in 2015