trendmanagement / Tmqr-framework-2


Alex - Task - Advise on how to speed up V2 Exo and Alpha Calculations, by Thursday Nov 2 #79

Open spickering-git opened 6 years ago

spickering-git commented 6 years ago

@alexveden The script that calculates exos and alphas runs a loop that, every minute, checks each exo of each instrument to see whether it needs to be calculated; if so, the exo is calculated. It also checks whether the alphas that contain that exo need to be calculated; if so, those alphas are calculated. https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L116

I turned off the verbosity, hoping that would help a bit, using the following: log.disabled = True

Also turned off the slack messaging hoping that would speed things up slightly.

Can you make some recommendations on speeding up the process of exo and alpha calculations?

We are finding that some of the alphas are finishing their calculations after the execution time.

alexveden commented 6 years ago

> Also turned off the slack messaging hoping that would speed things up slightly.

Yeah, that will probably get you about a 0.5% speedup.

My suggestions:

1. Try to reduce DB calls. Like:
https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L112
https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L320
https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L339
https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L445

2. Get only the required fields. MongoDB stores all information about indexes and alphas, including equity series and statistics. For example, this line: https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L320

Here you are pulling all information about the alphas. Pull only the document fields you need instead. Check the PyMongo docs for the .find() and .find_one() methods (projection parameter):

projection (optional): a list of field names that should be returned in the result set or a dict specifying the fields to include or exclude. If projection is a list "_id" will always be returned. Use a dict to exclude fields from the result (e.g. projection={'_id': False}).
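
A minimal sketch of suggestion 2, assuming a PyMongo-style collection object. The function name, the filter key "name", and the projected field names are hypothetical and only illustrate the idea; substitute the fields the script actually reads.

```python
# Hypothetical field names: replace with the fields run_indexes.py really needs.
ALPHA_META_FIELDS = {"name": True, "context": True, "_id": False}


def load_alpha_meta(collection, alpha_name):
    """Fetch one alpha document, returning only the projected fields.

    `collection` is expected to behave like a pymongo Collection;
    `projection` is the documented keyword of find() and find_one().
    """
    return collection.find_one({"name": alpha_name}, projection=ALPHA_META_FIELDS)
```

With a projection like this, large fields such as equity series and statistics are never sent over the wire for this query.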

3. Do caching. Instead of polling the DB each time, fill a cache at script start and use it for metadata.
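
One way suggestion 3 could look, as a sketch only. The class name and the `loader` callable are hypothetical; `loader` stands for whatever function currently queries the DB for one metadata document.

```python
class MetadataCache:
    """Load each metadata document from the DB at most once per script run."""

    def __init__(self, loader):
        # `loader` is any callable that takes a key and queries the DB.
        self._loader = loader
        self._items = {}

    def warm(self, keys):
        """Fill the cache up front, e.g. right after the script starts."""
        for key in keys:
            self.get(key)

    def get(self, key):
        # Only the first request for a key reaches the DB.
        if key not in self._items:
            self._items[key] = self._loader(key)
        return self._items[key]
```

The trade-off is staleness: metadata changed in the DB after the cache is filled is not seen until the script restarts or the entry is dropped.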

4. Use the account position process only once per script lifetime. The execution manager calculates all accounts/instruments/campaigns per run, so you don't need to execute it each time an alpha is calculated.

https://github.com/trendmanagement/Tmqr-framework-2/blob/9b48a16244a152c5d45ab2eb61ea33d6c2eb6f13/tmqrscripts/run_indexes/run_indexes.py#L466
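
A rough sketch of suggestion 4. Both callables are hypothetical stand-ins, not functions from the framework: `calculate_alpha(name)` is assumed to return True when the alpha was recalculated, and `process_positions()` stands for the account position processing step.

```python
def run_batch(alpha_names, calculate_alpha, process_positions):
    """Calculate every alpha first, then process account positions once."""
    updated = [name for name in alpha_names if calculate_alpha(name)]
    if updated:
        # One call for the whole batch instead of one call per alpha.
        process_positions()
    return updated
```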

5. Use line profiler. line_profiler is a very useful tool for finding bottlenecks, and it's already installed on the V2 notebook server, so you can run it right in the notebook. Here is the manual: http://mortada.net/easily-profile-python-code-in-jupyter.html
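
In a notebook, line_profiler is typically used via `%load_ext line_profiler` followed by `%lprun -f some_function some_function(args)`. Outside a notebook, or where line_profiler is not installed, the standard library's cProfile gives a coarser, function-level view. The sketch below uses cProfile, not line_profiler, and `build_series` is only a hypothetical stand-in for a slow calculation.

```python
import cProfile
import io
import pstats


def build_series(n):
    """Stand-in for an expensive calculation step."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()
```

Calling `profile_call(build_series, 100_000)` and printing the second element of the returned tuple shows where the time went.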

6. Separate by process. Threading is generally a bad idea for CPU-bound work in Python. This is a well-known pain point called the Global Interpreter Lock: https://wiki.python.org/moin/GlobalInterpreterLock. Try to run the calculations in separate scripts.
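
A minimal sketch of suggestion 6 using the standard library's subprocess module: one OS process per instrument, each with its own interpreter and its own GIL. `WORKER_CODE` is a hypothetical stand-in for the real per-instrument script, and the function name is illustrative.

```python
import subprocess
import sys

# Stand-in for a per-instrument script; a real one would run the
# exo and alpha calculations for the instrument passed as argv[1].
WORKER_CODE = "import sys; print('processed ' + sys.argv[1])"


def run_instruments_in_parallel(instruments):
    """Start one Python process per instrument and wait for all of them."""
    procs = {
        name: subprocess.Popen(
            [sys.executable, "-c", WORKER_CODE, name],
            stdout=subprocess.PIPE,
            text=True,
        )
        for name in instruments
    }
    results = {}
    for name, proc in procs.items():
        out, _ = proc.communicate()
        results[name] = (proc.returncode, out.strip())
    return results
```

All processes are started before any of them is waited on, so the instruments run concurrently, limited by the number of CPU cores and available memory.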

spickering-git commented 6 years ago

@alexveden Task, troubleshoot untimeliness of exo and alpha scripting.

While analyzing the performance of the V2 scripts with this notebook: https://10.0.1.2:8889/notebooks/V2_script_performance_analysis.ipynb

I found that, for example, running both EXOs and alphas together for US.6J takes about 60 seconds.

I will work on improving that by starting instruments in separate scripts from the main script. But that doesn't explain why the V2 alphas are finishing their calculations at ~10:34: https://tmqrexo.slack.com/archives/G3GPFV878/p1509644064000642

The 6J decision time is: [image]

So, backtracking through the watchdog bot messages: https://tmqrexo.slack.com/archives/G3GPFV878/p1509644039000079 https://tmqrexo.slack.com/archives/G3GPFV878/p1509643976000531

and the alpha logs: http://10.0.1.2:8080/alphas/generic_alpha_exo_stdout.log [image]

Possibly due to data delays? Also, system could be delayed because of calculation overlaps in instruments.

@alexveden please have a look at the watchdog bot messages and the logs. You will probably have some insight into the problem.

alexveden commented 6 years ago

All of the bars arrived right on time:
DataFeed: [RUN] New bar 2017-11-02 10:26:00 DataFeed.6A
DataFeed: [RUN] New bar 2017-11-02 10:26:00 DataFeed.GC
DataFeed: [RUN] New bar 2017-11-02 10:26:00 DataFeed.6J
DataFeed: [RUN] New bar 2017-11-02 10:26:00 DataFeed.6C

V1 takes about 5 minutes to run:
ExoEngine: [RUN] EXOs processed for GC at 2017-11-02 10:26:00 (CalcTime: 248.15s) ExoEngine.GC
ExoEngine: [RUN] EXOs processed for 6J at 2017-11-02 10:26:00 (CalcTime: 288.41s) ExoEngine.6J

> Also, system could be delayed because of calculation overlaps in instruments.

I have 3 hypotheses:

  1. Too many scripts run simultaneously (out of CPU)
  2. Memory is full
  3. Disk high load

alexveden commented 6 years ago

Another idea about speeding up V2.

This is related to almost all alphas which use Continuous Futures, and all AlphaV1HedgeWithIndex alphas.

Building the continuous futures series takes about 2 seconds per alpha, so with about 100 alphas it takes about 200 seconds. This is a bottleneck.

Here is the profiler output: [image]

To fix this we should add start_date=datetime(2016, 1, 1) to the DataManager init line (for online scripts).

Warning: this might affect some alphas that are sensitive to the historical order, like DSP alphas.