quantopian / zipline

Zipline, a Pythonic Algorithmic Trading Library
https://www.zipline.io
Apache License 2.0
17.49k stars 4.71k forks

can't get minute mode to work (still) #2383

Open auwsom opened 5 years ago

auwsom commented 5 years ago

Dear Zipline Maintainers,

I am using a Python 3.5 base conda environment on the command line (and in JupyterLab). I have daily algos working, but I wanted to test my order API on some minutely triggers using a basic algo that just records data: record(QQQ = data.current(context.asset, "close"))

Description of Issue

As the title says, "can't get minute mode to work (still)", and I've tried everything I could think of to get different results. This is similar to https://github.com/quantopian/zipline/issues/1014, among others.

I finally thought I had overcome the benchmark issue by copying the same minute data into SPY_benchmark.csv and removing the columns and header. That threw no fatal errors and it is not attempting to ingest benchmarks (so I'm assuming it is using my replacement file), but I'm still getting a daily output perf.
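One possible source of confusion here: zipline's benchmark input is a series of daily returns, not minute price bars, so copying minute OHLCV rows into SPY_benchmark.csv would not give it what it expects. A minimal pandas-only sketch of deriving a daily-returns benchmark series from minute closes (the sample bars and file handling are hypothetical, not zipline's own loader):

```python
import io

import pandas as pd

# Hypothetical minute bars spanning two sessions, shaped like the bundle CSVs.
minute_csv = io.StringIO(
    "date,open,high,low,close,volume\n"
    "2018-11-30 09:31:00,168.4,168.42,168.33,168.35,1249207.0\n"
    "2018-11-30 09:32:00,168.34,168.36,168.22,168.24,107096.0\n"
    "2018-12-03 09:31:00,169.0,169.1,168.9,169.05,120000.0\n"
    "2018-12-03 09:32:00,169.05,169.2,169.0,169.15,110000.0\n"
)

bars = pd.read_csv(minute_csv, parse_dates=["date"], index_col="date")

# Last close of each calendar day (weekend days resample to NaN and are dropped),
# then daily percentage returns.
daily_close = bars["close"].resample("1D").last().dropna()
benchmark_returns = daily_close.pct_change().dropna()

# A date,return layout with no header is the shape a benchmark series takes.
out = io.StringIO()
benchmark_returns.to_csv(out, header=False)
```

This is only a sketch of the data shape; how the file is wired into zipline's benchmark loading depends on the zipline version in use.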

Is minute-mode testing broken in general, or are there others who have it working? I had the same problem a year or so ago and gave up, and I've spent most of this weekend banging my head against it.

Thank you

zipline run -f mtest.py --start '2018-12-03' --end '2018-12-07' --data-frequency minute

C:\Users\username\Anaconda35\lib\site-packages\empyrical\stats.py:704: RuntimeWarning: invalid value encountered in true_divide
  out=out,
C:\Users\username\Anaconda35\lib\site-packages\empyrical\stats.py:790: RuntimeWarning: invalid value encountered in true_divide
  np.divide(average_annual_return, annualized_downside_risk, out=out)
[2018-12-10 05:36:29.606680] INFO: zipline.finance.metrics.tracker: Simulated 5 trading days
first open: 2018-12-03 14:31:00+00:00
last close: 2018-12-07 21:00:00+00:00

                           QQQ  algo_volatility  algorithm_period_return
2018-12-03 21:00:00+00:00  NaN              NaN                      0.0
2018-12-04 21:00:00+00:00  NaN              0.0                      0.0
2018-12-05 21:00:00+00:00  NaN              0.0                      0.0
2018-12-06 21:00:00+00:00  NaN              0.0                      0.0
2018-12-07 21:00:00+00:00  NaN              0.0                      0.0
...

Update: I tried adding one line to the algo that uses a minutely history value and received the traceback below. This is the same "KeyError: 0" as in the issue linked above from 2016; I haven't found a resolution for it. Note also the last line of the traceback. Below that is the head (5 rows) of my sid 0 data CSV, and finally the output of ingesting my custom minute bundle.

ma = data.history(context.asset, 'close', 2, '1m').mean()

Traceback (most recent call last):
  File "C:\Users\username\Anaconda35\lib\site-packages\zipline\data\minute_bars.py", line 1073, in _open_minute_file
    carray = self._carrays[field][sid]
KeyError: 0
...
zipline.data.bar_reader.NoDataForSid: No minute data for sid 0.

date,open,high,low,close,volume
2018-11-30 09:31:00,168.4,168.42,168.33,168.35,1249207.0
2018-11-30 09:32:00,168.34,168.36,168.22,168.24,107096.0
2018-11-30 09:33:00,168.25,168.25,167.98,167.98,210429.0
2018-11-30 09:34:00,167.97,167.99,167.84,167.99,184980.0
2018-11-30 09:35:00,168.02,168.02,167.77,167.77,103920.0
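A quick sanity check on a data file shaped like the head above — confirming every value in the date column actually parses and the resulting index is monotonic — can rule out the string-to-Timestamp failures discussed later in this thread before blaming the bundle itself. A pandas-only sketch (the sample rows are taken from the head above):

```python
import io

import pandas as pd

sample = io.StringIO(
    "date,open,high,low,close,volume\n"
    "2018-11-30 09:31:00,168.4,168.42,168.33,168.35,1249207.0\n"
    "2018-11-30 09:32:00,168.34,168.36,168.22,168.24,107096.0\n"
    "2018-11-30 09:33:00,168.25,168.25,167.98,167.98,210429.0\n"
)

df = pd.read_csv(sample)

# errors="coerce" turns any unparseable date string into NaT instead of
# raising, so bad rows can be located rather than crashing mid-ingest.
parsed = pd.to_datetime(df["date"], errors="coerce")
bad_rows = df[parsed.isna()]

df["date"] = parsed
df = df.set_index("date").sort_index()
```

If bad_rows is non-empty, those are the lines that would trip up the ingest.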

C:\Users\username\Jupyter>zipline ingest -b custom-csvdir-bundle
Loading custom pricing data: [####################################] 100% | QQQ: sid 0
Merging minute equity files: [------------------------------------] 0
Merging minute equity files: [####################################]
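For anyone comparing setups: a bundle like custom-csvdir-bundle above is normally registered in ~/.zipline/extension.py. A sketch, assuming the built-in csvdir bundle module and a csv/minute/ directory layout (the path is a placeholder, not the original poster's actual configuration):

```python
# ~/.zipline/extension.py (sketch; the csvdir path is a placeholder)
from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities

register(
    "custom-csvdir-bundle",
    # Expects <csvdir>/minute/<SYMBOL>.csv files, e.g. minute/QQQ.csv.
    csvdir_equities(["minute"], "C:/Users/username/Jupyter/csv"),
    calendar_name="NYSE",
)
```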

CapitalZe commented 4 years ago

I'm in the same position. I can successfully ingest my custom minute data, but no sqlite file is generated.

I think that is the bulk of the problem: running the backtest then always fails or errors out somewhere else.

CapitalZe commented 4 years ago

Any progress made on this?

CapitalZe commented 4 years ago

May I ask your source for custom data? Was it Dukascopy?

I am using data from there (with some trials on Finam data) and facing the exact same issue.

I think it has something to do with pandas not being able to convert the string to a timestamp. If you run !zipline bundles (if using a Jupyter notebook) you will likely see the error: ValueError: could not convert string to Timestamp

I think this means the dates are being parsed correctly but the time-of-day data is not, hence the "no minute data for sid" error, and it has nothing to do with the benchmark data per se. This is speculation, though.
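The speculation above is easy to test directly: feed one of the raw date strings from your export to pandas and check whether both the date and the intraday time survive. A pandas-only sketch (the sample strings are invented, in formats raw minute exports commonly use):

```python
import pandas as pd

# Two formats commonly seen in raw minute exports (samples are made up).
iso_style = "2019-11-29 13:43:00"
dot_style = "29.11.2019 13:43:00.000"  # day-first with a dot separator

ts_iso = pd.to_datetime(iso_style)

# An explicit format string stops pandas from guessing (and possibly
# mis-parsing) day-first dates or dropping the fractional seconds.
ts_dot = pd.to_datetime(dot_style, format="%d.%m.%Y %H:%M:%S.%f")
```

If the parsed Timestamp comes back with a midnight time-of-day, the time component is being lost at this step, before zipline ever sees it.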

If anyone has any input on this, please share. It's a nightmare trying to get minute data to work.

I also got fairly close using the Sentdex tutorial, in which he relies on pandas Panels, but this method has gotten me closer.

rajach commented 4 years ago

Did you find a solution for this? First I was getting KeyError: Timestamp('2019-11-29 13:43:00+0000', tz='UTC'), so I loaded the file into a dataframe, applied tz_localize(tz='US/Eastern'), and saved it back. Then the error changed to loading stalling at zero percent.
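For anyone following the tz_localize route above: the usual two-step dance in pandas is to localize the naive timestamps to the exchange's wall-clock zone and then convert to UTC. A minimal sketch (sample timestamps invented; whether your zipline version wants UTC or naive exchange-local times in the CSV is a separate question worth checking):

```python
import pandas as pd

# Naive minute timestamps as they might appear in a raw export.
idx = pd.DatetimeIndex(["2019-11-29 09:31:00", "2019-11-29 09:32:00"])

# Step 1: declare which wall clock these naive stamps represent.
eastern = idx.tz_localize("US/Eastern")

# Step 2: convert to UTC, which is what zipline uses internally.
utc = eastern.tz_convert("UTC")
```

Note that tz_localize on already-aware timestamps raises, so the two calls are not interchangeable.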

CapitalZe commented 4 years ago

No. I am getting KeyError: Timestamp('1990-01-02 00:00:00+0000', tz='UTC'). What is strange is that I do not reference that date at all, and I cannot trace where it comes from; my date ranges are 2017 to 2020.

Sorry I cannot be of more help.

tstevens02127 commented 4 years ago

I've been trying to run minute-level backtests and hitting some issues. I've got it to work now, but my output has a strange quality. Even though I have minute-level input data:

2020-05-08 09:44:00+00:00
2020-05-08 09:45:00+00:00
2020-05-08 09:46:00+00:00

My output zeros out everything but the day, tossing the hour and minute detail. So for a given trading day, I've got a series of 400+ lines (one per minute in the trading day) of results that all share the same timestamp: that day's date. Is this an issue you encountered? What part of this process could lead to it? Output:

2020-05-08 00:00:00+00:00
2020-05-08 00:00:00+00:00
2020-05-08 00:00:00+00:00

@CapitalZe, @auwsom, if you still can't see the minute-level detail, I'm getting it by saving the results of run_algo to a CSV. But again, the datetime is stripped of its hour/minute detail, which I'm trying to fix. Appreciate any insight you might have.