scrtlabs / catalyst

An Algorithmic Trading Library for Crypto-Assets in Python
http://enigma.co
Apache License 2.0
2.49k stars, 725 forks

Ensure that external price data bundles are supported #65

Closed fredfortier closed 7 years ago

fredfortier commented 7 years ago

While we are adding more built-in exchange price data, users may want to backtest with their own. For example, someone may have purchased rare historical data from Coinigy. While this should work in theory with the regular ingest command, we have been focused on exchange bundles, so this needs to be re-validated and fully supported.

vonpupp commented 7 years ago

@fredfortier,

Check PR1860, which adds support for CSV files; it was recently merged.

fredfortier commented 7 years ago

@vonpupp thanks for pointing this out. It's useful.

vonpupp commented 7 years ago

Cool @fredfortier. I am working on two fronts, trying to get both catalyst and zipline to work. I am not yet able to use it with zipline, most probably due to calendar issues. Since you are far more experienced with zipline than I am, if you are able to use it with zipline and could help me with a minimal crypto example, I would greatly appreciate it.

I am definitely also interested in catalyst and in understanding more about it, but since the two are related, all the knowledge I am acquiring with zipline will be helpful with catalyst later on.

Thanks a lot.

fredfortier commented 7 years ago

For now, instead of enabling a generic bundle, I'm adding a "--csv" option to the ingest-exchange command. It might look like this:

catalyst ingest-exchange -x binance --csv binance-eng_eth.csv

This way, the behavior is completely consistent with existing bundles. When we add this data set to our set of bundles, you won't have to make any changes to your algo. All you have to do is provide the right attributes in your CSV, which I will detail shortly.

avn3r commented 7 years ago

Can you give a small template example of what the CSV should look like?

Thanks.

fredfortier commented 7 years ago

Yes, these columns in order without headers:

Hopefully, this should give enough flexibility to use your own data when needed while keeping the API consistent. Please let us know if you have suggestions.


vonpupp commented 7 years ago

Awesome, thank you very much @fredfortier!

fredfortier commented 7 years ago

@vonpupp and @abnera I changed my mind on the columns. After experimenting with it, I found having to specify a header less annoying than having to count or re-order header-less columns.

Here are the proposed required headers:

Example attached for convenience.

bittrex_bat_eth.csv.zip

There is room for additional optional columns, to specify trading amounts and fees for example. But we can add these features separately.

fredfortier commented 7 years ago

This has been implemented. Here is how you would ingest the previously attached sample CSV:

catalyst ingest-exchange -x bittrex --csv ~/Data/bittrex_bat_eth.csv -f minute

We will add this info to the documentation shortly.

fredfortier commented 7 years ago

In release 0.3.9.

vonpupp commented 7 years ago

Thanks @fredfortier.

I will be testing this today. I need to convert the external data to adapt it to the format you use.

I have been given external btc_usd data that I need to use. Since btc_usd is also available on your server, I wonder whether ingesting it under a "different" exchange will work. Something like:

catalyst ingest-exchange -x gdax --csv mydata.csv -f minute

Furthermore, the data is already resampled to half-hour bars, which is also the timeframe I am going to use (along with 1h and 2h). Will ingestion work in this scenario? My gut feeling is that I should ask for raw 1m data, import that, and then resample; but if you have an alternative idea, please let me know.

fredfortier commented 7 years ago

Yes, half-hour data won't work at the moment, unfortunately. It has to be either minute or daily. We are planning to add more flexible data_frequency. An arbitrary exchange name should work.
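For readers in the same situation, one possible workaround (not confirmed in this thread) is to ingest minute data and aggregate it into coarser bars yourself. The sketch below uses only the standard library; the tuple layout `(timestamp, open, high, low, close, volume)` and the function name `resample_bars` are assumptions made for illustration and are not Catalyst's CSV format or API.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def resample_bars(bars, minutes=30):
    """Aggregate OHLCV bars into buckets `minutes` wide.

    `bars` is an iterable of (timestamp, open, high, low, close, volume)
    tuples with naive datetimes. This layout is hypothetical.
    """
    width = timedelta(minutes=minutes)
    buckets = {}
    for ts, o, h, l, c, v in sorted(bars):
        # Floor the timestamp to the start of its bucket.
        start = ts - (ts - EPOCH) % width
        if start not in buckets:
            buckets[start] = [o, h, l, c, v]
        else:
            bar = buckets[start]
            bar[1] = max(bar[1], h)  # highest high
            bar[2] = min(bar[2], l)  # lowest low
            bar[3] = c               # last close wins
            bar[4] += v              # volumes add up
    return [(start, *values) for start, values in sorted(buckets.items())]


# One hour of synthetic minute bars, for demonstration only.
first = datetime(2017, 11, 28, 0, 0)
minute_bars = [
    (first + timedelta(minutes=i), float(i), i + 0.5, i - 0.5, i + 0.25, 1.0)
    for i in range(60)
]
half_hour_bars = resample_bars(minute_bars, minutes=30)
```

The same function covers the 1h and 2h cases by passing `minutes=60` or `minutes=120`. With pandas, `DataFrame.resample` can do the same aggregation.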

vonpupp commented 6 years ago

Hi @fredfortier,

Unfortunately I am not able to use this feature.

If I try to ingest on a fake name exchange as follows:

catalyst ingest-exchange -x csv --csv data/bittrex_bat_eth.csv -f minute

I get the following error:

Error: Invalid value for "-x" / "--exchange-name": invalid choice: csv. (choose from poloniex, bitfinex, bittrex)

If I try as you exemplified (using bittrex), I get the following message:

(.envc-unstable) > $ catalyst ingest-exchange -x bittrex --csv data/bittrex_bat_eth.csv -f minute
Ingesting exchange bundle bittrex...
[2017-12-07 17:59:12.023942] INFO: exchange_bundle: ingesting csv file: data/bittrex_bat_eth.csv

This process finishes really quickly, so I am not sure whether it actually ingested the data. When I try to run a simulation with the bat_eth pair, I get the following error:

[2017-12-07 18:00:34.951097] INFO: run_algo: running algo in backtest mode
[2017-12-07 18:00:35.558723] INFO: exchange_algorithm: initialized trading algorithm in backtest mode
Error traceback: /home/av/repos/customers/mdbasset/.envc-unstable/lib/python2.7/site-packages/catalyst/exchange/exchange.py (line 248)
SymbolNotFoundOnExchange:  Symbol BAT_ETH not found on exchange Bitfinex. Choose from: [u'avt_btc', u'avt_eth', u'avt_usd', u'bcc_btc', u'bcc_usd', u'bch_btc', u'bch_eth', u'bch_usd', u'bcu_btc', u'bcu_usd', u'bt1_btc', u'bt1_usd', u'bt2_btc', u'bt2_usd', u'btc_eur', u'btc_usd', u'btg_btc', u'btg_usd', u'dat_btc', u'dat_eth', u'dat_usd', u'dsh_btc', u'dsh_usd', u'edo_btc', u'edo_eth', u'edo_usd', u'eos_btc', u'eos_eth', u'eos_usd', u'etc_btc', u'etc_usd', u'eth_btc', u'eth_usd', u'etp_btc', u'etp_eth', u'etp_usd', u'iot_btc', u'iot_eth', u'iot_usd', u'ltc_btc', u'ltc_usd', u'neo_btc', u'neo_eth', u'neo_usd', u'omg_btc', u'omg_eth', u'omg_usd', u'qsh_btc', u'qsh_eth', u'qsh_usd', u'qtm_btc', u'qtm_eth', u'qtm_usd', u'rrt_btc', u'rrt_usd', u'san_btc', u'san_eth', u'san_usd', u'xmr_btc', u'xmr_usd', u'xrp_btc', u'xrp_usd', u'yyw_btc', u'yyw_eth', u'yyw_usd', u'zec_btc', u'zec_usd']

I am using the latest version of catalyst, installed via pip (git+https...).

Any ideas, please?

Thank you very much.

fredfortier commented 6 years ago

Don't specify -x; use two dashes: --csv

vonpupp commented 6 years ago

Thanks @fredfortier,

If by "don't specify -x" you mean this:

catalyst ingest-exchange bittrex --csv data/bittrex_bat_eth.csv -f minute

It doesn't work either:

Usage: catalyst ingest-exchange [OPTIONS]

Error: Got unexpected extra argument (bittrex)

I am using two dashes already on the csv option (--csv).

I believe you might not have noticed that I was trying to name the exchange "csv" (hence -x csv), since you said the name didn't matter, but it didn't work. Then I tried to ingest the data using bittrex and it didn't work either. These are the same two scenarios I asked about earlier.

vonpupp commented 6 years ago

@fredfortier, it works.

It was my fault: I was ingesting on one exchange while the strategy used another, and I didn't notice. That said, it is not possible to ingest under a new exchange name like gdax or csv, just to let you know.

I apologize for the confusion.
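For anyone hitting the same two pitfalls, here is a minimal sketch of a pre-flight check that builds the ingest command shown earlier in this thread. The exchange list is copied from the error message above and may differ in other releases; the helper name `build_ingest_command`, its `algo_exchange` parameter, and the `daily` frequency value (inferred from the "minute or daily" comment) are assumptions, not part of Catalyst.

```python
# Choices reported by the "-x" error message quoted in this thread.
SUPPORTED_EXCHANGES = ("poloniex", "bitfinex", "bittrex")
# "minute" appears in the commands above; "daily" is an assumed spelling.
SUPPORTED_FREQUENCIES = ("minute", "daily")


def build_ingest_command(exchange, csv_path, frequency="minute", algo_exchange=None):
    """Return the argument list for `catalyst ingest-exchange` (hypothetical helper).

    Raises ValueError for the mistakes described in this thread:
    an unknown exchange name, or ingesting on a different exchange
    than the one the strategy uses.
    """
    if exchange not in SUPPORTED_EXCHANGES:
        raise ValueError(
            "unknown exchange %r, choose from %s" % (exchange, ", ".join(SUPPORTED_EXCHANGES))
        )
    if frequency not in SUPPORTED_FREQUENCIES:
        raise ValueError("unsupported frequency %r" % frequency)
    if algo_exchange is not None and algo_exchange != exchange:
        raise ValueError(
            "data ingested on %r but the strategy uses %r" % (exchange, algo_exchange)
        )
    return ["catalyst", "ingest-exchange", "-x", exchange, "--csv", csv_path, "-f", frequency]


command = build_ingest_command("bittrex", "data/bittrex_bat_eth.csv", algo_exchange="bittrex")
```

The resulting list can be passed to `subprocess.run(command, check=True)` once you are satisfied with it.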

DanielKillenberger commented 6 years ago

Hi @fredfortier

I'm trying to ingest CSV data as well to do backtesting. The ingestion seems to work, but I get the following error when trying to backtest:

Requested data for trading pair XVG/BTC is not available on exchange bittrex in minute frequency at this time. Check http://enigma.co/catalyst/status for market coverage.

This happens even though, as I said, I ingested the XVG data from a CSV file. Is there a way to check whether the data has been ingested correctly? Any ideas on how to fix the problem?

Thanks in advance!

fredfortier commented 6 years ago

I will try to replicate this today.

DanielKillenberger commented 6 years ago

I'm pretty sure the issue is that the end date of the minute data doesn't get written to the symbols.json file. If I add it manually, backtesting seems to work. I'm trying to figure out where and when you actually write to symbols.json, so that I can add the end date to it within the ingest_csv method.

DanielKillenberger commented 6 years ago

I think it's overwriting symbols_local.json when ingesting a CSV, without updating symbols.json. So any data ingested earlier is no longer in symbols_local.json after you ingest a new file. It should probably parse the JSON file and write back the merged data instead of overwriting symbols_local.json. The overwrite happens in exchange_utils.py on line 170.

Also, it seems to parse symbols.json for backtesting, so it wouldn't work anyway unless there is an option to use symbols_local.json. Is there even any point to the symbols_local.json file, given that the ingested data doesn't distinguish whether it's local or not? Or does it?
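The "parse and write back" idea can be sketched as a read-merge-write helper. This is only an illustration of the approach, not the code in exchange_utils.py: the function name `merge_symbols`, the assumption that the file holds a single JSON object keyed by symbol, and the `end_minute` field are all hypothetical.

```python
import json
import os
import tempfile


def merge_symbols(path, new_symbols):
    """Merge `new_symbols` into the JSON object stored at `path`.

    Existing entries are kept; entries with the same key are replaced.
    Assumes the file, if present, contains one JSON object (hypothetical layout).
    """
    existing = {}
    if os.path.exists(path):
        with open(path) as handle:
            existing = json.load(handle)
    existing.update(new_symbols)
    with open(path, "w") as handle:
        json.dump(existing, handle, indent=2, sort_keys=True)
    return existing


# Two successive ingests against a throwaway file: the first entry survives.
demo_path = os.path.join(tempfile.mkdtemp(), "symbols_local.json")
merge_symbols(demo_path, {"bat_eth": {"end_minute": "2017-11-01"}})
merged = merge_symbols(demo_path, {"xvg_btc": {"end_minute": "2018-02-01"}})
```

A shallow `dict.update` replaces a symbol's whole entry on re-ingest; a field-by-field merge would be needed if entries must keep attributes from earlier ingests.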

fredfortier commented 6 years ago

It's looking for each asset in symbols.json, then symbols_local.json. I'll check to determine what's happening exactly. Thanks for troubleshooting it. We've simplified this in a new release scheduled for next week, but we'll resolve this particular issue ASAP.

DanielKillenberger commented 6 years ago

Cheers! Let me know if you need more info from my side or, even better, if you find a solution ;)

Dan733 commented 6 years ago

I have been able to successfully import custom pricing data from a separate exchange into catalyst using the $ catalyst ingest -b command, but I am running into issues when backtesting.

Catalyst forces me to choose one of its three exchanges and then, when running the backtest, also trades symbols on the chosen exchange in addition to my custom bundle data.

Is there a way to force catalyst to only trade on the custom ingested bundle data?

Edit: Right now I'm considering creating a custom base currency for my data.