hootnot / oanda-api-v20

OANDA REST-V20 API wrapper. Easy access to OANDA's REST v20 API with the oandapyV20 package. Check out the Jupyter notebooks!
MIT License

Asynchronous API support #61

Closed davidandreoletti closed 7 years ago

davidandreoletti commented 7 years ago

Hi @hootnot

Thanks for the API. I am writing a Pandas DataReader to get OANDA currency data and I am using your API to contact OANDA's endpoints.

The reader is currently multithreaded, but its IO performance is limited by the requests dependency used in oanda-api-v20's implementation. Would you consider providing an asynchronous API?

requests mentions this issue in its documentation and provides solutions.

Regards,

David

hootnot commented 7 years ago

Hi @davidandreoletti

Can you provide some example code regarding the performance issue you have, the instruments you request, the frequency ... the details, so that I can reproduce what you experience?

Regards, Feite

davidandreoletti commented 7 years ago

@hootnot

the instruments you request, the frequency ... so the details that I can reproduce what you experience ?

Simply run the OANDA tests here. Each thread can issue no more than 2 requests/s [1] (on my single-core virtual machine). My machine is in Asia hitting a US server, I believe, so there is some latency, but that should not have much impact if the requests were truly asynchronous. FYI: I am aiming for about 10-15 requests/s as a performance target.

[1] Uncomment this line to see the current number of requests/second

hootnot commented 7 years ago

@davidandreoletti

I had some issues running the code you pointed to. I also guess it is your fx-oanda-historical branch I need from your forked repo.

Nevertheless, I did some experiments on oandapyV20 using https://github.com/ross/requests-futures. I made some modifications to the example code of candle-data.py to test. It looks promising: the async version completes in about half the time.

I have created a branch async-requests-futures and added the modified candle-data example.

Please clone it and install it locally, maybe in a separate virtualenv to be sure. Check the candle-data-par.py example for the details. I think the simplest thing to do is to isolate the problem by creating a small script that only does the (parallel) requests you want. The candle-data-par.py example is probably a good start, since you make the same kind of requests. Please let me know if this helps you tackle your problem.
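For reference, the parallel pattern can also be approximated with the standard library alone. This is not the actual candle-data-par.py code; `fetch_fn` is a hypothetical stand-in for a real oandapyV20 `InstrumentsCandles` request, so the threading part can be shown without credentials or network access:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_candles(instrument, fetch_fn, count=5000, granularity="M5"):
    # fetch_fn stands in for a real API call (e.g. client.request(...)
    # on an oandapyV20 InstrumentsCandles endpoint); it is a parameter
    # here so the pattern can be exercised without network access.
    return instrument, fetch_fn(instrument, count, granularity)

def fetch_all(instruments, fetch_fn, max_workers=8):
    # Requests run concurrently; threads block on IO, not on each other.
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_candles, i, fetch_fn) for i in instruments]
        for fut in as_completed(futures):
            instrument, candles = fut.result()
            results[instrument] = candles
    return results

if __name__ == "__main__":
    # Stub that pretends to return candle records.
    stub = lambda instrument, count, gr: [{"instrument": instrument}] * 3
    out = fetch_all(["EUR_USD", "EUR_JPY"], stub)
    print(sorted(out))  # ['EUR_JPY', 'EUR_USD']
```

Swapping the stub for a function that wraps a real oandapyV20 request should give roughly the behaviour of the candle-data-par.py example.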

I will see if/how/when to implement async. on oandapyV20.

(adrenv) hootnot@oatr:~/oanda-api-v20/examples $ time python src/candle-data.py --i EUR_USD --i EUR_JPY --i EUR_AUD --i EUR_GBP --i DE30_EUR --i EUR_ZAR --i US30_USD --i NAS100_USD --count 5000 --nice --gr M5 >out2

real    0m20.875s
user    0m4.636s
sys 0m0.540s
(adrenv) hootnot@oatr:~/oanda-api-v20/examples $ time python src/candle-data-par.py --i EUR_USD --i EUR_JPY --i EUR_AUD --i EUR_GBP --i DE30_EUR --i EUR_ZAR --i US30_USD --i NAS100_USD --count 5000 --nice --gr M5 >out

real    0m9.757s
user    0m4.768s
sys 0m0.696s
(adrenv) hootnot@oatr:~/examples $ wc out out2
  440048   760080  7825682 out
  440048   760072  7825650 out2
  880096  1520152 15651332 total
davidandreoletti commented 7 years ago

@hootnot I looked a bit more into the infamous Python GIL issue, especially how it affects IO/CPU-bound multithreaded applications. Each pure-Python thread must acquire the GIL to run, and a thread waiting for IO automatically releases the GIL, allowing another Python thread to run. The thread owning the GIL runs; every other thread waits.

Basically, for me it is important that each IO-bound thread issues as many async requests as possible (up to a max of 15 reqs/s) before it releases the GIL once it begins blocking on the async requests to get the responses with my_future.result().
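The GIL behaviour described above is easy to observe: threads blocked on IO (simulated here with time.sleep, which releases the GIL just like a socket read) overlap their waits instead of serialising them. A small timing sketch:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_io(seconds=0.2):
    # time.sleep releases the GIL, just like a socket read waiting
    # on an HTTP response does, so other threads may run meanwhile.
    time.sleep(seconds)
    return seconds

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(blocking_io, [0.2] * 4))
elapsed = time.monotonic() - start
# Four 0.2 s waits overlap: the total is ~0.2 s, not ~0.8 s.
print(f"elapsed: {elapsed:.2f}s")
```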

I dug further into the requests-futures package and found that it is basically doing what I am already doing:

So at first glance, it seems not to be a good fit.

hootnot commented 7 years ago

@davidandreoletti Yes, the GIL issue. Everyone doing threading runs into it at some moment ...

The rate limiting is set to 30 requests/second and is IP-bound. So regardless of the threading solution you choose, this is the hard limit; your threading must respect it too. It seems to me that you can only run into this problem when you retrieve a lot of instruments at a high-frequency granularity. Can you share some parameters?
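Whatever threading model is used, that cap has to be enforced client-side. A minimal thread-safe token-bucket sketch (the 30/s figure is taken from this thread, not from any official constant in the library):

```python
import threading
import time

class RateLimiter:
    """Token bucket: allows at most `rate` acquisitions per second."""

    def __init__(self, rate=30):
        self.rate = rate
        self.tokens = float(rate)   # start with a full burst allowance
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self):
        # Block until a token is available, then consume it.
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill proportionally to elapsed time, capped at `rate`.
                self.tokens = min(self.rate,
                                  self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                wait = (1 - self.tokens) / self.rate
            time.sleep(wait)  # sleep outside the lock so others can refill

# Shared across all request threads; call limiter.acquire() before
# every API request to stay under the 30 req/s ceiling.
limiter = RateLimiter(rate=30)
```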

Have you taken a look at my other repo, oanda-trading-environment? It reads the price stream and bakes candle records of your choice. It offers 0MQ pub/sub. You could let it write candle records into https://redis.io/ and query Redis from your application. Redis also does pub/sub (and has Python bindings). Maybe oanda-trading-environment can be a solution for your problem.

davidandreoletti commented 7 years ago

@hootnot

About the 30 reqs/s:

The official documentation says:

To provide equal resources to all clients, we recommend limiting both the number of new connections per second, and the number of requests per second made on a persistent connection (see above/below).

For new connections, we recommend you limit this to once per second (1/s). Establishing a new TCP and SSL connection is expensive for both client and server. To allow a better experience, using a persistent connection will allow more requests to be performed on an established connection.

For an established connection, we recommend limiting this to fifteen requests per second (30/s). Please see following section on persistent connections to learn how to maintain a connection once it is established.

As per the documentation, it seems possible to create several established connections (at a rate of one new connection per second) and make up to 30 req/s per established connection.
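Under that reading, the theoretical ceiling is a simple product: k persistent connections, opened at no more than one per second, times the per-connection request rate. A throwaway sketch of the arithmetic (the numbers come from the guidelines quoted above, and this is my interpretation of them, not an official formula):

```python
def max_throughput(connections, per_connection_rate=30):
    # Hypothetical reading of the quoted OANDA guidelines: each
    # established connection allows per_connection_rate requests/second.
    return connections * per_connection_rate

def ramp_up_time(connections, new_connection_rate=1.0):
    # New connections are recommended at most once per second, so it
    # takes connections / rate seconds before all of them are usable.
    return connections / new_connection_rate

# A single persistent connection already covers the 10-15 req/s
# target mentioned earlier in this thread:
print(max_throughput(1))  # 30
print(ramp_up_time(4))    # 4.0
```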

About 0MQ pub/sub:

I am writing a datareader for pydata/pandas-datareader, and unfortunately Redis/0MQ are out of the picture. Nonetheless, I will keep this good idea in mind :)

hootnot commented 7 years ago

@davidandreoletti It was not clear to me what your goal exactly was; the datareader could also be a spin-off of something else you are trying to accomplish. Maybe I will look into the async connection during X-mas time.