softwarespartan / IB4m

Interactive Brokers API for Matlab
GNU General Public License v2.0

How to make sure old buffer objects are removed #77

Closed ecpgieicg closed 4 years ago

ecpgieicg commented 4 years ago

Hi Abel,

I keep getting java.lang.OutOfMemoryError: Java heap space error from making historical data requests.

How do I make sure that old data in buffers that are no longer in use are freed and stop taking up memory?

For example, after making the following historical data requests

session = TWS.Session.getInstance();
session.eClientSocket.eConnect('127.0.0.1',7497,2);

[buf,lh] = TWS.initBufferForEvent(TWS.Events.HISTORICALDATA);
contract = com.ib.client.Contract();
contract.symbol("VIX");contract.secType("OPT");contract.exchange("SMART");contract.currency("USD");contract.lastTradeDateOrContractMonth("20191119");contract.strike(30);contract.right("Call");

session.eClientSocket.reqHistoricalData(101,contract,'20191029 16:00:00','1 W','1 min','BID',1,1,false,[]); pause(0.5);
session.eClientSocket.reqHistoricalData(102,contract,'20191022 16:00:00','1 W','1 min','BID',1,1,false,[]); pause(0.5);
session.eClientSocket.reqHistoricalData(103,contract,'20191015 16:00:00','1 W','1 min','BID',1,1,false,[]); pause(0.5);
session.eClientSocket.reqHistoricalData(104,contract,'20191008 16:00:00','1 W','1 min','BID',1,1,false,[]); pause(0.5);

If buf and lh are simply overwritten in Matlab for a new request, over time, I get the memory error.

In order to fully release the memory after reading out the data in buf, is it sufficient to do something like the following (although I haven't tried it):

for i = 1:numel(lh); delete(lh{i}); end

or is there a method similar to delete for buf as well?

softwarespartan commented 4 years ago

Your buf is listening for contract details not historical data.

How many data are returned in each historical data request?

What version of Matlab are you using?

What is the current size of your heap memory?

Are you using most recent version of IB4m?

ecpgieicg commented 4 years ago

> Your buf is listening for contract details not historical data.

That was a typo I made when creating this post. Corrected now.

> How many data are returned in each historical data request?

Up to 22k single ticks at a time before I re-create the buffer object in Matlab. The memory problem seems particularly prominent when my script has to wait for the IB server for a long time and then has to reconnect to the API and re-initialize the buffers in order to make a clean query (say, after a server restart, an IB Gateway restart, or network issues). After a few such events, it runs into the memory error and Matlab crashes.

> What version of Matlab are you using?

R2019a

> What is the current size of your heap memory?

2 GB

> Are you using most recent version of IB4m?

Yes

Despair2000 commented 4 years ago

Have you tried clearing the buffer after you retrieved the data? Try issuing a

clear(buf)

softwarespartan commented 4 years ago
  1. Please pull the latest updates. I have pushed an update to GitHub which should provide better support for memory management in the TWS message handler.
  2. Remove each object from your buffer once you consume it. If it is not removed, the JVM can't garbage collect it.
  3. Please increase your heap size in MATLAB.

Object lifetime is difficult to predict so definitely pull updates, clear queues/buf, and increase your maximum available heap memory in MATLAB.
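
To see how close a session is to its limit before and after a batch of requests, the JVM heap can be inspected from the MATLAB prompt. This is a minimal sketch that uses only the standard java.lang.Runtime class; it is not part of IB4m, and the variable names are illustrative.

```matlab
% Report JVM heap usage from inside MATLAB (standard java.lang.Runtime calls).
rt      = java.lang.Runtime.getRuntime();
maxMB   = rt.maxMemory()   / 2^20;   % upper limit the heap may grow to
totalMB = rt.totalMemory() / 2^20;   % heap currently reserved by the JVM
freeMB  = rt.freeMemory()  / 2^20;   % unused part of the reserved heap
usedMB  = totalMB - freeMB;          % heap occupied by live and not-yet-collected objects
fprintf('JVM heap: %.0f MB used of %.0f MB max\n', usedMB, maxMB);
```

The maximum itself is normally raised under Preferences > General > Java Heap Memory, followed by a MATLAB restart.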

Despair2000 commented 4 years ago

@softwarespartan

I pulled your recent update but it generates the following error:

Unrecognized function or variable 'getEventWithRemove'.

softwarespartan commented 4 years ago

did you load the updated jar file?

Despair2000 commented 4 years ago

Yes, I did.

softwarespartan commented 4 years ago

@Despair2000 can you confirm that you see getEventWithRemove when you request the methods of the TWS.Session handler?

methods(session.handler)

Despair2000 commented 4 years ago

No, I do not see the method. I will double check now that the JAR file is correct, although I'm pretty sure it is.

Despair2000 commented 4 years ago

I checked and the JAR-file is the version you uploaded today. Did you maybe accidentally push the old jar-file?

softwarespartan commented 4 years ago

Strange -- apologies for the difficulty. I double checked and I pushed the correct jar file. You might need to clear all or restart MATLAB.

I downloaded the jar file from the GitHub repo, loaded it, and I do see the getEventWithRemove there.

Despair2000 commented 4 years ago

Restarting MATLAB did the trick. Now the method shows up. Sorry, I should have tried this before posting that it doesn't work.
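
For anyone hitting the same symptom: MATLAB keeps Java classes it has already loaded, so a replaced jar file is generally not picked up until the classes are reloaded (clear java for jars on the dynamic path, a restart for jars on the static path). The check below is a small sketch of how to test for a method programmatically. It uses java.util.ArrayDeque as a stand-in object so that it runs without a TWS connection; for the real check, substitute session.handler and 'getEventWithRemove'.

```matlab
% Stand-in object: java.util.ArrayDeque, used only so the snippet runs anywhere.
obj   = java.util.ArrayDeque();
names = methods(obj);                       % cell array of method names
hasRemove = any(strcmp(names, 'remove'));   % true if the class exposes 'remove'
fprintf('remove present: %d\n', hasRemove);

% List the jar files on the dynamic Java class path (added with javaaddpath).
dynamicPath = javaclasspath('-dynamic');
disp(dynamicPath);
```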

ecpgieicg commented 4 years ago

@softwarespartan

Hi Abel,

I downloaded the latest IB4m version. Thank you for the update!

I added clear buf lh at the end of each request as well as cellfun(@delete, lh) before it. Do I need to directly call the session.handler.removeHistoricalDataListener method or does clear/delete invoke the method already?

And indeed I have been using buf.remove.data to read the data returned in the buffers. It seems the JVM has had difficulty garbage collecting nevertheless. (The frozen Matlab instance with the memory error would continue to consume as much CPU as Windows lets it, presumably struggling with garbage collection.)

As for increasing the heap memory size, I am actually running out of system memory on the machine I use to pull data from IB. Hopefully 2 GB will be enough after the update. ><

You mention "clear queues". Do you mean cancelling existing data requests with IB Gateway and the IB server when restarting or repeating a query?

softwarespartan commented 4 years ago

All things considered, it is a lot of data and 2 GB is not a lot of memory. My laptop from 2014 has 16 GB RAM and I run MATLAB with 8 GB max heap. My desktop machine has 256 GB of RAM. If you really want to tightly manage memory, MATLAB is not the right platform.

Listener handles (lh) and buffers (buf) are not consuming memory. The data items in the buffer (returned by TWS) are what consume memory. Clearing a listener handle will not help. Similarly, a data request itself does not consume memory, rather it is the data returned by the request.

Instead of using get to retrieve an item from the queue (buf) you can alternatively use event=buf.remove() to get the next event in the buffer. This ensures that items/events are removed upon retrieval.
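
The remove-on-retrieval pattern can be sketched as below. A plain java.util.ArrayDeque stands in for buf here so that the snippet runs without a TWS connection; this is an assumption for illustration, not the buffer type IB4m actually creates, and the numeric items are placeholders for returned events.

```matlab
% Stand-in for buf: a plain java.util.ArrayDeque, NOT the IB4m buffer type.
q = java.util.ArrayDeque();
for k = 1:5
    q.add(k);                  % pretend each item is one returned event
end

consumed = zeros(1, 0);
while ~q.isEmpty()
    item = q.remove();         % returns the head and drops the queue's reference to it
    consumed(end+1) = item;    % process the item here; only the copied value is kept
end
clear item                     % drop the last MATLAB-side reference as well

fprintf('%d items consumed, %d left in queue\n', numel(consumed), q.size());
```

Once the queue and the MATLAB workspace no longer reference an item, it becomes eligible for garbage collection; when it is actually collected is up to the JVM.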

If MATLAB still freezes then you need to either increase your max heap size or use less data.

And, be very sure to understand that you don't have to put the events in a queue at all. You can just process them as they arrive and let them evaporate. Just define a different callback function. There are many examples of this already in IB4m and also lots of MATLAB docs:

https://www.mathworks.com/help/matlab/matlab_oop/learning-to-use-events-and-listeners.html

ecpgieicg commented 4 years ago

I have been using .remove and retrieving the data as they arrive.

My next question is: can I be confident that Matlab/JVM will release the memory occupied by the output of .remove?

Say, I do something like

bars={};
bars=[bars; collection2cell(buf.remove().data)];

Whatever buf.remove creates is also a Java object. Will that be cleared at the end of the line?

The reason I don't have more memory is that I am running multiple instances of Matlab in order to retrieve historical and real time data from IB. It seems each username with IB has a hard cap of bandwidth when downloading historical data (and a cap of simultaneous subscriptions of real time data). That max bandwidth cap is only attained with 2 separate API clients for the same username. So that makes at least 2 Matlab instances per username whenever I request data. It'd take too much time to download anything useful otherwise. And that's why I am splitting available RAM to the different Matlab instances.

softwarespartan commented 4 years ago

I don't think so.

The most you can do is not have any active references to an object. The JVM runs garbage collection whenever it wants. You can add additional arguments to the JVM to configure/tune the gc but that is expert only territory and can easily induce other unwanted side effects.
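
As a rough illustration of "no active references", the sketch below allocates a Java object, drops the only reference to it, and then asks the JVM to collect. It uses only standard Java classes (java.nio.ByteBuffer, java.lang.System, java.lang.Runtime); the 20 MB size is an arbitrary choice, and System.gc() is a request that the JVM is free to ignore or delay, so the reported numbers will vary from run to run.

```matlab
rt     = java.lang.Runtime.getRuntime();
usedMB = @() (rt.totalMemory() - rt.freeMemory()) / 2^20;

blob   = java.nio.ByteBuffer.allocate(20 * 2^20);  % about 20 MB on the Java heap
before = usedMB();

clear blob                 % drop the only reference; the object is now unreachable
java.lang.System.gc();     % a hint only: the JVM decides if and when to collect
after  = usedMB();

fprintf('used heap: %.0f MB before, %.0f MB after the gc request\n', before, after);
```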

Latest update should help a lot on the memory management. Keep me posted.

ecpgieicg commented 4 years ago

And to be sure, each tick stored in the single hash set buf.remove.data should only be about 40 bytes, right? That is the sum of reqId, dateStr, high, low, open, close, and vol. Even at 100 bytes per tick, 22k ticks is only about 2.2 MB. (22k is 8 weeks of 1min ticks.) My concern is that if the output of buf.remove is not cleared from memory, then regardless of the heap size, Matlab will still crash after some time. (I collect data continuously.)

Again, thank you for the update. I will keep you posted.

ecpgieicg commented 4 years ago

@softwarespartan

Early update: judging by the working-set sizes of my Matlab instances, I think the update worked. Each instance used to exceed 1.2 GB after a few hours and keep growing; now each sits at only 0.75 GB and stays constant as more requests are made. Also, with the update, manually issuing clear in Matlab, as in clear buf or temp=buf.remove; do_stuff(temp); clear temp, makes no difference.

softwarespartan commented 4 years ago

Great!

ecpgieicg commented 4 years ago

@softwarespartan @Despair2000 An update. I have run IB4m more or less continuously since the last time we spoke. (There were interruptions caused by IB Gateway.) There hasn't been a Java heap memory error since. Also, even though Matlab is allowed 2 GB of memory for Java and more for itself, it has never used more than 1.6 GB while running IB4m and making historical data requests. So the update worked as intended. Thank you, Abel!

Also, curiously, on Win 7, Matlab+IB4m would take up 1.2 GB on average per instance, but only 0.7 GB on Win 10.