connamara / quickfixn

QuickFIX/n implements the FIX protocol on .NET.
http://quickfixn.org

OutOfMemoryException from FileStore over time #685

Open rmay-tws opened 3 years ago

rmay-tws commented 3 years ago

Over time, the FileStore can throw an OutOfMemoryException when the internal dictionary holding message offsets grows past the maximum size a dictionary can have. Although the exception is "OutOfMemory", it is not actually an out-of-memory condition: the internal generic dictionary resize tries to allocate an array larger than the maximum allowed array size. This may be caused by sequence numbers that never reset. For example, if the ATS doesn't reset to 1 but continues each day at an ever-increasing number, the overflow can happen even if messages aren't stored.

You can see this exception by doing the following:

    var dictionary = new Dictionary<string, string>(int.MaxValue);
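As a slightly fuller version of that one-liner, here is a minimal sketch that wraps the allocation in a try/catch so the failure can be observed without crashing the process. The class and method names (`DictionaryCapacityDemo`, `ThrowsOutOfMemory`) are made up for illustration and are not part of QuickFIX/n. It assumes a runtime where a requested capacity of `int.MaxValue` exceeds the maximum array length, which should be the case on current .NET.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of QuickFIX/n.
public static class DictionaryCapacityDemo
{
    // Returns true if asking for the given initial capacity throws
    // OutOfMemoryException, false if the dictionary was created normally.
    public static bool ThrowsOutOfMemory(int capacity)
    {
        try
        {
            // The dictionary sizes its internal arrays from the requested capacity.
            _ = new Dictionary<string, string>(capacity);
            return false;
        }
        catch (OutOfMemoryException)
        {
            // Thrown because the requested array length is not allowed,
            // not because the machine ran out of physical memory.
            return true;
        }
    }
}
```

Calling `DictionaryCapacityDemo.ThrowsOutOfMemory(int.MaxValue)` should report the exception, while a small capacity such as 16 should not.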

There are several approaches to fix this:

  1. Use a dictionary of dictionaries keyed by a mod value. For example, to shard the data over 10 dictionaries, something like this could be used:

         var headerId = Convert.ToInt32(headerParts[0]);
         offsets_[headerId % 10][headerId] = new MsgDef(Convert.ToInt64(headerParts[1]), Convert.ToInt32(headerParts[2]));

  2. Use some other file memory storage or build a custom memory storage that doesn't have the limits imposed by dictionary.
  3. Don't use a dictionary at all. This doesn't seem to be used in a critical performance section, so you could just put all of the metadata needed inside of the MsgFile itself and then scan to handle cases where replay of messages is required.
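To make option 1 more concrete, here is a rough sketch of a sharded offset table. It is only an illustration of the idea: `ShardedOffsets` is a hypothetical class, and the `MsgDef` struct below is a stand-in I defined for the example (a file offset plus a length), not the actual type inside QuickFIX/n. The shard count of 10 mirrors the example above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the store's MsgDef: a file offset and a message length.
public readonly struct MsgDef
{
    public long Index { get; }
    public int Size { get; }
    public MsgDef(long index, int size) { Index = index; Size = size; }
}

// Hypothetical sharded replacement for a single Dictionary<int, MsgDef>.
public sealed class ShardedOffsets
{
    private readonly Dictionary<int, MsgDef>[] shards_;

    public ShardedOffsets(int shardCount = 10)
    {
        if (shardCount <= 0) throw new ArgumentOutOfRangeException(nameof(shardCount));
        shards_ = new Dictionary<int, MsgDef>[shardCount];
        for (int i = 0; i < shardCount; i++)
            shards_[i] = new Dictionary<int, MsgDef>();
    }

    // Picks the shard by sequence number modulo the shard count.
    // The extra "+ n) % n" keeps the index non-negative for any int input.
    private Dictionary<int, MsgDef> ShardFor(int seqNum)
    {
        int n = shards_.Length;
        return shards_[((seqNum % n) + n) % n];
    }

    public void Set(int seqNum, MsgDef def) => ShardFor(seqNum)[seqNum] = def;

    public bool TryGet(int seqNum, out MsgDef def) => ShardFor(seqNum).TryGetValue(seqNum, out def);

    // Total number of stored entries across all shards.
    public long Count
    {
        get
        {
            long total = 0;
            foreach (var shard in shards_) total += shard.Count;
            return total;
        }
    }
}
```

With this shape, each inner dictionary holds roughly one tenth of the entries, so any single resize asks for a much smaller array. Whether that is enough in practice depends on how many offsets the store really accumulates.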

Stack Trace:

    System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
       at System.Collections.Generic.Dictionary`2.Resize(Int32 newSize, Boolean forceNewHashCodes)
       at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
       at System.Collections.Generic.Dictionary`2.set_Item(TKey key, TValue value)
       at QuickFix.FileStore.ConstructFromFileCache()
       at QuickFix.FileStore.open()
       at QuickFix.FileStore..ctor(String path, SessionID sessionID)
       at QuickFix.FileStoreFactory.Create(SessionID sessionID)
       at QuickFix.Session..ctor(Boolean isInitiator, IApplication app, IMessageStoreFactory storeFactory, SessionID sessID, DataDictionaryProvider dataDictProvider, SessionSchedule sessionSchedule, Int32 heartBtInt, ILogFactory logFactory, IMessageFactory msgFactory, String senderDefaultApplVerID)
       at QuickFix.SessionFactory.Create(SessionID sessionID, Dictionary settings)
       at QuickFix.AbstractInitiator.CreateSession(SessionID sessionID, Dictionary dict)
       at QuickFix.AbstractInitiator.Start()
       at Obfuscated

rmay-tws commented 3 years ago

More details on this. Here's an example of the seqnums file content (note: file names have been obfuscated):

/mnt/quickfix-store/AFixSession $ cat FIX.4.4-****-****.seqnums
0001395264 : 0001132834

As you can see, the range is actually only 262,430. However, because of the implementation, a dictionary is created with 1,132,834 entries, which causes the exception.
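For reference, the 262,430 figure is just the difference between the two numbers in the seqnums line. A small sketch of that arithmetic follows; `SeqNumsLine` is a hypothetical helper written for this comment, and it only assumes the "number : number" layout shown in the file above, without saying which side is the sender or the target.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for reading a "first : second" seqnums line.
public static class SeqNumsLine
{
    // Splits the line on ':' and parses both zero-padded numbers.
    public static (int First, int Second) Parse(string line)
    {
        string[] parts = line.Split(':');
        if (parts.Length != 2)
            throw new FormatException("Expected two numbers separated by ':'.");
        int first = int.Parse(parts[0].Trim(), CultureInfo.InvariantCulture);
        int second = int.Parse(parts[1].Trim(), CultureInfo.InvariantCulture);
        return (first, second);
    }

    // Absolute difference between the two sequence numbers.
    public static int Gap(string line)
    {
        var (first, second) = Parse(line);
        return Math.Abs(first - second);
    }
}
```

`SeqNumsLine.Gap("0001395264 : 0001132834")` gives 262430, since 1,395,264 minus 1,132,834 is 262,430.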

Specifically, in the file store, line 135:

    offsets_[Convert.ToInt32(headerParts[0])] = new MsgDef(
        Convert.ToInt64(headerParts[1]), Convert.ToInt32(headerParts[2]));

oscar54321 commented 3 years ago

Have you resolved this issue? Note that headerParts is read from the .header file, which contains a comma-separated list, not from the .seqnums file.
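To illustrate the point about the .header file, here is a minimal sketch of parsing one comma-separated entry into the three values that the quoted line converts. The field meanings (sequence number, offset, length) are an assumption based on the `Convert.ToInt32` / `Convert.ToInt64` / `Convert.ToInt32` calls quoted earlier in this thread, and `HeaderEntry` is a hypothetical name, not a QuickFIX/n type.

```csharp
using System;
using System.Globalization;

// Hypothetical parser for a single "seqNum,offset,length" header entry.
public static class HeaderEntry
{
    public static (int SeqNum, long Offset, int Length) Parse(string entry)
    {
        string[] headerParts = entry.Trim().Split(',');
        if (headerParts.Length != 3)
            throw new FormatException("Expected three comma-separated values.");

        // Same conversions as the quoted line: int, long, int.
        int seqNum = Convert.ToInt32(headerParts[0], CultureInfo.InvariantCulture);
        long offset = Convert.ToInt64(headerParts[1], CultureInfo.InvariantCulture);
        int length = Convert.ToInt32(headerParts[2], CultureInfo.InvariantCulture);
        return (seqNum, offset, length);
    }
}
```

The dictionary key therefore comes from the first field of each header entry, so the number of dictionary entries follows the number of header entries rather than the values in the .seqnums file.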