shakshin / isoparser

ISO 8583 file parser

How to process a huge file of 1 GB #3

Closed kbjan26 closed 2 years ago

kbjan26 commented 2 years ago

Hello There,

What is the best way to parse a very large file? The code takes the whole file as a single stream and then reads the messages from it sequentially.

Is there an alternative way to handle this?

shakshin commented 2 years ago

Hi! It depends on what exactly you need. Currently the application parses the whole file and creates a Java object in RAM for each message, so you can run out of memory on huge files. If you just want to find messages that match some condition, you can modify the code to check that condition before adding the next message to the list. Unmatched messages are then never referenced by the list and can be reclaimed by the garbage collector, which should reduce memory usage.
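
As a rough illustration of checking a condition before adding a message to the list, here is a minimal, self-contained sketch. It does not use the parser's real classes: `Message`, its `mti` field, and the `next` supplier are hypothetical stand-ins (the supplier plays the role of whatever call returns the next parsed message, or null at end of input), and the filter condition is only an example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

class FilteredRead {
    // Hypothetical stand-in for a parsed message; only the fields this sketch needs.
    static final class Message {
        int number;
        final String mti; // assumed field holding the message type indicator
        Message(String mti) { this.mti = mti; }
    }

    // Reads messages until 'next' returns null and keeps only those accepted by 'keep'.
    // Rejected messages are never stored, so they become eligible for garbage collection.
    static List<Message> readMatching(Supplier<Message> next, Predicate<Message> keep) {
        List<Message> messages = new ArrayList<>();
        int parsed = 0;
        while (true) {
            Message msg = next.get();
            if (msg == null) break;
            msg.number = ++parsed; // position in the input, independent of how many were kept
            if (keep.test(msg)) {
                messages.add(msg);
            }
        }
        return messages;
    }

    public static void main(String[] args) {
        Iterator<Message> it = Arrays.asList(
                new Message("0100"), new Message("0200"), new Message("0100")).iterator();
        List<Message> kept = readMatching(
                () -> it.hasNext() ? it.next() : null,
                m -> m.mti.equals("0100"));
        for (Message m : kept) {
            System.out.println(m.number + " " + m.mti);
        }
    }
}
```

Note one design choice in the sketch: messages are numbered by their position in the input rather than by the size of the list, so the numbers of the kept messages still point back to where they appeared in the file.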

kbjan26 commented 2 years ago

To be precise, are you referring to adding a condition in the snippet below? Will that resolve the out-of-memory (heap) issue?

    while (true) {
        Trace.log("IsoFile", "### Parsing message number " + (messages.size() + 1));
        IsoMessage msg = IsoMessage.read(cfg, in);
        if (msg == null) break;
        msg.number = messages.size() + 1;
        messages.add(msg);
    }

Or can you suggest a way to read the file line by line, parsing and adding the messages as it goes?

shakshin commented 2 years ago

Yes, you've found the right place in the code. You now have two options:

  1. You can filter out unneeded messages by adding an "if" condition before the messages.add invocation. The application will work as usual but ignore the filtered messages, so it will print only the messages that were not filtered out by the condition.
  2. Or you can replace messages.add with System.out.println(msg.toString()). The application will then print each parsed message without keeping it on the heap. However, in this case the app will not be able to perform any analysis at the end.
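
Option 2 could look roughly like the sketch below: each message is handed to a handler (here, one that prints it) and is not stored anywhere. This is only an illustration under assumptions, not the parser's actual code: `Message`, its `mti` field, `forEachMessage`, and the `next` supplier are hypothetical names, with the supplier standing in for the call that returns the next parsed message or null at end of input.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.function.Supplier;

class StreamingRead {
    // Hypothetical stand-in for a parsed message.
    static final class Message {
        final String mti; // assumed field, used only to have something to print
        Message(String mti) { this.mti = mti; }
        @Override public String toString() { return "Message[mti=" + mti + "]"; }
    }

    // Passes each message to 'handler' and keeps no reference to it afterwards.
    // Returns how many messages were read, the only state retained across iterations.
    static long forEachMessage(Supplier<Message> next, Consumer<Message> handler) {
        long count = 0;
        Message msg;
        while ((msg = next.get()) != null) {
            count++;
            handler.accept(msg); // unless the handler stores it, msg is collectable after this iteration
        }
        return count;
    }

    public static void main(String[] args) {
        Iterator<Message> it = Arrays.asList(
                new Message("0100"), new Message("0200")).iterator();
        long total = forEachMessage(
                () -> it.hasNext() ? it.next() : null,
                m -> System.out.println(m));
        System.out.println("messages parsed: " + total);
    }
}
```

With this shape, memory use depends on what the handler retains rather than on the number of messages in the file. The trade-off is the one noted above: anything that needs the full list at the end is no longer available.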
kbjan26 commented 2 years ago

Closing this as per the suggestion from Shakshin. Thank you for always being prompt with your responses.

For a large volume of data, keeping track of every message in a LinkedList won't work. I'm mentioning it here for everyone's benefit; it's not an issue with the parser itself. It has to do with the amount of data one needs to handle.