abstractvector / node-dbf

An efficient dBase DBF file parser written in pure JavaScript
MIT License

Read only the necessary bytes when parsing the header #36

Open DavidBruant opened 5 years ago

DavidBruant commented 5 years ago

I use this library in a project I'm currently working on. It works beautifully, thank you very much!

The files I use it on are around 1.5GB. The streaming of the body is super cool. However, I noticed that to parse the header, the entire file was being read into memory.

This is due to the use of fs.readFile in src/header.js. In this PR, the code is a bit lower-level so that it reads only the necessary bytes. The idea is: 1) read only the start of the header from the file, 2) use the header length stored there to know how many bytes the header occupies, and read only that many bytes.
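To make the two steps concrete, here is a minimal sketch of the approach. It is not the actual diff from this PR. `readHeaderBytes` is a hypothetical helper that is not part of node-dbf, and it uses the synchronous fs API only to keep the example short. It assumes the standard dBase layout, where the header length is a 16-bit little-endian integer at byte offset 8.

```javascript
const fs = require('fs');

// Assumption: standard dBase layout, where bytes 8-9 hold the total
// header length as a 16-bit little-endian integer.
const HEADER_LENGTH_OFFSET = 8;
// Smallest prefix that fully covers the header length field.
const PREFIX_SIZE = 10;

// Hypothetical helper (not part of node-dbf): returns a Buffer containing
// only the header bytes, without reading the record data that follows.
function readHeaderBytes(filename) {
  const fd = fs.openSync(filename, 'r');
  try {
    // Step 1: read just the start of the header.
    const prefix = Buffer.alloc(PREFIX_SIZE);
    const prefixRead = fs.readSync(fd, prefix, 0, PREFIX_SIZE, 0);
    if (prefixRead < PREFIX_SIZE) {
      throw new Error('File too short to contain a DBF header');
    }

    // Step 2: the header states its own length, so read exactly that much.
    const headerLength = prefix.readUInt16LE(HEADER_LENGTH_OFFSET);
    const header = Buffer.alloc(headerLength);
    const headerRead = fs.readSync(fd, header, 0, headerLength, 0);
    if (headerRead < headerLength) {
      throw new Error('Truncated DBF header');
    }
    return header;
  } finally {
    // Always release the file descriptor, even if a read fails.
    fs.closeSync(fd);
  }
}
```

The returned Buffer can then be handed to whatever parsing logic already consumes the header, so memory use depends on the header size and not on the file size.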

I'm happy to discuss the change further if what I did here is unclear or if it doesn't adhere to the project's standards.

DavidBruant commented 5 years ago

I can confirm that this change had an amazing perf impact on the header reading in my 1.5GB files. It's pretty much instantaneous now, and I can run my script alongside my web browser without running out of memory :ok_hand: