MichaelQQ / dbfstream

dbf file parser (stream version)

Not reading Integer column correctly #11

Open LAPSrj opened 4 years ago

LAPSrj commented 4 years ago

Hi!

I'm trying to read a file that has some Integer columns.

I saw that for this type of column the raw data is returned, so I parsed it using readInt32LE(). At first most of the data in those columns was read incorrectly; then I found the problem was the "replace" that strips whitespace, so I changed it to check whether the column type is "I" and only trim if it isn't.

After that, most of the data started reading correctly, but in a table with ~9500 records, 25 still aren't read correctly. In those rows, more than 4 bytes are being read for that column. I couldn't find any pattern to this behavior, and the other columns in those rows are read correctly.

What information would help troubleshoot this problem?

Thanks in advance.

MichaelQQ commented 4 years ago

Hi @LAPSrj,

Could you provide some sample files that have integer columns and produce the incorrect results you described? Thank you.

LAPSrj commented 4 years ago

Hi @MichaelQQ,

I ended up finding what was causing it and a workaround. Iconv was adding bytes to those columns (it happened with every encoding I tried).

What I did was check the column type, and if it's "I", assign the raw bytes without passing them through iconv:

```javascript
if (now.type != 'I') {
  var value = iconv
    .decode(data.slice(acc, acc + now.length), encoding)
    .replace(/^\s+|\s+$/g, '');
} else {
  var value = data.slice(acc, acc + now.length);
}
```

On the dataTypes object I also added an entry for the I type:

```javascript
I(data) {
  return Buffer.from(data).readInt32LE();
}
```

If you want I can make a pull request with those changes.

MichaelQQ commented 4 years ago

Hi @LAPSrj,

If you could make a pull request, that would be great. Thank you for sharing the workaround.