no2chem / bigint-buffer

💪🔢 bigint-buffer: Buffer Utilities for TC39 BigInt Proposal
Apache License 2.0
54 stars 11 forks source link

mocha failure #21

Open krasic opened 4 years ago

krasic commented 4 years ago

Hi. I am trying to change parquetjs to support 64 bit integers using bigint-buffer. The change seems straightforward and works in client code I have. However, I want to update parquetjs tests to reflect my change, and when I try to run their test I see this failure:

`$ npm test

parquetjs@0.8.0 test /home/***/parquetjs mocha

ParquetCodec::PLAIN ✓ should encode BOOLEAN values ✓ should decode BOOLEAN values ✓ should encode INT32 values ✓ should decode INT32 values node: ../src/bigint-buffer.c:132: fromBigInt: Assertion status == napi_ok' failed. Aborted (core dumped) I gather this is something related to native modules, but I'm a n00b in that regard. Any ideas?

no2chem commented 4 years ago

Thanks for reporting this. Do you have a fork with the changes you've made? I could take a look into it.

krasic commented 4 years ago

PR here: Use bigint for int64

no2chem commented 4 years ago

Hm, I see the issue.

Your call to toBufferLE passes a number instead of a bigint: https://github.com/ZJONSSON/parquetjs/pull/37/files#diff-b67720d607909e3cae702d746c5bf6f0R53

bigint-buffer doesn't know how to handle numbers so it errors out. moreover, you're passing an array of numbers, so it really doesn't know how to handle this.

There's two fixes needed for this:

one is that we need to handle numbers as well as bigints. second, is that we need to be able to write into a buffer you've allocated. This would eliminate the need for copy and allow you to iterate over the entire array, resulting in multiple n-api transitions.

I'd be happy to look into this, though I'm not sure if I can give you an ETA.

no2chem commented 4 years ago

@krasic - also, I'm interested in what you're trying to do with parquet. if you're looking for performance, it would seem that implementing native bindings would be useful, though I don't know if that's compatible with parquetjs's goals.

no2chem commented 4 years ago

@krasic sorry for the spam, but one more thing: it looks like node 12 has now implemented Buffer.writeBigInt64BE|LE, which probably implements all the functionality you need. https://nodejs.org/api/buffer.html#buffer_buf_writebigint64be_value_offset

It does seem like the underlying implementation is in javascript though, https://github.com/nodejs/node/blob/e71a0f4d5faa4ad77887fbb3fff0ddb7bca6942e/lib/internal/buffer.js#L590

So it'll be interesting to benchmark to see what the performance difference is.

krasic commented 4 years ago

On Mon, Jan 6, 2020 at 8:17 PM Michael Wei notifications@github.com wrote:

@krasic https://github.com/krasic - also, I'm interested in what you're trying to do with parquet. if you're looking for performance, it would seem that implementing native bindings would be useful, though I don't know if that's compatible with parquet.

The issue I am trying to address is that parquetjs fails reading some files I have that are generated by an apache pig based the job, because those files contain 64 bit integer values. I believe incomplete 64 bit integer support is a known issue with parquetjs.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/no2chem/bigint-buffer/issues/21?email_source=notifications&email_token=ABRABBU3R6KMXJYNREW7I53Q4P66ZA5CNFSM4KDI3FT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIHUKTQ#issuecomment-571426126, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABRABBT3UCWX5GKOLF7LE2DQ4P66ZANCNFSM4KDI3FTQ .

krasic commented 4 years ago

On Mon, Jan 6, 2020 at 8:23 PM Michael Wei notifications@github.com wrote:

@krasic https://github.com/krasic sorry for the spam, but one more thing: it looks like node 12 has now implemented Buffer.writeBigInt64BE|LE, which probably implements all the functionality you need. https://nodejs.org/api/buffer.html#buffer_buf_writebigint64be_value_offset

It does seem like the underlying implementation is in javascript though,

https://github.com/nodejs/node/blob/e71a0f4d5faa4ad77887fbb3fff0ddb7bca6942e/lib/internal/buffer.js#L590

So it'll be interesting to benchmark to see what the performance difference is.

Ah. I think that code may be the most promising solution for me. My use case isn't performance critical.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/no2chem/bigint-buffer/issues/21?email_source=notifications&email_token=ABRABBWAQG4ZXYYYETC5OKTQ4P7SPA5CNFSM4KDI3FT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIHUS4A#issuecomment-571427184, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABRABBS6QOCA66LFPB6FAILQ4P7SPANCNFSM4KDI3FTQ .