luizperes / simdjson_nodejs

Node.js bindings for the simdjson project: "Parsing gigabytes of JSON per second"
https://arxiv.org/abs/1902.08318
Apache License 2.0
549 stars 25 forks source link

Error while starting #23

Closed dalisoft closed 4 years ago

dalisoft commented 4 years ago

Hi @luizperes and thanks for such great project.

I'm tried run this project, but there got some errors.

nodemon

[nodemon] app crashed - waiting for file changes before starting...
[nodemon] restarting due to changes...
[nodemon] starting `node --enable-source-maps --trace-uncaught bench-json-str.js`
[nodemon] app crashed - waiting for file changes before starting...
^Z⏎                                                                                                                          
turbo-json on  master [!?] is 📦 v1.0.0 via ⬢ v13.11.0 took 1m24s 

node + fish shell

✦ ❯ node bench-json-str.js
fish: Job 2, 'node bench-json-str.js' terminated by signal SIGILL (Illegal instruction)
turbo-json on  master [!?] is 📦 v1.0.0 via ⬢ v13.11.0 

node + zsh shell

✦ ❯ zsh
dalisoft@dalisofts-MBP turbo-json % node bench-json-str.js 
zsh: illegal hardware instruction  node bench-json-str.js

node + bash

dalisoft@dalisofts-MBP turbo-json % bash

The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bash-3.2$ node bench-json-str.js 
Illegal instruction: 4

instructions

bash-3.2$ sysctl -a | grep machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: RDWRFSGS SMEP ERMS MDCLEAR IBRS STIBP L1DF SSBD

Env

luizperes commented 4 years ago

Hi @dalisoft, it seems like your machine doesn't have support to SIMD. When I check in my machine, I get something like:

> sysctl -a | grep machdep.cpu.leaf7_features
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 HLE AVX2 SMEP BMI2 
ERMS INVPCID RTM FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT MDCLEAR TSXFA IBRS 
STIBP L1DF SSBD

I think simdjson works best if you have SSE4.2 or AVX2 (it looks like your hardware doesn't include those?). I believe @lemire and the others are working on the AVX512 as far as I remember by glancing at some emails.

As for the crash, I think my bindings are not working 100%, since the expected behaviour is handling it gracefully rather than crashing. It should have compiled with the nosimd option, and because it's generating illegal instructions, I believe there is an edge case that I have to look. (again, it will fix the crash, but it seems like it won't use vector extensions, and I don't know if that will be any better than the default parser). @lemire might have a different opinion about that.

Thanks for opening this issue, I really appreciate the contribution!

dalisoft commented 4 years ago

Hi @luizperes

Yes, i think my machine doesn't support AVX2 and/or SSE4, but i expected falling back to JSON.parse as mentioned in README (maybe i'm wrong).

Thank you too for such great library, i want use this library on my library for improving JSON.parse performance, i hope you can fix handling and we can use this library.

I currently using turbo-json-parse, but library seems doesn't stable. If you update library, i can test library and also give benchmark results too how much performance differs.

luizperes commented 4 years ago

Hi @dalisoft,

Yes, you're completely right. That is exactly how it should work.

Thank you too for being enthusiastic in using it! Truth being told, I really only made it available for node, the hard work is done at the original repo :)

I already have an idea of how the crash is happening. If I were to guess, it is because of my config file, more precisely around here. I also discovered that there is an issue open for it (#3).

I don't have much time to work on it these next couple weeks (doing my masters final reports, reviewing papers and such). But I can totally take a look around mid-April. I will let you know once I fix it and that would also be nice if you could validate it (since we should test in a config like yours). Meanwhile, I don't know if you are any comfortable with node-gyp, but if you are and could help, I would really appreciate it too!

Thanks again!

luizperes commented 4 years ago

Hi @dalisoft, by reading #3 I realized that it shouldn't be hard at all! Am just waiting for the confirmation there from Lemire, and might be able to do it even tomorrow if it is what I am thinking!

dalisoft commented 4 years ago

@luizperes I tried node-gyp, but i not able to understand good to use. I have also other few projects which if i bind from c/c++ will be faster, but for now i can't. In future may be know.

Yes, hard work made by simdjson community/team, but for node.js community hard work made by you and community of simdjson-node did, so from node.js community all thanks goes to you.

About your time, yep, i understand, i also does not have time, while searching i found library and i choose your library from one of 3-4 libraries because your library does what i need and works stable yet (exclude current issue). I sure, after some times this project will get popularity and this library will become enough popular.

luizperes commented 4 years ago

Right!

Checking #3, Lemire said that simdjson would have a runtime dispatch (which we have now), so I think I won't need to touch node-gyp. Yet, I will need to be the one doing it, because I will have to update the library and change its structure.

Screen Shot 2020-03-30 at 2 49 44 AM

Will let you know once I do it! Cheers!

lemire commented 4 years ago

It is correct, the upcoming version 0.3 of simdjson will run basically everywhere.

I expect we will be releasing it in a few days.

cc @jkeiser

luizperes commented 4 years ago

Sounds good @lemire. I will be waiting for that then! I created a new issue to address that #24.

cc @dalisoft

luizperes commented 4 years ago

Hi @dalisoft, I updated the bindings to the latest version of simdjson. I think it should work now. Let me know how it goes. Cheers!

dalisoft commented 4 years ago

Hi @luizperes.

Yes, it works, but very slow compared to native JSON.parse on machines where AVX does not support.

Example of benchmark code:

bench.js

const turboJson = require("turbo-json-parse");
const simdJson = require("simdjson");

const bench = (name, fn) => {
  console.time(name);
  for (let i = 0; i < 200000; i++) {
    fn();
  }
  console.timeEnd(name);
};

const SCHEMA = {
  type: "object",
  properties: {
    foo: { type: "string" }
  }
};

const JSON_BUFF = `{
  "foo": "bar"
}`;

const TURBO_COMPILE = turboJson(SCHEMA, {
  fullMatch: true,
  validate: false,
  defaults: false,
  buffer: false
});

bench("json.parse", () => JSON.parse(JSON_BUFF));
bench("simdjson", () => simdJson.parse(JSON_BUFF));
bench("turbo.parse", () => TURBO_COMPILE(JSON_BUFF));

calling

$ node bench.js

results

json.parse: 190.99ms
simdjson: 829.617ms
turbo.parse: 81.068ms

~4-5 times slower, currently did not tested on AVX machines, i think in few days can test in AVX devices too

luizperes commented 4 years ago

Hi @dalisoft, this problem is described on issue #5. Also, please read the Documentation.md.

If you want to get its full power, use the function lazyParse.

Because your initial issue is fixed, I will close it. Feel free to open new issues in case you find any, cheers!

dalisoft commented 4 years ago

@luizperes Yes, current issue was fixed. Thanks. Great project and nice maintainer as you fixed issue in less than week :)