sidorares / node-mysql2

:zap: fast mysqljs/mysql compatible mysql driver for node.js
https://sidorares.github.io/node-mysql2/
MIT License

WebAssembly based parser #335

Open sidorares opened 8 years ago

sidorares commented 8 years ago

WebAssembly is already available (behind a flag) in Node 6, so it might be worth trying as the next level of the "userspace JIT" approach.

Some overview: https://ia601503.us.archive.org/32/items/vmss16/titzer.pdf

Working wasm examples (run with node --expose-wasm):

function fib(stdlib, foreign, heap) {
  "use asm";

  var i32 = new stdlib.Int32Array(heap);
  var f64 = new stdlib.Float64Array(heap);
  var imul = stdlib.Math.imul;

  function fib(n) {
    n = n|0;
    if (n >>> 0 < 3) {
      return 1|0;
    }
    return (fib((n-1)|0) + fib((n-2)|0))|0;
  }

  return {
    fib:fib
  };
}

var m = _WASMEXP_.instantiateModuleFromAsm(fib.toString());
var asmFib = m.fib;

var n = 38;
console.time("ASM:fib(" + n + ")");
var f = asmFib(n);
console.log(f);
console.timeEnd("ASM:fib(" + n + ")");

binary wasm:

https://gist.github.com/sidorares/90607f73b499f2ccb7dd908a080ebe5d
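For reference, here is a minimal sketch of loading a hand-written binary module through the standardized WebAssembly JavaScript API. It assumes a Node version that ships that API unflagged, which postdates the _WASMEXP_ object used above. The module bytes and the exported name add are illustrative and are not taken from the gist.

```javascript
// Hand-assembled wasm module exporting add(a, b) -> a + b on i32 values.
// Illustrative only: these bytes are NOT the contents of the linked gist.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // type section: one signature, (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // function section: one function using signature 0
  0x03, 0x02, 0x01, 0x00,
  // export section: export function 0 under the name "add"
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // code section: local.get 0, local.get 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Synchronous compile + instantiate; acceptable for tiny modules.
const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const add = instance.exports.add;

console.log(add(2, 3)); // 5
```

A driver that generates wasm at runtime would build a byte array like this one from the column definitions instead of hard-coding it.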

Problems:

sidorares commented 8 years ago

very good read on generating wasm: https://github.com/zbjornson/human-asmjs

sidorares commented 7 years ago

Update for Node 6.x (tested with 6.9.1): _WASMEXP_ is now called Wasm (not sure if that can be overridden).

sidorares commented 7 years ago

https://github.com/reklatsmasters/webassembly-examples

reklatsmasters commented 7 years ago

Thanks for the link to my examples! I think it's possible to use MySQL's internal yacc-based SQL parser (sql/sql_yacc.yy). However, it would be quite difficult.

sidorares commented 7 years ago

@reklatsmasters thanks for your work! At the moment I think it's better to generate raw wasm on the fly rather than trying to compile C code.

reklatsmasters commented 7 years ago

@sidorares Hm, interesting. Do you want to use the current JS-based parser compiled to wasm? What does "generate raw wasm on the fly" mean?

sidorares commented 7 years ago

no, a bit different

The MySQL protocol looks like this:

client: "SELECT foo,bar from FOOBAR"
server: "OK!"
server: "This is what I have"
server: "foo:string"
server: "bar:int"
server: "now data:"
server: "foo value, bar value"
server: "foo value, bar value"
...
...
server: "foo value, bar value"
server: "done"

What I'm currently doing: once the schema is known (just before the "now data" part), a JS function is generated that is optimised for deserialising data of only that particular shape from raw packets (i.e. read a string, then read an int).
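A minimal sketch of that idea, not the driver's actual code. The wire format here is a deliberately simplified assumption (1-byte length-prefixed strings and 4-byte little-endian ints), and compileRowParser and readers are hypothetical names. It shows a parser being generated as source text for one result-set shape and compiled once with new Function.

```javascript
// Simplified, hypothetical wire format (NOT the real MySQL row encoding):
//   string column: 1-byte length prefix, then that many UTF-8 bytes
//   int column:    4-byte little-endian signed integer
// Each entry returns the source snippet that reads one column.
const readers = {
  string: (name) =>
    `len = buf[pos]; pos += 1;\n` +
    `  row[${JSON.stringify(name)}] = buf.toString('utf8', pos, pos + len); pos += len;`,
  int: (name) =>
    `row[${JSON.stringify(name)}] = buf.readInt32LE(pos); pos += 4;`,
};

// Build the source of a parser specialised for one schema,
// then compile it a single time.
function compileRowParser(columns) {
  const body = columns.map((c) => '  ' + readers[c.type](c.name)).join('\n');
  const source =
    `  let pos = 0, len = 0;\n  const row = {};\n${body}\n  return row;`;
  return new Function('buf', source);
}

// Schema as announced by the server before the data rows.
const parseRow = compileRowParser([
  { name: 'foo', type: 'string' },
  { name: 'bar', type: 'int' },
]);

// One encoded row: foo = "hi", bar = 42.
const packet = Buffer.from([2, 0x68, 0x69, 42, 0, 0, 0]);
console.log(parseRow(packet)); // { foo: 'hi', bar: 42 }
```

The generated function contains no loop over the schema and no type checks, so each row is parsed by straight-line code.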

I want to try to implement that part ("generate a deserialiser" function at runtime) in wasm.

Any help would be really appreciated!

reklatsmasters commented 6 years ago

I want to try to implement that part ("generate a deserialiser" function at runtime) in wasm.

In that case you would have to call JS functions from wasm to parse the incoming message (call packet.parseEtc()), which can be quite slow. I think the parser of the incoming message, the generator of the deserialiser, and the deserialiser itself should all be implemented in wasm. It might look like this:

// Example in C

typedef enum {
    INT,
    FLOAT,
    // ...
} FieldType;

typedef struct Schema {
    // ...
} Schema;

// Parse one incoming message according to the given schema.
void parse_message(const char* buffer, int size, const Schema* schema) { /* ... */ }

Schema* define_schema(void) { /* ... */ }

// Mutates the schema, so the pointer is not const.
void append_field(Schema* schema, FieldType field) { /* ... */ }

sidorares commented 6 years ago

call packet.parseEtc()

All this parseXXX code could also be inlined into the resulting wasm.

Input: schema description, followed by stream of binary data matching schema

Output: a wasm function that calls an external JS function for each received row with all data deserialised, leaving a minimal amount of extra work for JS to do. Since wasm does not have objects or strings, the closest we can have is an ArrayBuffer containing JSON with the result.

reklatsmasters commented 6 years ago

inlined into the resulting wasm

If it means 'implemented in wasm', I vote "yes".

that calls an external JS function

It's necessary to minimise external calls. In the best case, remove them all.

wasm does not have objects or strings

We can interpret a part of wasm memory as a string:

Also, we can read null-terminated strings.
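Both approaches can be sketched from the JS side as follows. This is an illustration under assumptions: a standalone WebAssembly.Memory stands in for a module's memory, and readString and readCString are hypothetical helper names.

```javascript
// A standalone memory stands in for the memory of a wasm instance.
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page, zero-filled
// Note: this view must be recreated if the memory is ever grown.
const heap = new Uint8Array(memory.buffer);
const decoder = new TextDecoder('utf-8');

// Interpret `len` bytes starting at `ptr` as a UTF-8 string.
function readString(ptr, len) {
  return decoder.decode(heap.subarray(ptr, ptr + len));
}

// Read a C-style string: scan forward from `ptr` to the first 0 byte.
function readCString(ptr) {
  let end = heap.indexOf(0, ptr);
  if (end === -1) end = heap.length; // no terminator found: read to the end
  return decoder.decode(heap.subarray(ptr, end));
}

// Pretend wasm wrote "foo value" at offset 16; the following byte is still 0.
heap.set(new TextEncoder().encode('foo value'), 16);

console.log(readString(16, 3)); // foo
console.log(readCString(16));   // foo value
```

The pointer-plus-length form avoids the scan for a terminator and also works for values that contain zero bytes.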

sidorares commented 6 years ago

It would be good to establish an initial benchmark as a first step: manually generate a parser for some predefined schema and compare its speed with the JS parser. Can you help with this, @reklatsmasters?
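One possible shape for such a benchmark harness, as a sketch only. The row format (1-byte length-prefixed string followed by a 4-byte little-endian int) and all function names are assumptions made for illustration, and both parsers are plain JS; a wasm candidate would be plugged in as a third function.

```javascript
// Hypothetical predefined schema and row format:
//   [1-byte length][UTF-8 bytes of foo][int32 LE value of bar]
const schema = [
  { name: 'foo', type: 'string' },
  { name: 'bar', type: 'int' },
];

// Generic parser: walks the schema for every row.
function parseGeneric(buf) {
  const row = {};
  let pos = 0;
  for (const col of schema) {
    if (col.type === 'string') {
      const len = buf[pos];
      row[col.name] = buf.toString('utf8', pos + 1, pos + 1 + len);
      pos += 1 + len;
    } else {
      row[col.name] = buf.readInt32LE(pos);
      pos += 4;
    }
  }
  return row;
}

// Hand-specialised parser for this one schema (what a generator would emit).
function parseSpecialised(buf) {
  const len = buf[0];
  return { foo: buf.toString('utf8', 1, 1 + len), bar: buf.readInt32LE(1 + len) };
}

// Time `iterations` calls of `fn` on the same packet and report milliseconds.
function bench(label, fn, packet, iterations) {
  const start = process.hrtime.bigint();
  let last;
  for (let i = 0; i < iterations; i++) last = fn(packet);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(1)} ms for ${iterations} rows`);
  return last; // returned so callers can check both parsers agree
}

const packet = Buffer.from([2, 0x68, 0x69, 42, 0, 0, 0]); // foo = "hi", bar = 42
bench('generic', parseGeneric, packet, 1e6);
bench('specialised', parseSpecialised, packet, 1e6);
```

Timings from a loop like this are only indicative: the JIT warms up during the run, so alternating the order of the candidates and repeating the measurement gives a fairer comparison.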