lemire / simdcomp

A simple C library for compressing lists of integers using binary packing
BSD 3-Clause "New" or "Revised" License
488 stars 53 forks source link

Javascript version #13

Open wshager opened 8 years ago

wshager commented 8 years ago

It would be interesting to see if a javascript version of this library could be created using SIMD.js.

https://developer.mozilla.org/nl/docs/Web/JavaScript/Reference/Global_Objects/SIMD https://github.com/tc39/ecmascript_simd https://github.com/jonbrennecke/node-simd

lemire commented 8 years ago

Yes, this looks possible now.

wshager commented 8 years ago

The Emscripten compilation of this lib also look promising, but I don't know what a dynamic call from javascript would look like. Emscripten generates a lot of constants and heap allocations.

lemire commented 8 years ago

@wshager

I am somewhat pessimistic regarding any approach that does not translate into SIMD instructions at the machine level.

wshager commented 8 years ago

Please note that SIMD is currently in the crosswalk V8 fork: https://github.com/crosswalk-project/v8-crosswalk

wshager commented 8 years ago

Hmm perhaps you are right, nevermind. I can't find it in the repo. It may be too premature, except maybe for Firefox Nightly.

wshager commented 8 years ago

Continuing from #10, the performance of the first example compiled to javascript with Emscripten performs well, but performance of other 2 tests is extremely poor. It seems SIMD.js is nowhere near being compatible with this project.

lemire commented 8 years ago

@wshager

Can you tell us which examples fare well and which ones fare badly?

wshager commented 8 years ago

When I compiled previously with my smmintrin shim the first two test would run, but simple_demo would hang, currently the second example didn't finish in any acceptable time, only the first compress_decompress example finishes in ff nightly.

lemire commented 8 years ago

What happens after this last commit https://github.com/lemire/simdcomp/commit/b7e1ecca96ebb5290c52e3738fb844e2436ffa74

?

wshager commented 8 years ago

@lemire yes, that script does finish, in about 8 seconds in FF nightly on my i7-3612QM (8GB).

lemire commented 8 years ago

Ok. Can you share the resulting output (just so we can document it).

wshager commented 8 years ago

@lemire sorry, I did, but I replied to the commit. Here goes: http://lagua.nl/uploadfiles/example.html http://lagua.nl/uploadfiles/example.uncompressed.html

wshager commented 8 years ago

example.js.txt example.uncompressed.js.txt

lemire commented 8 years ago

@wshager

Thanks, but I am inviting you to publish the performance results you got.

My concern right now is whether SIMD instructions are used at all. On my laptop, the script completes, but very slowly and with speeds that indicate that SIMD instructions are not used.

wshager commented 8 years ago

_compress_decompress_demo: 10-50 ms _varying_bit_width_demo: 0-10 ms _simple_demo: 4600-7400 ms

lemire commented 8 years ago

@wshager Do you have the screen output where it compares the memcpy speed with the decoding speed?

On my Firefox, things just freeze, so it is no use. On Google Chrome, I do not even get 1 million int. per second.

wshager commented 8 years ago

Ah sorry I misunderstood. Here it is:

== simple test Code works! == varying bit-width demo encoded size: 242 (original size: 1024) Code works! == simple demo

gap = 1 compression ratio = 30.117647 decoding speed in million of integers per second 1.159420 memcpy speed in million of integers per second 320.000000 ignore me -589677036 All tests are in CPU cache. Avoid out-of-cache decoding in applications.

gap = 3 compression ratio = 15.515152 decoding speed in million of integers per second 1.154193 memcpy speed in million of integers per second 640.000000 ignore me -1852397360 All tests are in CPU cache. Avoid out-of-cache decoding in applications.

gap = 9 compression ratio = 7.876923 decoding speed in million of integers per second 1.176471 memcpy speed in million of integers per second 640.000000 ignore me -1604405666 All tests are in CPU cache. Avoid out-of-cache decoding in applications.

gap = 27 compression ratio = 6.320988 decoding speed in million of integers per second 1.118881 memcpy speed in million of integers per second 640.000000 ignore me -848559680 All tests are in CPU cache. Avoid out-of-cache decoding in applications.

gap = 81 compression ratio = 4.530973 decoding speed in million of integers per second 1.072925 memcpy speed in million of integers per second 213.333333 ignore me -893214422 All tests are in CPU cache. Avoid out-of-cache decoding in applications.

gap = 243 compression ratio = 3.968992 decoding speed in million of integers per second 1.076535 memcpy speed in million of integers per second 640.000000 ignore me -1238753396 All tests are in CPU cache. Avoid out-of-cache decoding in applications.

wshager commented 8 years ago

@lemire did you install nightly? https://nightly.mozilla.org

lemire commented 8 years ago

@wshager

These results are better than what I get with Google Chrome but still not close to what one would want. These results are thousands of times slower than C. That's not surprising given that it is JavaScript... but still...

wshager commented 8 years ago

@lemire agreed, too bad. Perhaps the generated js isn't optimal.

lemire commented 8 years ago

@wshager Yes, manual inspection of the generated code might be telling.

wshager commented 8 years ago

@lemire the abstractions in SIMD.js may be clearer, but the downside is that there's no way to convert the code back to pure functions... unless I take on the quixotic task of rewriting SIMD.js to pure functions.

wshager commented 8 years ago

Currently the function mapping is as follows: var SIMD_Int8x16_add=SIMD_Int8x16.add; var SIMD_Int8x16_sub=SIMD_Int8x16.sub; var SIMD_Int8x16_neg=SIMD_Int8x16.neg; var SIMD_Int8x16_mul=SIMD_Int8x16.mul; var SIMD_Int8x16_equal=SIMD_Int8x16.equal; var SIMD_Int8x16_lessThan=SIMD_Int8x16.lessThan; var SIMD_Int8x16_greaterThan=SIMD_Int8x16.greaterThan; etc.

Anyway, the generated code is quite unreadable, because each transformation is assigned to a variable (which may also impede performance).

lemire commented 8 years ago

No matter. I think it is useful and important to experiment, even when the results are underwhelming.

wshager commented 8 years ago

I've created an interface from (client-side) javascript to the simdcomp library using Emscripten. What I fail to understand is how to decompress a Uint8Array.

var maxbits_length = Module.cwrap('maxbits_length', 'number', ['number','number']);
var simdpack_length = Module.cwrap('simdpack_length', 'number', ['number','number','number','number']);
var simdunpack_length = Module.cwrap('simdunpack_length', 'number', ['number','number','number','number']);

var n = 128;

var data = new Uint32Array(128);

for (var k = 0; k < n; k++){
    data[k] = k+3;
}
//var n = data.length;
// Get data byte size, allocate memory on Emscripten heap, and get pointer
var nDataBytes = n * data.BYTES_PER_ELEMENT;
var dataPtr = Module._malloc(nDataBytes);

// Copy data to Emscripten heap
var dataHeap = new Uint8Array(Module.HEAPU8.buffer, dataPtr, nDataBytes);
dataHeap.set( new Uint8Array(data.buffer) );

// Call the function by passing a number pointing to the byte location of
// the array of pointers on the Emscripten heap.  Emscripten knows what to do!
var b = maxbits_length(dataHeap.byteOffset, n);

var nOutBytes = nDataBytes + n / 128;
var outPtr = Module._malloc(nOutBytes);

simdpack_length(dataHeap.byteOffset, n ,outPtr,b);

var outHeap = new Uint8Array(Module.HEAPU8.buffer, outPtr, nOutBytes);

//var out = new Uint8Array(outHeap.buffer, outHeap.byteOffset, n + n);

var backPtr = Module._malloc(nDataBytes);

simdunpack_length(outHeap.byteOffset, n , backPtr, b);

var backHeap = new Uint8Array(Module.HEAPU8.buffer, backPtr, nDataBytes);

var back = new Uint32Array(backHeap.buffer, backHeap.byteOffset, n);

console.log(back);

// Free memory
Module._free(dataHeap.byteOffset);
Module._free(outHeap.byteOffset);
Module._free(backHeap.byteOffset);
lemire commented 8 years ago

@wshager I am not very familiar with Emscripten.

Do you know how to achieve what you are trying to achieve in C? If you do then, presumably, it is simply a matter of translating the appropriate C code.

wshager commented 8 years ago

@lemire following the updated example code, the serialized compressed array now successfully decompresses. What I was wondering is that in client-side javascript, providing a license is somewhat complicated, as the code is compressed obfuscated. How would you like me to ameliorate this?

lemire commented 8 years ago

@wshager

This C code falls under the BSD License. If you want to repackage it as JavaScript and redistribute it, you can use BSD as well.

lemire commented 8 years ago

@wshager

Though it is not a port of this library per se, the following JavaScript library might be of interest...

https://github.com/lemire/FastIntegerCompression.js

wshager commented 8 years ago

@lemire great! I'll check it out, it will no doubt be very useful until we have full SIMD support.

skibulk commented 5 years ago

example.uncompressed.js.txt

3 years later and all of the examples return "speed in million of integers per second 0.184491" or less in Chrome on my system!

I added the following code inside the _simple_demo function at line 7536. Any idea why _simdpackwithoutmask returns undefined?

var data = [29,70,8,3,4,1,22,3,4,5,2,1,5,1,1,7,1,2,1,1,2,3,1,4,8,1,10,2,1,3,4,44,54,1,3,3,3,2,6,3,2,1,1,1,8,1,2,4,11,1,2,9,3,4,5,1,6,6,6,2,10,55,35,1,8,2,2,3,4,1,1,3,4,5,7,3,1,2,4,2,2,5,2,9,1,2,1,3,3,9,0,44,61,3,4,11,1,1,3,3,2,5,5,1,5,2,1,2,4,4,1,3,5,1,4,4,4,5,1,9,51,32,19,1,3,5,1,6,1,1,2,1,8,11,6,2,3,2,3,1,3,2,3,3,3,3,6,6,6,3];
console.log(_simdpackwithoutmask(data, new ArrayBuffer(8), _maxbits(data)));