Sable / dls18-ostrich

Experimental results for the second version of ostrich in the context of ubiquitous Javascript
3 stars 2 forks source link

Inside: crc in Firefox on Samsung tablet #4

Open wukefe opened 6 years ago

wukefe commented 6 years ago

Objectives

Try to explain why crc in Firefox on Samsung tablet is so slow.

The core computation of crc (Link to source code)

    crc = crc32Lookup[7 * 256 + (one & 0xFF)] ^
      crc32Lookup[6 * 256 + ((one >>> 8) & 0xFF)] ^
      crc32Lookup[5 * 256 + ((one >>> 16) & 0xFF)] ^
      crc32Lookup[4 * 256 + (one >>> 24)] ^
      crc32Lookup[3 * 256 + (two & 0xFF)] ^
      crc32Lookup[2 * 256 + ((two >>> 8) & 0xFF)] ^
      crc32Lookup[1 * 256 + ((two >>> 16) & 0xFF)] ^
      crc32Lookup[two >>> 24]

Two versions of crc

Note: crc32Lookup is a fixed-length table storing hex numbers.

Summary

Performance: original crc

Platform Chrome (s) Firefox (s) Ratio
Laptop 2.30 3.46 1.50x
Tablet 5.30 51.66 9.75x

Performance: modified crc

Platform Chrome (s) Firefox (s) Ratio
Laptop 2.33 1.81 0.78x
Tablet 5.97 6.50 1.09x

Details of the experiments can be found in the following sections.

Observations and explanations

Original crc

Warm-up: Experiments on Laptop (MacOS)

Setup

Results

| Platform | Chrome  | Firefox |
| -------- | ------- | ------- |
| Mean (s) | 2.30    | 3.46    |  <- ratio: 1.50x

Verification

Experiments on Samsung Tablet (Android)

Setup

Results

| Platform | Chrome  | Firefox |
| -------- | ------- | ------- |
| Mean (s) |  5.30   |  51.66  | <- ratio: 9.75x

Modified crc

The same configurations as original crc has.

Warm-up: Experiments on Laptop (MacOS)

Results

| Platform | Chrome  | Firefox |
| -------- | ------- | ------- |
| Mean (s) |  2.33   |  1.81   | <- ratio: 0.78  //Firefox is faster!

Experiments on Samsung Tablet (Android)

Results

| Platform | Chrome  | Firefox |
| -------- | ------- | ------- |
| Mean (s) |  5.97   |  6.50   | <- ratio: 1.09
wukefe commented 6 years ago

Validation with a simple JS program

Overview

Step 0: define computation function testInt

function testInt(input){
    var value = []
    for(var i=0; i<input.length; i++){
        var one = input[i]
        var bytes = []
        bytes.push(one & 0xFF)
        bytes.push((one >>> 8) & 0xFF)
        bytes.push((one >>> 16) & 0xFF)
        bytes.push((one >>> 24) & 0xFF)
        value.push(bytes)
    }
    return value
}

Step 1: define verifier checkInt

function checkInt(a, b){
  if(a.length != b.length) return false;
  for(var i in a.length){
    var ai = a[i], bi = b[i];
    if(ai instanceof Array && bi instanceof Array){
      return checkInt(ai, bi);
    }
    else if(ai != bi) return false;
  }
  return true;
}

Step 2: provide input data

var data = [0x2C8E0FFF, 0xE0240F61, 0x6EAB0882, 0xA201081C]
var inputU32 = new Uint32Array(data)
var inputI32 = new Uint32Array(data)

var resultU32 = testInt(inputU32);
var resultI32 = testInt(inputI32);

console.log("input data is " + data);

Step 3: output 'The same!' if testInt returns the same result with different types of input

if (checkInt(resultU32, resultI32)){
  console.log("The same!")
}
else {
  console.log(resultU32)
  console.log(resultI32)
  console.log("Sorry, not the same!")
}

Final Result

'The same!'
wukefe commented 6 years ago

Further Explanation

With bit-wise operations, the signed or unsigned type is no difference between them since all 32 bits are treated individually.

For example, 0xA201081C in bits as follows:

1010 0010 0000 0001 0000 1000 0001 1100
--------- --------- --------- ---------
1st        2nd        3rd        4th
wukefe commented 6 years ago

crc in Steps

crc = crc32Lookup[7 * 256]                          // step 0: baseline

crc = crc32Lookup[7 * 256 + (one & 0xFF)]           // step 1: add bitwise and '&'

crc = crc32Lookup[6 * 256 + ((one >>> 8) & 0xFF)]   // step 2: add right shift '>>>'

crc = crc32Lookup[7 * 256 + (one & 0xFF)] ^         // step 3: merge step 1 & 2
      crc32Lookup[6 * 256 + ((one >>> 8) & 0xFF)]

Experimental results

Steps Original crc Modified crc Comment
0 1.51s 1.49s baseline
1 10.94s 3.61s add bitwise and '&'
2 10.74s 3.72s add right shift '>>>'
3 19.37s 4.34s merge step 1 and 2