apache / datasketches-memory

High performance native memory access for Java.
https://datasketches.apache.org
Apache License 2.0
114 stars 27 forks source link

Method to check if two Memories have same bytes in range #36

Closed leventov closed 6 years ago

leventov commented 6 years ago

I cannot find a method to check that two ranges of bytes in two Memories are equal. compareTo() is inefficient because it compares data byte by byte.

leerho commented 6 years ago

What are you looking for? A method that uses longs, ints, and shorts automatically, based on byte alignment.

leventov commented 6 years ago

A-la ByteBuffer.equals(), but allowing to specify ranges, as well as current Memory.compareTo(). Byte order is ignored.

BTW implementing Memory.equals(Object) (overriding Object.equals()), in the same manner as ByteBuffer, is useful too.

leerho commented 6 years ago

compareTo() is inefficient because it compares data byte by byte.

Inefficient compared to what? The ByteBuffer equals() and compareTo() both check byte by byte. The only difference is the equals(Object) checks Object == and instanceOf before the doing the byte by byte.

Let me more clear:

I was thinking about a more sophisticated check by reading longs when possible rather than just bytes. Byte-order is irrelevant. For large regions this would ultimately be faster. For very short regions we could fall back to checking byte by byte.

For example, the core logic called by the public API would be something similar to the following:

  @Test
  public void checkEquals() {
    int len = 127;
    WritableMemory wmem1 = WritableMemory.allocate(len);
    WritableMemory wmem2 = WritableMemory.allocate(len);
    for (int i = 0; i < len; i++) {
      wmem1.putByte(i, (byte) i);
      wmem2.putByte(i, (byte) i);
    }

    assertTrue(equals(wmem1, 0, wmem2, 0, len));
    assertTrue(equals(wmem1, 1, wmem2, 1, len - 1));
  }

  static final boolean equals(Memory mem1, long mem1off, Memory mem2, long mem2off, long lenBytes) {
    if ((mem1 == mem2) && (mem1off == mem2off)) { return true; }
    if (lenBytes < 8) { //adjust this from characterization testing
      return equalsByBytes(mem1, mem1off, mem2, mem2off, lenBytes);
    }
    long longs = lenBytes >>> 3;
    for (long i = 0L; i < longs; i++) {
      if (mem1.getLong(mem1off + (i << 3)) == mem2.getLong(mem2off + (i << 3))) { continue; }
      else { return false; }
    }
    long longBytes = longs << 3;
    long remBytes = lenBytes - longBytes;
    return (remBytes == 0) ? true
        : equalsByBytes(mem1, mem1off + longBytes, mem2, mem2off + longBytes, remBytes);
  }

  private static final boolean equalsByBytes(Memory mem1, long mem1off, Memory mem2, long mem2off,
      long lenBytes) {
    for (long i = 0; i < lenBytes; i++) {
      if (mem1.getByte(mem1off + i) == mem2.getByte(mem2off + i)) { continue; }
      else { return false; }
    }
    return true;
  }
leventov commented 6 years ago

Yes, I meant inefficient because it consumes 1 bytes at a time instead of 8 bytes. In case of ByteBuffer, however, I'm almost sure that this method is vectorized.

One note about the code: as a rule of thump please avoid long-indexed loops in the code that has potential to be vectorized (or at least unrolled), because JVM's safepoint polls kill that. See https://github.com/DataSketches/memory/pull/16/commits/f642340cee6b9aec35563070a82708e906040dd3#diff-1bd6955f79e86fe612e643c0ff3fd7a9R81

leventov commented 6 years ago

@leerho you may find this interesting: https://bugs.openjdk.java.net/browse/JDK-8193085. What I assumed is already implemented in OpenJDK, actually targets JDK 11.

leerho commented 6 years ago

Impressive work. However, all these support methods seem very attached to a limited view of reading/writing/comparing of buffer regions that are integer multiples of the specified type from the start of the buffer. For example this little snippet:

    public static int mismatch(long[] a, int aFromIndex,
                               long[] b, int bFromIndex,
                               int length) {
        if (length == 0) {
            return -1;
        }
        int aOffset = Unsafe.ARRAY_LONG_BASE_OFFSET + (aFromIndex << LOG2_ARRAY_LONG_INDEX_SCALE);
        int bOffset = Unsafe.ARRAY_LONG_BASE_OFFSET + (bFromIndex << LOG2_ARRAY_LONG_INDEX_SCALE);
        int i = vectorizedMismatch(
                a, aOffset,
                b, bOffset,
                length, LOG2_ARRAY_LONG_INDEX_SCALE);
        return i >= 0 ? i : -1;
    }

The real irony, is that apparently the intrinsic library vectorizeMismatch can accommodate non-aligned data, but the code above is rigidly tied to type-alignment to the start of the buffer.

If your data structure is {int, {long, ... , long}, int, {long, ... , long}, etc } you are S*** Out-Of-Luck!

What I would like to see is a major redesign of ByteBuffer to allow get/put of primitives and primitive arrays at arbitrary byte offsets. I don't think they need to change any of the current API, just add new capabilities.