square / wire

gRPC and protocol buffers for Android, Kotlin, Swift and Java.
https://square.github.io/wire/
Apache License 2.0

support decoding bytes as a hex string somehow #958

Open codefromthecrypt opened 5 years ago

codefromthecrypt commented 5 years ago

While this is an edge case, Zipkin's primary representation of bytes is hex (even though they are transmitted in proto as raw bytes). It would help performance to be able to parse a bytes field directly into a hex string (and likely the other way, too).

Example:

  // stolen from okio
  private val HEX_DIGITS =
      charArrayOf('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f')

  @Suppress("NOTHING_TO_INLINE")
  internal inline fun hex(data: BufferedSource, byteCount: Int): String {
    val result = CharArray(byteCount * 2)
    var i = 0
    while (i < result.size) {
      val b = data.readByte().toInt()
      result[i++] = HEX_DIGITS[b shr 4 and 0xf]
      result[i++] = HEX_DIGITS[b       and 0xf] // ktlint-disable no-multi-spaces
    }
    return String(result)
  }

  /**
   * Reads a `bytes` field value from the stream as a lower-hex string. The length is read from the
   * stream prior to the actual data.
   */
  @Throws(IOException::class)
  fun readBytesAsHex(): String {
    val byteCount = beforeLengthDelimitedScalar()
    source.require(byteCount) // Throws EOFException if insufficient bytes are available.
    return hex(source, byteCount.toInt())
  }
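
The snippet above only covers reading. The "other way" mentioned at the top (turning a hex string back into raw bytes before writing) could look roughly like the sketch below. It uses only the JVM standard library; `hexToBytes` is a hypothetical name for illustration, not an existing Wire or Okio function.

```kotlin
/**
 * Hypothetical helper (not part of Wire or Okio): parses a hex string into raw bytes.
 * Accepts both lower- and upper-case digits.
 */
fun hexToBytes(hex: String): ByteArray {
  require(hex.length % 2 == 0) { "hex string must have an even length" }
  val result = ByteArray(hex.length / 2)
  for (i in result.indices) {
    // Character.digit returns -1 when the character is not a valid base-16 digit.
    val hi = Character.digit(hex[2 * i], 16)
    val lo = Character.digit(hex[2 * i + 1], 16)
    require(hi >= 0 && lo >= 0) { "invalid hex digit near index ${2 * i}" }
    result[i] = ((hi shl 4) or lo).toByte()
  }
  return result
}
```

Like the reading side, this allocates one array of the final size and nothing else, which is the property the request is after.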

JakeWharton commented 5 years ago

You could write a Source and/or Sink that did this on-the-fly without Wire ever needing to know about it. Then it's just about wrapping your inputs/outputs as they make their way to Wire.

JakeWharton commented 5 years ago

This is basically what the Gzip and Deflate wrappers are doing: on-the-fly transcoding. Hex is, thankfully, a much simpler encoding so it should be much simpler to do.
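
For readers unfamiliar with the wrapper idea, here is a minimal sketch of on-the-fly transcoding of a whole stream. It assumes a plain `java.io.InputStream` as a stand-in for an Okio Source so that it runs with the standard library alone; `HexEncodingInputStream` is a hypothetical class, not something Wire or Okio ships.

```kotlin
import java.io.InputStream

private const val HEX = "0123456789abcdef"

/**
 * Hypothetical wrapper: every raw byte read from [upstream] is exposed as two
 * ASCII lower-hex characters. Stand-in for an Okio Source wrapper.
 */
class HexEncodingInputStream(private val upstream: InputStream) : InputStream() {
  // Holds the second hex character of the current byte, or -1 if none is pending.
  private var pendingLow = -1

  override fun read(): Int {
    if (pendingLow != -1) {
      val c = pendingLow
      pendingLow = -1
      return c
    }
    val b = upstream.read() // 0..255, or -1 at end of stream
    if (b == -1) return -1
    pendingLow = HEX[b and 0xf].code
    return HEX[b shr 4].code
  }

  override fun close() = upstream.close()
}
```

Wrapping a `ByteArrayInputStream` over the bytes `0x0a 0xff` and reading it to the end yields the ASCII text `0aff`. Note that a wrapper like this transcodes the entire stream, not one field inside a message.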

codefromthecrypt commented 5 years ago

So I guess where I was at was that I didn't want to interfere with the bookkeeping going on in ProtoReader (it keeps track of how many bytes, etc.). Are you saying you would make a proto-aware source that looks at encoded fields to figure out which ones are bytes and should become hex?

codefromthecrypt commented 5 years ago

I'm referring to this, which I couldn't find a sensible way to work around without invalidating the reader (doing stream parsing): https://github.com/square/wire/blob/master/wire-runtime/src/jvmMain/kotlin/com/squareup/wire/ProtoReader.kt#L371

JakeWharton commented 5 years ago

Oh so you want an individual field to do hex-to-byte conversion? Is it represented as a string or as bytes in the encoded form?


codefromthecrypt commented 5 years ago

Yep. In proto it is bytes; everywhere else (Java type, correlation and JSON encoding) it is hex.

codefromthecrypt commented 5 years ago

Incidentally, there's a slightly related problem in the same data type: IPs are bytes in the proto, but string literals everywhere else. Ideally there would be some callback to process the bytes one by one, but at any rate having the hex conversion for IDs would already be a good improvement.
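
To illustrate the IP case, the conversion in question can be sketched with the JVM standard library as below. `ipBytesToString` is a hypothetical helper name, and this version works on a byte array that has already been read, so it shows the mapping only and does not provide the byte-by-byte callback asked for.

```kotlin
import java.net.InetAddress

/**
 * Hypothetical helper: renders 4 (IPv4) or 16 (IPv6) raw address bytes as a
 * string literal. Throws UnknownHostException for any other length.
 */
fun ipBytesToString(bytes: ByteArray): String =
  InetAddress.getByAddress(bytes).hostAddress
```

For example, the four bytes `127, 0, 0, 1` come back as `127.0.0.1`.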

JakeWharton commented 5 years ago

There's an undocumented concept called adapters which should be able to solve this. Perhaps Jesse can chime in since he wrote it. Otherwise I'll circle back soon.
