Open rossabaker opened 2 years ago
I think there are a few angles:

- Generalizing the input beyond `String` (e.g. `CharSequence` or even a generic type). I think we can, but I assume this may cost performance (due to loss of inlining) and possibly break compatibility.

I could imagine a way to be mostly, if not entirely, source compatible by doing something like having:

    abstract class ParserModule {
      type Input
      protected def ... // all the core Input related code here
      // all current implementation here
      class Parser0[A] {
        ....
      }
      class Parser[A] extends Parser0[A] {
        ...
      }
      object Parser {
        ...
      }
    }

    package cats
    object parse extends ParserModule {
      type Input = String
      def ... // implement some string specific code here
    }

Then you can do:

    package cats.parse
    object charseq extends ParserModule {
      type Input = CharSequence
      ...
    }

If we did this, and could get the tests to compile without any changes and performance within a few percent, I think it would be worth publishing a new version `0.4.x` that breaks binary compatibility.
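
To make the shape of that proposal concrete, here is a minimal, self-contained sketch of the abstract-type-member approach. Everything in it is hypothetical: `inputLength`, `inputCharAt`, `charWhere`, `StringParsers`, and `CharSeqParsers` are illustrative names, and the `Option`-based `Parser0` is a toy stand-in, not the cats-parse implementation with its mutable state.

```scala
// Hypothetical sketch: none of these names are part of the cats-parse API.
abstract class ParserModule {
  type Input

  // The only input-specific operations this toy core needs.
  protected def inputLength(in: Input): Int
  protected def inputCharAt(in: Input, i: Int): Char

  // A toy parser: returns the new offset and the parsed value, or None on failure.
  abstract class Parser0[+A] {
    def parseAt(in: Input, offset: Int): Option[(Int, A)]
    final def parse(in: Input): Option[(Int, A)] = parseAt(in, 0)
  }

  // Matches a single character that satisfies the predicate.
  def charWhere(p: Char => Boolean): Parser0[Char] =
    new Parser0[Char] {
      def parseAt(in: Input, offset: Int): Option[(Int, Char)] =
        if (offset < inputLength(in) && p(inputCharAt(in, offset)))
          Some((offset + 1, inputCharAt(in, offset)))
        else None
    }
}

// String-specialized module, playing the role of `object parse` above.
object StringParsers extends ParserModule {
  type Input = String
  protected def inputLength(in: String): Int = in.length()
  protected def inputCharAt(in: String, i: Int): Char = in.charAt(i)
}

// CharSequence-specialized module, playing the role of `object charseq` above.
object CharSeqParsers extends ParserModule {
  type Input = CharSequence
  protected def inputLength(in: CharSequence): Int = in.length()
  protected def inputCharAt(in: CharSequence, i: Int): Char = in.charAt(i)
}
```

Each module fixes `Input` once, so user code written against `StringParsers` never mentions the abstraction. Whether the JIT still inlines the calls routed through the abstract methods is exactly the performance question raised above, and this sketch does not answer it.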
Has any thought been given to abstracting over the input type? I'm thinking specifically of binary inputs like `Array[Byte]`, `fs2.Chunk`, `scodec.bits.ByteVector`, or `java.nio.ByteBuffer`. I'm struggling to compete with an HTTP/1 parser that works on `Array[Byte]`.

The obvious answer is scodec. The old fs2-http parser built on it, while beautiful, is also much slower. I'm dreaming of a scodec with the cats-parse mutability trick.

I spiked on it a bit. Problems I encountered:

- `String` and `Char` based parsers are a problem when the underlying type is binary. If we added binary parsers, we'd have to do unspeakable things when the underlying type is characters.
- Some inputs, like `BitVector`, are `Long`-indexed instead of `Int`. This ripples at least into `State` and `Expectation`.
- We could add a separate `BinParser` and `BinParser0`, but the duplication is an awful shame. It might not even be the same library anymore.

A more modest abstraction is to accept `CharSequence` as input, at which point we can wrap binary inputs with something like Netty's `AsciiString`. It's still abusive with respect to `Char` vs. `Byte`. It also doesn't help with HTTP/2, where we might benefit from a `BitVector`.

This is probably all a terrible idea, but I thought I'd ask.
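
For illustration, the `CharSequence` wrapping idea could look roughly like the sketch below. `AsciiView` is a hypothetical helper written for this example; it only mirrors the general idea behind Netty's `AsciiString` and is neither that class nor anything in cats-parse. It assumes the bytes should be read one byte per `Char` (effectively ISO-8859-1).

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper: a zero-copy CharSequence view over a byte array.
final class AsciiView(bytes: Array[Byte], start: Int, end: Int) extends CharSequence {
  require(0 <= start && start <= end && end <= bytes.length, "invalid range")

  def length(): Int = end - start

  // Each byte is widened to a Char in 0..255; this is the
  // Char-vs-Byte abuse mentioned above.
  def charAt(i: Int): Char = {
    if (i < 0 || i >= length()) throw new IndexOutOfBoundsException(i.toString)
    (bytes(start + i) & 0xff).toChar
  }

  // Shares the underlying array instead of copying it.
  def subSequence(from: Int, until: Int): CharSequence = {
    if (from < 0 || until > length() || from > until)
      throw new IndexOutOfBoundsException(s"$from..$until")
    new AsciiView(bytes, start + from, start + until)
  }

  // Decodes only the viewed range, one byte per character.
  override def toString: String =
    new String(bytes, start, length(), StandardCharsets.ISO_8859_1)
}

object AsciiView {
  def apply(bytes: Array[Byte]): AsciiView = new AsciiView(bytes, 0, bytes.length)
}
```

If the parser accepted `CharSequence`, a request line held in an `Array[Byte]` could then be passed as `AsciiView(bytes)` without first decoding it to a `String`. The view is still `Int`-indexed and byte-granular, so it does nothing for the `Long`-indexed or bit-level cases.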