Open idg10 opened 2 years ago
I think this would be a good thing to support and we could open an issue to do so.
At the moment, though, System.Text.Json doesn't give us a way to get at the UTF-8 bytes, so the first use case I have in mind couldn't use them.
Ah yes, I keep forgetting that there isn't a straightforward way to get at the UTF-8, although you can get it to write the UTF-8 out into an IBufferWriter<byte>. With buffer pooling that can be alloc-free per-iteration. (Doesn't avoid the copy of course, but I presume that direct access to the underlying buffer is deliberately not allowed because there might not actually be one: maybe the original doc was actually UTF-16 encoded, or perhaps it's split across multiple buffers.)
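For anyone following along, here is a rough sketch of the "write it out into an IBufferWriter<byte>" approach. It assumes a reusable ArrayBufferWriter<byte> and a single Utf8JsonWriter that is reset between calls; the class name ReusableUtf8Writer and its Write method are made up for illustration and are not part of any library. Note that the output is JSON text, so a string value arrives with its surrounding quotes and with whatever escaping the writer applies.

```csharp
using System;
using System.Buffers;
using System.Text.Json;

// Hypothetical helper (not part of System.Text.Json or Corvus.Globbing).
// Reuses one buffer and one writer, so after the buffer has grown to fit
// the largest value, each call should not need to allocate again.
public sealed class ReusableUtf8Writer : IDisposable
{
    private readonly ArrayBufferWriter<byte> buffer = new ArrayBufferWriter<byte>();
    private readonly Utf8JsonWriter writer;

    public ReusableUtf8Writer()
    {
        this.writer = new Utf8JsonWriter(this.buffer);
    }

    // Writes the element as UTF-8 JSON text and returns a view over the
    // copy. The span is only valid until the next call to Write.
    public ReadOnlySpan<byte> Write(JsonElement element)
    {
        this.buffer.Clear();   // discard the previous value
        this.writer.Reset();   // reuse the writer with the same output
        element.WriteTo(this.writer);
        this.writer.Flush();   // commit pending bytes to the buffer
        return this.buffer.WrittenSpan;
    }

    public void Dispose() => this.writer.Dispose();
}
```

This still copies the bytes, as noted above; it only avoids the per-iteration allocation.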
But I wasn't expecting to implement this immediately anyway—it was more a place for discussion and perhaps an eventual "defer/don't/yes" decision. So I think we're on "defer" right now.
Now that we have a (fairly) efficient way of getting at the UTF-8 text, this issue becomes "a good idea".
The two overloads of Glob.Match currently take in ReadOnlySpan<char>, meaning that the text must be in UTF-16 format. Do we need to support matching directly against UTF-8?
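To make the cost of the status quo concrete, this is roughly what a caller holding UTF-8 has to do today: transcode into a pooled char buffer, then hand the UTF-16 span to a matcher. It is only a sketch. Utf8MatchWorkaround, MatchUtf8 and the Utf16Matcher delegate are hypothetical names, and the delegate stands in for a call to one of the existing Glob.Match overloads so the example does not depend on their exact signatures.

```csharp
using System;
using System.Buffers;
using System.Text;

// Hypothetical workaround, not part of Corvus.Globbing.
public static class Utf8MatchWorkaround
{
    // Stand-in for a UTF-16 matcher such as a wrapper around Glob.Match.
    public delegate bool Utf16Matcher(ReadOnlySpan<char> value);

    public static bool MatchUtf8(ReadOnlySpan<byte> utf8Value, Utf16Matcher matcher)
    {
        // Upper bound on the number of UTF-16 code units needed.
        int maxChars = Encoding.UTF8.GetMaxCharCount(utf8Value.Length);
        char[] rented = ArrayPool<char>.Shared.Rent(maxChars);
        try
        {
            // The transcoding pass that a UTF-8 overload would avoid.
            int written = Encoding.UTF8.GetChars(utf8Value, rented);
            return matcher(rented.AsSpan(0, written));
        }
        finally
        {
            ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

Renting from the pool keeps this allocation-free in the steady state, but every match still pays for a full UTF-8 to UTF-16 pass over the input.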