Closed NickRedwood closed 3 years ago
Hi! Thanks for dropping by.
Porting this all to async for this purpose would be really interesting; I don't think we'd do that in this project directly as I think some substantial changes would be required before it would be useful (e.g. incremental tokenization and parsing, since anything being received in chunks via async I/O could be quite large).
If you decide to take a shot at an async version I'd love to check it out - keep us posted! :-)
Ok, I'll put it on my list of interesting projects to try some time! I expect it wouldn't be too difficult to make some progress on tokenizing; however, it would be a lot more challenging and time-consuming to work through async-ifying the rest of the library.
Hi, firstly thanks for writing this library, I've found it very useful.
Are there any plans to support tokenizing (firstly) of data sources that may be accessed asynchronously, i.e. from a `Stream` of some type? i.e. the code reads from a stream, probably some buffer-size at a time, but every so often requires an async call to get more data.

As I understand it, `ValueTask` would be a zero-overhead way of achieving this and so essentially we want the incoming data to be an `IAsyncEnumerable`, and it returns another `IAsyncEnumerable`.

I imagine the async counterpart to a method like:
public TokenList<TKind> Tokenize(string source)
would be:
public IAsyncEnumerable<TKind> Tokenize(IAsyncEnumerable<char> source)
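For illustration only, a method with that shape can be written as a C# async iterator. The sketch below is not the library's API: `AsyncTokenizerSketch`, its `Tokenize` method and the `Chars` helper are hypothetical names, and the "tokens" are simply whitespace-separated strings rather than `TKind` values. It assumes C# 8 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

public static class AsyncTokenizerSketch
{
    // Hypothetical stand-in for a real tokenizer: groups consecutive
    // non-whitespace characters into string tokens as characters arrive.
    public static async IAsyncEnumerable<string> Tokenize(IAsyncEnumerable<char> source)
    {
        var current = new StringBuilder();
        await foreach (var ch in source)
        {
            if (char.IsWhiteSpace(ch))
            {
                // Whitespace ends the token in progress, if there is one.
                if (current.Length > 0)
                {
                    yield return current.ToString();
                    current.Clear();
                }
            }
            else
            {
                current.Append(ch);
            }
        }

        // Flush the final token once the source is exhausted.
        if (current.Length > 0)
            yield return current.ToString();
    }

    // Hypothetical source that simulates characters arriving asynchronously.
    public static async IAsyncEnumerable<char> Chars(string text)
    {
        foreach (var ch in text)
        {
            await Task.Yield();
            yield return ch;
        }
    }
}
```

A caller would consume it with `await foreach (var token in AsyncTokenizerSketch.Tokenize(AsyncTokenizerSketch.Chars("let x = 42")))`. Tokens are yielded as soon as they are complete, so the whole input never has to be held in memory at once.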
Obviously `ValueTask` would then permeate much of the rest of the codebase too, but for now I'm just looking at the tokenizer part.

As the library doesn't support this currently, are you able to provide any pointers on tokenizing in chunks, but capturing any un-tokenized remaining string at the end of each chunk?
public Result<TokenList<TKind>> TryTokenize(string source)
doesn't seem to be the signature that will work - I would need the `Result` (or other success/failure type) to be inside the `TokenList` rather than the other way around. Overall I'm unsure if I can make this work within the existing framework, or if it'd be less work to write my own `AsyncTokenizer`.

Thanks.
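As a rough sketch of the chunk-with-remainder idea described above (written from scratch, not using or reflecting the library's API), a tokenizer can emit only the tokens it knows are complete and carry the trailing partial text into the next chunk. `ChunkedTokenizer`, `TokenResult`, `Feed` and `Complete` are hypothetical names, and the token rules (runs of letters, runs of digits, anything else is an error) are assumptions chosen to keep the example small. Success or failure is recorded per token, so one bad character does not discard the rest of the list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-token result: success/failure lives on each item
// rather than wrapping the whole token list.
public sealed class TokenResult
{
    public bool Success { get; }
    public string Kind { get; }
    public string Text { get; }

    public TokenResult(bool success, string kind, string text)
    {
        Success = success;
        Kind = kind;
        Text = text;
    }
}

public sealed class ChunkedTokenizer
{
    // Text from earlier chunks that could not yet be tokenized.
    private string _remainder = "";

    // Tokenizes everything that is certainly complete and keeps the rest.
    public List<TokenResult> Feed(string chunk)
    {
        var input = _remainder + chunk;
        var results = new List<TokenResult>();
        var pos = 0;

        while (pos < input.Length)
        {
            var ch = input[pos];
            if (char.IsWhiteSpace(ch)) { pos++; continue; }

            if (!char.IsLetterOrDigit(ch))
            {
                // Unrecognized character: report it and keep going.
                results.Add(new TokenResult(false, "Error", ch.ToString()));
                pos++;
                continue;
            }

            var digits = char.IsDigit(ch);
            var end = pos;
            while (end < input.Length && IsSameRun(input[end], digits)) end++;

            // The run touches the end of the buffer, so it may continue
            // in the next chunk; stop and keep it as the remainder.
            if (end == input.Length) break;

            results.Add(new TokenResult(true, digits ? "Number" : "Word",
                                        input.Substring(pos, end - pos)));
            pos = end;
        }

        _remainder = input.Substring(pos);
        return results;
    }

    // Call once the source is exhausted to flush any pending token.
    public List<TokenResult> Complete()
    {
        var results = Feed(" "); // a trailing delimiter terminates the last run
        _remainder = "";
        return results;
    }

    private static bool IsSameRun(char ch, bool digits) =>
        digits ? char.IsDigit(ch) : char.IsLetter(ch);
}
```

Feeding `"foo 12"` and then `"34 ba"` yields `foo` from the first call and `1234` from the second, with `ba` held back until more input or `Complete()` arrives. This only works when a token's end can be decided from the characters seen so far plus one following character; grammars that need longer lookahead would have to hold back more text.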