maciejhirsz / logos

Create ridiculously fast Lexers
https://logos.maciej.codes
Apache License 2.0

Read #159

Open alex-ozdemir opened 4 years ago

alex-ozdemir commented 4 years ago

Hi there!

Thanks for making such a great library! I'm really impressed with how nice its interface is, and how well it performs.

I'm interested in using Logos in a new way: to tokenize files that are too big to fit in memory.

The "output" interface of Logos (a token stream) is compatible with this, but the "input" interface (str or [u8]) is not. I'd have to get Logos to work over an interface like std::io::Read, or perhaps std::io::BufRead.

I suspect doing this would involve hacking on Logos quite a bit, since the std::io::{Read, BufRead} traits and the Source trait don't seem compatible right now. In particular, Source gives essentially random access to the buffer, although I assume that isn't strictly needed.
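To make the goal concrete, here is a minimal sketch of token streaming over a BufRead that keeps only one line in memory at a time. It does not use Logos at all: LineTokens is a hypothetical adapter, and split_whitespace stands in for running an in-memory lexer over each line. It also assumes that no token ever spans a newline, which many grammars (multi-line strings, block comments) would violate.

```rust
use std::collections::VecDeque;
use std::io::{self, BufRead};

/// Hypothetical adapter (not part of Logos): yields whitespace-separated
/// words from any `BufRead`, buffering a single line at a time.
pub struct LineTokens<R> {
    reader: R,
    line: String,              // reused buffer holding the current line
    pending: VecDeque<String>, // tokens of the current line not yet yielded
}

impl<R: BufRead> LineTokens<R> {
    pub fn new(reader: R) -> Self {
        LineTokens {
            reader,
            line: String::new(),
            pending: VecDeque::new(),
        }
    }
}

impl<R: BufRead> Iterator for LineTokens<R> {
    type Item = io::Result<String>;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Drain tokens already produced from the current line.
            if let Some(tok) = self.pending.pop_front() {
                return Some(Ok(tok));
            }
            // Otherwise refill the line buffer from the reader.
            self.line.clear();
            match self.reader.read_line(&mut self.line) {
                Ok(0) => return None, // end of input
                Ok(_) => {
                    // Stand-in for lexing `self.line` with an in-memory lexer.
                    self.pending
                        .extend(self.line.split_whitespace().map(str::to_owned));
                }
                Err(e) => return Some(Err(e)),
            }
        }
    }
}
```

Wrapping a std::fs::File in std::io::BufReader and passing it to LineTokens::new would tokenize a file of any size this way. The hard part the sketch sidesteps is exactly the one raised above: a general lexer needs to look back at, and return slices of, input that a plain reader has already consumed.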

I wanted to ask if you've ever thought about taking Logos in this direction, and if you have any advice regarding the best way to do so.

To be clear: this isn't a feature request, it's more of a "wisdom request". I'd like to hear your thoughts on how feasible this would be, and what the best approach would be. I'm more than willing to do it myself.

Also, if this works out, I'm more than happy to submit a PR, if you would like that.

SilverPhoenix99 commented 3 years ago

I'm fairly new to Rust and have a similar need for this. As far as I can tell, with some work the Source trait should be able to wrap a memory-mapped or random-access file. As for how, I'd also like a ride on the "wisdom wagon".

dullbananas commented 3 years ago

With the current Source trait, it can be implemented for a file stream, but not for an async one.
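One reason a file-backed source looks plausible: the standard library implements Read and Seek for &File, so offset-based reads work through a shared reference. The sketch below uses only std; read_range and file_len are hypothetical helper names, and this is not an implementation of Logos' Source trait. A real implementation would also have to hand out borrowed slices, so it would need some buffer to own the bytes.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: read up to `len` bytes starting at `offset`,
/// using only a shared reference to the file.
pub fn read_range(file: &File, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    // `&File` itself implements `Read` and `Seek`, so a mutable binding
    // to the shared reference is enough; no `&mut File` is required.
    let mut handle = file;
    handle.seek(SeekFrom::Start(offset))?;
    let mut out = Vec::with_capacity(len);
    // `take` caps the read at `len` bytes; fewer are returned near EOF.
    handle.take(len as u64).read_to_end(&mut out)?;
    Ok(out)
}

/// Hypothetical helper: a regular file knows its length up front,
/// unlike an arbitrary reader.
pub fn file_len(file: &File) -> io::Result<u64> {
    Ok(file.metadata()?.len())
}
```

Note that seeking through a shared handle still moves the underlying OS-level cursor, so concurrent callers would interfere with each other. A memory-mapped file avoids that, but needs an external crate and is not shown here.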

wyatt-herkamp commented 2 years ago

I am also interested in the idea of Source being implemented for io::Read or io::BufRead.

A few problems I noticed in my quick look around:

Source::read and read_unchecked take &self, whereas io::Read::read needs &mut self.

Source::len can't be implemented for io::Read, from what I understand, since a reader doesn't know its total length up front.
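Both problems can be illustrated with a small sketch. OffsetSource below is a hypothetical stand-in with a similar shape (a len method and offset-based reads through &self); it is not Logos' actual Source trait. The mutability mismatch can be worked around with interior mutability (RefCell), but len can only be answered by draining the reader to the end, which defeats the purpose for inputs that don't fit in memory. Read errors are treated as end of input purely for brevity.

```rust
use std::cell::RefCell;
use std::io::Read;

/// Hypothetical stand-in for a random-access source; NOT Logos' `Source`.
pub trait OffsetSource {
    fn len(&self) -> usize;
    fn byte_at(&self, offset: usize) -> Option<u8>;
}

/// Deliberately tiny so that lazy filling is observable in small examples.
const CHUNK: usize = 4;

struct State<R> {
    reader: R,
    buf: Vec<u8>, // every byte read so far
    eof: bool,
}

impl<R: Read> State<R> {
    /// Pull from the reader until `want` bytes are buffered or input ends.
    fn fill_to(&mut self, want: usize) {
        let mut chunk = [0u8; CHUNK];
        while !self.eof && self.buf.len() < want {
            match self.reader.read(&mut chunk) {
                // Simplification: any error is treated like end of input.
                Ok(0) | Err(_) => self.eof = true,
                Ok(n) => self.buf.extend_from_slice(&chunk[..n]),
            }
        }
    }
}

/// Wraps a reader so it can be queried by offset through `&self`.
pub struct LazySource<R> {
    state: RefCell<State<R>>,
}

impl<R: Read> LazySource<R> {
    pub fn new(reader: R) -> Self {
        LazySource {
            state: RefCell::new(State { reader, buf: Vec::new(), eof: false }),
        }
    }

    /// Number of bytes pulled into memory so far.
    pub fn buffered(&self) -> usize {
        self.state.borrow().buf.len()
    }
}

impl<R: Read> OffsetSource for LazySource<R> {
    fn len(&self) -> usize {
        // No way around it: the total is only known after reading to EOF.
        let mut st = self.state.borrow_mut();
        st.fill_to(usize::MAX);
        st.buf.len()
    }

    fn byte_at(&self, offset: usize) -> Option<u8> {
        // `&self` is enough because the RefCell provides the mutability.
        let mut st = self.state.borrow_mut();
        st.fill_to(offset.saturating_add(1));
        st.buf.get(offset).copied()
    }
}
```

This sketch never discards consumed bytes, so memory still grows with the input. A usable design would need to drop everything before the start of the current token, and would need an alternative to len, such as signalling end of input from the read calls themselves.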

makapuf commented 5 months ago

This seems to be linked to #324.