Open brad-richardson opened 3 months ago
A streaming reader sounds great. Do you think it would be possible to implement the needed functionality in this crate without adding a big dependency? Maybe it could be done as an optional dependency, or maybe only a few low-level pieces are missing from the public interface and adding them would let the `AsyncBlobReader` be implemented in another crate.
I am not very familiar with the async ecosystem, but it looks like you would need to collect the bytes of a stream until you have a full `Blob` (at most 32 MB in size). Here is the relevant function for that: https://github.com/b-r-u/osmpbf/blob/fd55e640c274f3fdec81e4ff94fca92578ee3922/src/blob.rs#L265
It reads a `u32` header size and then reads the `BlobHeader`, which includes the size of the following `Blob`.
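To make the framing above concrete, here is a minimal std-only sketch of how a streaming reader could decide whether its buffer already holds one complete `BlobHeader` plus `Blob`. It is not this crate's API: `Frame`, `split_frame`, `blob_header_datasize` and `read_varint` are hypothetical names. It assumes the layout from the OSM PBF format description: a 4-byte big-endian header length, a `BlobHeader` of at most 64 KiB whose `datasize` is protobuf field 3, and a `Blob` of at most 32 MiB. A real implementation would decode the header with the crate's generated protobuf types rather than the hand-rolled varint scan used here.

```rust
/// Size limits assumed from the OSM PBF format description.
const MAX_HEADER_SIZE: usize = 64 * 1024;
const MAX_BLOB_SIZE: usize = 32 * 1024 * 1024;

/// One complete frame borrowed from the input buffer (hypothetical type).
#[derive(Debug)]
pub struct Frame<'a> {
    pub header: &'a [u8], // encoded BlobHeader
    pub blob: &'a [u8],   // encoded Blob
    pub consumed: usize,  // bytes the caller can drop from its buffer
}

/// Reads one protobuf varint, returning (value, bytes consumed).
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Scans an encoded BlobHeader for `datasize` (field 3, varint).
pub fn blob_header_datasize(header: &[u8]) -> Option<usize> {
    let mut pos = 0usize;
    while pos < header.len() {
        let (key, n) = read_varint(&header[pos..])?;
        pos += n;
        match (key >> 3, key & 7) {
            (3, 0) => {
                let (value, _) = read_varint(&header[pos..])?;
                if value > MAX_BLOB_SIZE as u64 {
                    return None;
                }
                return Some(value as usize);
            }
            // Skip other varint fields.
            (_, 0) => pos += read_varint(&header[pos..])?.1,
            // Skip length-delimited fields such as `type` and `indexdata`.
            (_, 2) => {
                let (len, n) = read_varint(&header[pos..])?;
                if len > MAX_HEADER_SIZE as u64 {
                    return None;
                }
                pos += n + len as usize;
            }
            _ => return None,
        }
    }
    None
}

/// Returns Ok(None) when more bytes are needed, Ok(Some(frame)) when the
/// front of `buf` holds a complete BlobHeader + Blob, and Err on bad input.
pub fn split_frame(buf: &[u8]) -> Result<Option<Frame<'_>>, String> {
    if buf.len() < 4 {
        return Ok(None);
    }
    let header_len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    if header_len > MAX_HEADER_SIZE {
        return Err(format!("BlobHeader too large: {header_len} bytes"));
    }
    let header_end = 4 + header_len;
    if buf.len() < header_end {
        return Ok(None);
    }
    let header = &buf[4..header_end];
    let blob_len = blob_header_datasize(header)
        .ok_or_else(|| "BlobHeader has no usable datasize".to_string())?;
    let end = header_end + blob_len;
    if buf.len() < end {
        return Ok(None);
    }
    Ok(Some(Frame { header, blob: &buf[header_end..end], consumed: end }))
}
```

An async reader could append each chunk from the byte stream to a `Vec<u8>`, call `split_frame` in a loop, hand `frame.blob` to the existing blob decoder, and drain `frame.consumed` bytes after each hit. That keeps the async dependency out of the framing logic.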
I hope this helps!
I'm looking at adding support for streamed PBF reads from network sources (in my case, from S3). My current plan is to wrap a bytestream produced by `object_store` into something like an `AsyncBlobReader`. Are streamed reads something you'd be interested in for this library? If so, do you have any suggestions for implementation?

I did consider using something like mountpoint-s3 instead, but unfortunately I'm working in a managed container environment that doesn't support FUSE mountpoints, so I'll need to manage the reads myself.
P.S. Thanks for the library, been using it with good results in a little PBF transcoder.