RCasatta / blocks_iterator

Iterate over bitcoin blocks
MIT License
51 stars 5 forks source link

iterate API #44

Closed RCasatta closed 2 years ago

RCasatta commented 2 years ago

Use the Iterator trait instead of the model with the channel.

Also extract code from verify example to something like par_iter( Fn(BlockExtra) -> Vec<T>, Fn(T) ) with the second function running in parallel

dpc commented 2 years ago

You might know that already, but just to make sure: one can always turn an receiving a end of a channel into an iterator with .into_iter()

I was working on blocks_iterator-like API where everything can become a clean iterator pipeline:

node_client.recv_blocks().handle_prevouts().write_db();

so I can easily use https://docs.rs/pariter/0.5.1/pariter/ .

But I stumbled on the fact that error handling here is very inconvenient. Then I started playing with https://docs.rs/fallible-iterator/0.2.0/fallible_iterator/ to see if it helps, and then I ran out of free time and stamina.

Just sharing my discoveries.

RCasatta commented 2 years ago

Thanks for sharing.

You might know that already, but just to make sure: one can always turn an receiving a end of a channel into an iterator with .into_iter()

I discovered this like 2 days ago ahahah, and decided to use it to remove the ugly BATCH in my verify example and subsequently move the parallelization logic inside the API.

I used rayon and I didn't know about pariter and fallible_iterator and come up with the following par_iter API, probably not clean iterator pipeline but I think it does the job. Happy to receive feedback.

RCasatta commented 2 years ago

implemented in https://github.com/RCasatta/blocks_iterator/pull/45

dpc commented 2 years ago

implemented in https://github.com/RCasatta/blocks_iterator/pull/45

It works, I guess, though it looks very custom-made and not very idiomatically composable.

The PREPROC could be a .flat_map(...) and TASK could be a .parallel_map(...) if I'm reading it right, so they kind of would be easier to work with if separated, IMO.

The explicit STATE seems unnecessary because each TASK (basically any closure) can already borrow some state internally, so passing STATE explicitily just blows the API surface, AFAICT.

All these 'static + Send + Sync are constraining the calling code, making certain valid usages impossible (like using borrowed variables in these closures) which 99% doesn't matter, but this 1% when it does, the user will complain.

But then, if you start aiming at maximum expressivness and general purpose idomaticness, you'll have to deal with stuff like that, and it's not like I write code like that by heart. There is a lot of grinding, figuring these lifetimes and generics.

I also don't like the unwraps, but iterators returning Result items are not very convenieient to work with, so that leads to falliable_iterator.

So I'm just poining out "in a perfect case" things could be better, but I bailed on implementing these myself, so by no means feel like I am complaining at your code. :D

RCasatta commented 2 years ago

thanks for your feedback, I realized par_iter makes no sense