nyurik opened 1 year ago
You may have an easier time if you just use an async threadpool, like tokio provides.
What sort of results come out of your parallel process? The shape of that will affect how you might shoehorn this into rayon.
Thx @cuviper, each operation is essentially a bool. In a way, I am trying to parallelize validation of a large dataset, searching for any rows that do not pass validation. Ordering is not important, but it would be good to abort early if any validation fails.
One way you could try is with tokio channels, using async `send` and sync `blocking_recv`. Something like:

```rust
std::iter::from_fn(move || rx.blocking_recv()).par_bridge() //...
```
If you are using sqlx, and therefore already run within an async runtime providing a thread pool, I also suspect that just chunking the rows using something like `StreamExt::chunks` and spawning a task for each chunk would be preferable to trying to force this into a parallel iterator. Alternatively, spawning a task per hardware thread and using an async mpmc channel like flume or kanal might work as well.
Crossposting with Stack Overflow:

Rust's sqlx lib provides an iterator-like interface `fetch(...)`, usually used with a `while let Some(row) = rows.try_next().await? { ... }` construct. In my case, each row may take some time to process, so I would like to use Rayon's `par_iter()`, but that requires a real iterator. Using `fetch_all` is not an option because all rows may simply not fit into memory. How can I use Rayon's `par_iter` to process millions of rows produced by sqlx's `fetch` stream?