facundominguez closed this 6 years ago
Packages using jvm-streaming now need to point to the jvm-batching.jar when building and running.
Open question: how do we make the batch size of streams or vectors a parameter?
We may want to make the batches explicit in sparkle. For instance:
    mapIterator
      :: (ReifyBatcher a, ReflectBatcher b)
      => Int
      -> (Stream (Of (Vector a)) IO () -> Stream (Of (Vector b)) IO ())
      -> Dataset a
      -> Dataset b
where the Int parameter gives the size of the batches in the input, and the size of the batches in the output is controlled by the user-supplied function.
This has the advantage that we can tell the user not to leave a batch half consumed before producing an output batch if the type a contains local references. Otherwise, control may return to Java, invalidating those references.
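To make the shape of this proposal concrete, here is a minimal sketch that models it over a plain stream instead of a Dataset, so it runs without sparkle or a JVM. The name mapBatched is hypothetical and not part of any package, and the ReifyBatcher/ReflectBatcher constraints are dropped because nothing crosses the Java boundary here. It assumes the streaming and vector packages.

```haskell
import Data.Vector (Vector)
import qualified Data.Vector as V
import Streaming (Stream, Of, chunksOf)
import qualified Streaming.Prelude as S

-- Hypothetical stand-in for the proposed mapIterator. A plain
-- Stream (Of a) IO () replaces Dataset a, so this only illustrates
-- how the batch-size parameter and the user function fit together.
mapBatched
  :: Int                                  -- size of the input batches
  -> (Stream (Of (Vector a)) IO () -> Stream (Of (Vector b)) IO ())
  -> Stream (Of a) IO ()
  -> Stream (Of b) IO ()
mapBatched n f =
    S.concat                 -- flatten the output batches back into elements
  . f                        -- user code decides the output batch sizes
  . S.map V.fromList         -- one Vector per input batch
  . S.mapped S.toList        -- materialize each chunk of at most n elements
  . chunksOf n

main :: IO ()
main = do
  -- The user function consumes each input batch whole before it
  -- emits the corresponding output batch.
  out <- S.toList_ (mapBatched 2 (S.map (V.map (* 10))) (S.each [1 .. 5 :: Int]))
  print out  -- prints [10,20,30,40,50]
```

Here the input arrives in batches of sizes 2, 2 and 1. A user function built with S.map never leaves a batch half consumed, which is the discipline we would ask for when a contains local references.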
Addressed feedback provided in a private discussion.
The last commit allows building packages even if they have dependencies on jars produced by Haskell dependencies. The user is responsible for pulling all the necessary dependencies in the Setup.hs script.
With this change, we can keep jvm-streaming on Hackage.
Addresses the inline-java part of https://github.com/tweag/sparkle/issues/124.