@boazmohar `series.toimages()` doesn't actually use chunking. The standard assumption is that the number of pixels per image is much larger than the number of time points. Chunking only really helps when the non-distributed dimension is so large that the number of groupings needed for the transpose becomes prohibitive. So chunking was really only meant to speed up the images-to-series conversion, not the reverse.
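For concreteness, a minimal sketch of that images-to-series path, assuming the thunder 1.x API (the exact signatures of `fromrandom` and the `chunk_size` keyword on `toseries` should be treated as assumptions):

```python
import thunder as td

# illustrative data, not from this issue: 100 time points of 512 x 512 images
# (sc is an existing SparkContext; spark mode assumed)
images = td.images.fromrandom(shape=(100, 512, 512), engine=sc)

# images -> series routes through an intermediate Blocks object, so the chunk
# size controls how the pixel dimensions are grouped for the transpose
series = images.toseries(chunk_size='auto')  # roughly images.toblocks(...).toseries()
```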
That said, it might be worth having `Series.toimages` go through `Blocks` in the same manner that `Images.toseries` does.
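A rough sketch of what that mirroring could look like; the `toblocks` step on `Series` and the `chunk_size` handling here are hypothetical, not thunder's current internals:

```python
def toimages_via_blocks(series, chunk_size='auto'):
    """Hypothetical mirror of Images.toseries: choose the block size for the
    transpose explicitly instead of falling back to Bolt's default chunking."""
    # route through an intermediate Blocks object (assumed helper on Series)
    blocks = series.toblocks(chunk_size=chunk_size)
    return blocks.toimages()
```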
Investigated this further with @boazmohar offline. `images.toseries` uses the intermediate `Blocks` object to determine an optimal block size for the transpose, whereas `series.toimages` relies only on Bolt's default chunk size. For datasets where the number of time points is of the same order as, or greater than, the number of pixels, this makes `series.toimages` inefficient and results in terrible partitioning in the resulting `Images` object. The best solution seems to be making `series.toimages` as much of a mirror of `images.toseries` as possible.
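In the meantime, the asymmetry is easy to confirm, assuming spark mode and that `tordd()` is available on both objects as in thunder 1.x:

```python
# compare partitioning before and after the conversion; with Bolt's default
# chunking these counts can diverge badly when time points >= pixels
n_series = series.tordd().getNumPartitions()
images = series.toimages()
n_images = images.tordd().getNumPartitions()
print(n_series, n_images)
```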
It seems that there is now a difference between `images.toseries()` and `series.toimages()` in their default chunk-size behavior. Shouldn't they be the same? @freeman-lab @jwittenbach