Blosc / bcolz

A columnar data container that can be compressed.
http://bcolz.blosc.org

difference between iterblocks and nchunks #388

Open simonm3 opened 5 years ago

simonm3 commented 5 years ago

From the code below there are 62 blocks but nchunks is 61. Why is that?

import numpy as np
import bcolz

a = bcolz.carray(np.array([1, 2, 3, 4] * 999999))
# number of blocks yielded by iterblocks vs. the carray's nchunks
print(len(list(bcolz.iterblocks(a))), a.nchunks)

FrancescAlted commented 5 years ago

Yes, that's intended. a.nchunks counts only the chunks that are complete, whereas iterblocks also yields the final, partial (remainder) block.
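For anyone wanting to check the numbers, here is a minimal sketch of the arithmetic in plain Python, with no bcolz involved. It assumes iterblocks is called with its default block length (the chunk length) and that the chunk length is 65536 items. That value is a guess that happens to match the 61/62 reported above; the real one is whatever a.chunklen reports. The helper name count_chunks_and_blocks is made up for illustration.

```python
def count_chunks_and_blocks(n_items, chunklen):
    """Return (complete_chunks, blocks) for n_items split into pieces of chunklen.

    complete_chunks counts only full pieces; blocks also counts a trailing
    partial piece when n_items is not an exact multiple of chunklen.
    """
    complete_chunks = n_items // chunklen
    blocks = -(-n_items // chunklen)  # integer ceiling division
    return complete_chunks, blocks


n_items = 4 * 999999   # length of the array in the question
chunklen = 65536       # ASSUMED value; check a.chunklen on the real carray

complete, blocks = count_chunks_and_blocks(n_items, chunklen)
print(complete, blocks)                 # 61 62
print(n_items - complete * chunklen)    # 2300 items left over in the partial block
```

When the length is an exact multiple of the chunk length there is no remainder, so under the same assumptions the two counts would agree.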

simonm3 commented 5 years ago

Thanks. Would be useful to add something to the docs on the difference between blocks and chunks. I searched for it but didn't find anything.
