meggart / DiskArrays.jl

Other
73 stars 13 forks source link

fix generators #107

Closed rafaqz closed 1 year ago

rafaqz commented 1 year ago

This PR defines a DiskGenerator to reorder collect during iteration. This fixes one of the last broken tests.

Closes #106

This wont work if a filter is used in the generator so adds another broken test. This would be very hard to do as we don't know the final size or indices at the start. Probably we should disallow it somehow.

meggart commented 1 year ago

Thanks a lot for this. I have pushed a small fix that reduces the number of getindex-calls from 7 to 6. Otherwise LGTM.

meggart commented 1 year ago

BTW I would have no problems dropping julia 1.6 support for DiskArrays.

meggart commented 1 year ago

his wont work if a filter is used in the generator so adds another broken test. This would be very hard to do as we don't know the final size or indices at the start. Probably we should disallow it somehow.

I think this is an edge case, in the end Set([aa for aa in a_disk if aa > 40]) == Set([aa for aa in a if aa > 40]) evaluates to true. In my opinion when using a filtered iterator of unknown length, you can not rely on a certain element being in a certain position of the array anyway. Probably you are right and one should just throw an error, but in some situations it might still be useful to be able loop over a filtered diskarray, knowing the order will be changed. A related DiskArray bug I just discovered is this one:

a = collect(reshape(1:90, 10, 9))
a_disk = _DiskArray(a; chunksize=(5, 3));
issorted(a_disk) #false
issorted(a) #true
rafaqz commented 1 year ago

Maybe we can just clearly document where we break with the AbstractArray interface instead of throwing errors.