Closed fbunt closed 1 year ago
Quick example:
import dask.array as da
import numpy as np
from affine import Affine
def dummy(x, aff):
print(aff)
return x
x = da.from_array(np.ones((4, 4)), chunks=(2, 2))
a = Affine(1.0, 0.0, 0.0, 0.0, -1.0, 4.0)
da.map_blocks(dummy, x, aff=a).compute()
@fbunt thanks for reporting this! I'm not sure what to do about this at the moment. Being a namedtuple was supposed to be only an implementation detail for affine, but clearly that was not much of a strategy.
If namedtuple is only an implementation detail of affine, would using something like array.array (https://docs.python.org/3/library/array.html) be a good alternative?
@groutr well, I was thinking to use attrs or a data class instead.
My understanding was that affine sought to be free of dependencies outside of Python's standard library. Data classes are a good idea. Affine doesn't seem to specify any minimum version of Python, but using data classes would mean Python >=3.7 (which probably isn't a big deal). Slotted data classes weren't introduced until Python 3.10, so each data class instance would have an instance dictionary. But they can be frozen and have named attributes so mimic the behavior of a namedtuple.
The suggestion to use array.array
was because it could be used to enforce the type of all the values, casting as necessary. It also preserves some of the behavior of a named tuple such as slicing and unpacking. Additionally, array.array
would play nicely with numpy's support of the buffer protocol (ie, creating an array view of a transform without having to make a copy, but ). Arrays have their drawbacks, mainly mutability. Care would have to be taken in regards to mutability as changing the numpy array changes the array).
Ultimately, you know affine's design goals and future far better than I do.
leaving a comment here to bump this issue as I just found the same bug.
without looking deeply or testing may I suggest making g,h,i optional parameters of new may solve this issue?
def __new__(cls, a, b, c, d, e, f, g=0.0, h=0.0, i=1.0):
"""Create a new object
Parameters
----------
a, b, c, d, e, f, g, h, i : float (g, h, i are optional)
Elements of an augmented affine transformation matrix.
"""
mat3x3 = [x * 1.0 for x in [a, b, c, d, e, f, g, h, i]]
return tuple.__new__(cls, mat3x3)
@artttt that's a great suggestion! I'm trying it out in #92.
@fbunt any chance you could try the code in PR #92 out?
A recent change (dask/dask#9361) in how dask handles arguments that are instances or subclasses of
namedtuple
preventsAffine
objects from being passed to delayed operations such asdask.array.map_blocks
. Dask now unpacks the namedtuples and then tries to repack them before providing them to the delayed function at compute time. This breaks for Affine objects since they unpack to the full 9 coefficients butAffine.__new__
only takes the first 6 coefficients. The result iswhen dask graphs are computed.
My current solution is to pass the coefficients in place of the instance but it would be great to simply pass Affine objects again.