Provide a way for the nddata/ccddata integration in apply_transform to return the same objects

quatrope / astroalign

A tool to align astronomical images based on asterism matching

MIT License

136 stars 42 forks source link

Provide a way for the nddata/ccddata integration in apply_transform to return the same objects #34

Open eteq opened 4 years ago

eteq commented 4 years ago

The improved integration with CCDData/NDData (i.e., things with mask and data) are great, so was glad to see that!

However, the example shown in the docs (https://astroalign.readthedocs.io/en/latest/tutorial.html#dealing-with-data-objects-with-data-and-mask-properties-nddata-ccddata-numpy-masked-arrays) prompts an idea: I wonder if apply_transform can be adapted to actually return the appropriate objects? I.e., ideally instead of:

aligned_image, footprint = aa.apply_transform(transf, nd, nd, propagate_mask=True)
new_nd = NDData(aligned_image, mask=footprint)

it would be more convenient for users if the following worked:

new_nd =  aa.apply_transform(transf, nd, ...)

That is, one of the main motivations for NDData is so that users do not have to worry about things like paying close attention to how they pass in masks.

Does that seem reasonable? I.e., is it reasonable to have apply_transform give different things depending on the input? I find that a natural pattern, but an alternative might be to have a separate method apply_transform_datamask or something like that. Or another option might be a special keyword.

leliel12 commented 4 years ago

This idea is reasonable (mostly for the astropy integration) but here my concerns.

Write a function that return a type determined by the input, increment the potential bugs. The main problem is write 'sufficient' test to ensure the behavior. Maybe only limited number of types can be accepted, but still if the user inherit from any of this types can be problem.
IMHO add astropy as dependency for only 1 line of code it's unpractical at least. Yes, probably a anyone who uses astroalign also uses astropy; but I sense a "code-smell" here.

Maybe I don't totally understand the entire uses-cases of NDData, and maybe a more useful integration can be write.

martinberoiz commented 4 years ago

Thanks Erick for your suggestion!

That was actually my intention at first when I wrote the function. I wanted to implement something like that, transparent to the user.

But then I ran into the problem of creating an object of the same class as the one passed as input. I need some kind of factory method for that because I want to keep the input as general as possible.

Let me explain, the input could be CCDData, NDData or numpy (maybe masked) array. Or more general, any object that has a data property and an optional mask property. For each class I would have to adjust the output accordingly to match to the input class and return an object of the same class as the input.

That would be ideal and maybe I could potentially do something with meta programming and introspection? (of which I don't really know much).

So in the end, I chose to return the most vanilla class (the Numpy array), so that the user itself can reconstruct the object of whatever class is needed. I know this requires a bit more knowledge from the user, and it may not be the best. I tried to compensate by explaining how in the docs.

Do you have any suggestion on how to implement something like that, that returns objects based on their input class?

eteq commented 4 years ago

@martinberoiz

Do you have any suggestion on how to implement something like that, that returns objects based on their input class?

I see your point here. But I think maybe we can have it both ways by using heuristics and duck-typing to get us 80% of the way there (which is maybe far enough). I'm thinking something like this inside apply_transform:

data = input.data
mask = getattr(input, 'mask' , None)
if propogate_mask is 'guess' and mask is not None:  # <- this would be the new default for propogate_mask
    propogate_mask = True

... all the existing stuff which eventually yields aligned_data, footprint ...

extrakwargs = {}
if hasattr(input, 'mask'):
    extrakwargs['mask'] = footprint

if hasattr(input, 'uncertainty') and input.uncertainty is not None:
    warnings.warn('uncertainty is being dropped in apply_transform because it's not clear what to do with it')

if hasattr(input, 'wcs'):
    extrakwargs['wcs'] = ... calculate a wcs ...?  But that might be more work for now than practical

for attrnm in ['meta', 'unit']: # attributes that pass through without change
    if hasattr(input, attrnm'):
        extrakwargs[attrnm] = input[attrnm]

return input.__class__(aligned_data, **extrakwargs), footprint

Make sense? One could imagine adding more heuristics, but this at least captures NDData and all of its derivative types.

@leliel12

IMHO add astropy as dependency for only 1 line of code it's unpractical at least

Yep, I'm with you there, but in my scheme above there's no need for a formal dependency. It should "just work" without a

martinberoiz commented 3 years ago

Sorry for unearthing this issue so long after. I thought about it more and even though I like the idea, I think this behavior of returning objects of the same type as the input's makes more sense if astroalign functionality is inside astropy. Then, I agree that it could be an expected and desirable feature to have.

But implemented in astroalign, as long as it remains a separate package, it would make me go into the internals of NDData or CCDData and update or maintain code in response to changes in these classes, it would create some kind of implicit coupling that I'm not convinced it's good for astroalign.

So I propose this be like a sub-issue for #43 and focus on that one instead.