Closed cdeil closed 4 years ago
@cdeil - I'm not a fan of the definition of image (which sort of carries down to the rest), because of the (data, wcs) tuple option. So I'd like to proposal an alternative, more in line with what @astrofrog and @ericjesche and I were suggesting in #13 (and I think also compatible with @mwcraig's request?) about duck-typing:
An image is either:
NDData
. Practically speaking, that leads to the following as implementation:
def downsample(image, factor):
"""Downsample image by a given factor.
Parameters
----------
image
... description here ...
factor : int
Downsample factor
Returns
-------
out : whatever `image` is
Output
"""
from copy import deepcopy
from skimage.measure import block_reduce
nddata_like_input = False
#need to special-case ndarray, because ndarray *has* a `data` object,
# but it means the actual underlying memory
if hasattr(image, 'data') and not isinstance(image, np.ndarray):
inarray = image.data
nddata_like_input = True
else:
inarray = image
# Compute `out_array`
out_array = block_reduce(inarray, factor)
out_wcs = None
if hasattr(image, 'wcs'):
# Compute `out_wcs`
out_wcs = image.wcs.copy()
if hasattr(wcs_out.wcs, 'cd'):
out_wcs.wcs.cdelt = out_wcs.wcs.cdelt * factor
else:
assert np.all(wcs_out.wcs.cdelt == 1.0)
wcs_out.wcs.pc = wcs_out.wcs.pc * factor
wcs_out.wcs.crpix = ((wcs_out.wcs.crpix-0.5)/factor)+0.5
if nddata_like_input:
if hasattr(input, 'copy'):
out = input.copy()
else:
out = deepcopy(out)
out.data = output_array
if out_wcs is not None:
out.wcs = out_wcs
else:
out = output_array
return out
That is, if an image is passed in, you just do the basic thing. You use hasattr(input, 'whatever')
to check for mask
, wcs
, flags
, etc., and treat them like NDData
equivalents. Then you generate an output that's just whatever the input was.
An alternative would be to replace the copying for the output above with some sort of heuristic like:
out=input.__class__(data=output_array, wcs=out_wcs)
but that means any additional information is lost (like meta
, for example)
Oh, and @cdeil, this document is probably a good place to put the pixel convention a la astropy/astropy#2607, right? So maybe that means holding off on the "conventions in astropy" section?
Relevant github issues and links to NDData/CCDData docs are at: https://docs.google.com/document/d/1yscBWegyQbG2fxmZFg5CAtnhS3Hoyl6VTSSApWAsKX8/edit?usp=sharing
It is editable by anyone, so feel free to add any other links you want.
I like @eteq's version.
For discussion, I include an alternate implementation below that uses a recursive function call (it may not be better than @eteq's version and it probably can be improved). Essentially it unpacks the NDData
-like object, calls the function with the nddarray
/wcs inputs and then creates an NDData
-like object as output (all done in one if
block). It has the advantage of not making copies of the inputs (e.g. data image, and optional mask and uncertainty images). [EDIT: Looking again at @eteq's version, I see they aren't copied there either, so this example may not have any advantage.] In this example I've included optional mask
and uncertainty
inputs, even though they do nothing here. Also, I've included a wcs
optional parameter. To use a recursive call, obviously the function needs to have a parameter for every NDData
-like attribute that is used in the function.
def downsample(image, factor, wcs=None, uncertainty=None, mask=None):
"""Downsample image by a given factor.
Parameters
----------
image
... description here ...
factor : int
Downsample factor
wcs : optional
... description here ...
uncertainty : optional
... description here ...
mask : optional
... description here ...
Returns
-------
out : whatever `image` is
Output
out_wcs : WCS
Returned *only* if input `image` is a `ndarray`
"""
from skimage.measure import block_reduce
# the code in this "if" block handles everything with NDData-like input
# need to special-case ndarray, because ndarray *has* a `data` object,
# but it means the actual underlying memory
if hasattr(image, 'data') and not isinstance(image, np.ndarray):
attribs = ['wcs', 'uncertainty', 'mask']
inputs = {}
for attrib in attribs:
if hasattr(image, attrib):
inputs[attrib] = getattr(image, attrib)
else:
inputs[attrib] = None
out_array, out_wcs = downsample(image.data, factor, wcs=inputs['wcs'],
uncertainty=inputs['uncertainty'],
mask=inputs['mask'])
if hasattr(image, 'copy'):
out = image.copy()
else:
out = deepcopy(image)
out.data = out_array
if out_wcs is not None:
out.wcs = out_wcs
return out
# Compute `out_array`
out_array = block_reduce(image, factor)
out_wcs = None
if wcs is not None:
# Compute `out_wcs`
out_wcs = wcs.copy()
if hasattr(out_wcs, 'cd'):
out_wcs.cdelt *= factor
else:
assert np.all(out_wcs.cdelt == 1.0)
out_wcs.pc *= factor
out_wcs.crpix = ((out_wcs.crpix-0.5) / factor) + 0.5
return out_array, out_wcs
@Cadair -- just sent you an invite to the google hangout related to this. These packages (imageutils/photutils/ccdproc) may not get much use by sunpy, but @crawfordsm pointed out today that you do use NDData extensively. Additional discussion of NDData is in #13
@mwcraig thanks a lot I will try and catch up on all this before Thursday. I didn't even know imageutils existed! I have invited a couple of other SunPy folks to the hangout, I hope that is OK.
The more the merrier! I'll send a note on astropy-dev too.
I'd like to suggest a third approach, which is essentially a two-level scheme. On the first level, in the grand numpy tradition, you have a module level function that takes parameters and optional keyword args, something like @larrybradley 's function signature:
def downsample(image_np, factor, wcs=None, uncertainty=None, mask=None):
...
except that the image parameter in particular is a straight numpy array. All other inputs of interest are broken out into args or optional keyword args. At this level the fundamental algorithm is implemented, the unit tests are straightforward, there are not a lot of special tests for attributes, etc. and it provides the building block for all the second level interfaces. It is also the fastest.
At the second level, you have your OO interface, which simply sets up for making a call to the first level function with the appropriate parameters. This is where any object matching the duck-type interface can be used: an nddata object, a ccdproc object, a ginga astroimage object, etc. This can also be a module level function, but I much prefer an approach based on the factory method pattern (http://en.wikipedia.org/wiki/Factory_method_pattern)--in other words, you create a factory object that can downsample images (and other imageutil operations) for you: if you have a ccdproc object coming in, you get a downsampled ccdproc object coming out, if nddata coming in, nddata coming out, etc. The exact details of how this is accomplished are not so important (check the link for ideas), but suffice it to say that there are a number of approaches by which this can be accomplished.
This two-level approach allows a lot of freedom going forward. Number one, it allows the algorithm to be implemented immediately in its most straightforward form at level one and tested, to get it out there. Secondly, it allows the second level implementation to proceed as a more fluid interface that can be refined going forward, and possibly in a versioned way--so that "older" versions of ccdproc objects can be supported along with newer ones.
I've updated the API spec RST file in this pull request ... you can read the rendered html version here.
The main changes are:
I'll try to join today's hangout later today.
One things that seems appealing about @ejeschke's approach is that it factors out effort -- if a package provider has a specific image class that they want to interface with imageutils then the can write a short factory method to do that and then do a PR.
Caveat: I've never implemented a factory method so don't really know how much effort it is or whether it really factors in this way...
We are currently using an approach like @larrybradley's in ccdprco for its rebin
: https://github.com/astropy/ccdproc/blob/master/ccdproc/core.py#L701
A rather late copy of the notes from last week's hangout -- https://docs.google.com/document/d/1ctjhxQdKvlha0yIte45C4SUCV9OqbAvK_IDW9vg9DYU/edit
especially for @eteq and @ejeschke who were not able to be there, but also @crawfordsm @cdeil @astrofrog @larrybradley @keflavich @perrygreenfield
Thanks @mwcraig for writing the minutes!
Just for the record, since my WIFI connection was too bad to participate in the hangout, my suggestion would have been to remove NDData
. Or at least to split the NDData
handling code (even if via decorators) into separate functions (see my comments at https://github.com/astropy/astropy/pull/2855#issuecomment-52602931 and https://github.com/astropy/astropy/pull/2855#issuecomment-52610106 why).
PS: I think the Google doc it's still set to "everyone with the link can edit" so anyone could change the summary to "NDData should be removed". :-) ... maybe you want to change that?
@cdeil -- changed so that anyone can comment; I actually think it would be helpful to add your opinion either as a comment or I can edit the text.
I'll get a chance to write a substantive response to your comments later tonight, I hope...
If you look at the
imageutils
API discussions in #1, #13 and several other issues it's clear that we need an API spec forimageutils
.This is a first attempt for a proposal, to have a concrete basis for discussion. Nothing is set in stone and this is currently not very good... please comment and if you think something should be changed, please say how you'd like it to work in a comment (or better yet, make a pull request against this one containing the change).
We can turn this RST page into an APE later if having an APE is considered useful (or even necessary) to get
imageutils
accepted into the Astropy core asastropy.image
.One point I'd like to stress is that we will not stall
imageutils
development waiting for this discussion to be resolved! If you have useful image utils functions we will merge them now and if needed refactor them later to match the outcome of this API discussion.