astropy / imageutils

Image processing utilities for Astropy (No longer maintained)
https://imageutils.readthedocs.org/

WIP: API spec for imageutils #18

Closed cdeil closed 4 years ago

cdeil commented 9 years ago

If you look at the imageutils API discussions in #1, #13 and several other issues it's clear that we need an API spec for imageutils.

This is a first attempt for a proposal, to have a concrete basis for discussion. Nothing is set in stone and this is currently not very good... please comment and if you think something should be changed, please say how you'd like it to work in a comment (or better yet, make a pull request against this one containing the change).

We can turn this RST page into an APE later if having an APE is considered useful (or even necessary) to get imageutils accepted into the Astropy core as astropy.image.

One point I'd like to stress is that we will not stall imageutils development waiting for this discussion to be resolved! If you have useful image utils functions we will merge them now and if needed refactor them later to match the outcome of this API discussion.

eteq commented 9 years ago

@cdeil - I'm not a fan of the definition of image (which sort of carries down to the rest), because of the (data, wcs) tuple option. So I'd like to propose an alternative, more in line with the duck-typing approach that @astrofrog, @ejeschke, and I were suggesting in #13 (and I think also compatible with @mwcraig's request?):

An image is either:

  1. A numpy array
  2. Something that has the same attributes as NDData.

Practically speaking, that leads to the following as implementation:

def downsample(image, factor):
    """Downsample image by a given factor.

    Parameters
    ----------
    image
        ... description here ...
    factor : int
        Downsample factor

    Returns
    -------
    out : whatever `image` is
        Output
    """
    from copy import deepcopy

    import numpy as np
    from skimage.measure import block_reduce

    nddata_like_input = False

    # Need to special-case ndarray, because ndarray *has* a `data` attribute,
    # but it means the actual underlying memory.
    if hasattr(image, 'data') and not isinstance(image, np.ndarray):  
        inarray = image.data
        nddata_like_input = True
    else:
        inarray = image

    # Compute `out_array`
    out_array = block_reduce(inarray, factor)

    out_wcs = None
    if hasattr(image, 'wcs'):
        # Compute `out_wcs`
        out_wcs = image.wcs.copy()
        if hasattr(out_wcs.wcs, 'cd'):
            out_wcs.wcs.cdelt = out_wcs.wcs.cdelt * factor
        else:
            assert np.all(out_wcs.wcs.cdelt == 1.0)
            out_wcs.wcs.pc = out_wcs.wcs.pc * factor
            out_wcs.wcs.crpix = ((out_wcs.wcs.crpix - 0.5) / factor) + 0.5

    if nddata_like_input:
        if hasattr(image, 'copy'):
            out = image.copy()
        else:
            out = deepcopy(image)

        out.data = out_array
        if out_wcs is not None:
            out.wcs = out_wcs

    else:
        out = out_array

    return out

That is, whatever kind of image is passed in, you just do the basic thing: use hasattr(image, 'whatever') to check for mask, wcs, flags, etc., and treat them like their NDData equivalents. Then you generate an output that's just whatever the input was.

An alternative would be to replace the copying for the output above with some sort of heuristic like:

out = image.__class__(data=out_array, wcs=out_wcs)

but that means any additional information is lost (like meta, for example).

eteq commented 9 years ago

Oh, and @cdeil, this document is probably a good place to put the pixel convention a la astropy/astropy#2607, right? So maybe that means holding off on the "conventions in astropy" section?

mwcraig commented 9 years ago

Relevant github issues and links to NDData/CCDData docs are at: https://docs.google.com/document/d/1yscBWegyQbG2fxmZFg5CAtnhS3Hoyl6VTSSApWAsKX8/edit?usp=sharing

It is editable by anyone, so feel free to add any other links you want.

larrybradley commented 9 years ago

I like @eteq's version.

For discussion, I include an alternate implementation below that uses a recursive function call (it may not be better than @eteq's version, and it probably can be improved). Essentially it unpacks the NDData-like object, calls the function with the ndarray/wcs inputs, and then creates an NDData-like object as output (all done in one if block). It has the advantage of not making copies of the inputs (e.g. the data image and the optional mask and uncertainty images). [EDIT: Looking again at @eteq's version, I see they aren't copied there either, so this example may not have any advantage.] In this example I've included optional mask and uncertainty inputs, even though they do nothing here. Also, I've included an optional wcs parameter. To use a recursive call, the function obviously needs to have a parameter for every NDData-like attribute that is used in the function.

def downsample(image, factor, wcs=None, uncertainty=None, mask=None):
    """Downsample image by a given factor.

    Parameters
    ----------
    image
        ... description here ...
    factor : int
        Downsample factor
    wcs : optional
        ... description here ...
    uncertainty : optional
        ... description here ...
    mask : optional
        ... description here ...

    Returns
    -------
    out : whatever `image` is
        Output
    out_wcs : WCS
        Returned *only* if the input `image` is an `ndarray`
    """

    from copy import deepcopy

    import numpy as np
    from skimage.measure import block_reduce

    # the code in this "if" block handles everything with NDData-like input
    # Need to special-case ndarray, because ndarray *has* a `data` attribute,
    # but it means the actual underlying memory.
    if hasattr(image, 'data') and not isinstance(image, np.ndarray):
        attribs = ['wcs', 'uncertainty', 'mask']
        inputs = {}
        for attrib in attribs:
            if hasattr(image, attrib):
                inputs[attrib] = getattr(image, attrib)
            else:
                inputs[attrib] = None
        out_array, out_wcs = downsample(image.data, factor, wcs=inputs['wcs'],
                                        uncertainty=inputs['uncertainty'],
                                        mask=inputs['mask'])

        if hasattr(image, 'copy'):
            out = image.copy()
        else:
            out = deepcopy(image)
        out.data = out_array
        if out_wcs is not None:
            out.wcs = out_wcs
        return out

    # Compute `out_array`
    out_array = block_reduce(image, factor)

    out_wcs = None
    if wcs is not None:
        # Compute `out_wcs`
        out_wcs = wcs.copy()
        if hasattr(out_wcs.wcs, 'cd'):
            out_wcs.wcs.cdelt = out_wcs.wcs.cdelt * factor
        else:
            assert np.all(out_wcs.wcs.cdelt == 1.0)
            out_wcs.wcs.pc = out_wcs.wcs.pc * factor
            out_wcs.wcs.crpix = ((out_wcs.wcs.crpix - 0.5) / factor) + 0.5

    return out_array, out_wcs

mwcraig commented 9 years ago

@Cadair -- just sent you an invite to the google hangout related to this. These packages (imageutils/photutils/ccdproc) may not get much use by sunpy, but @crawfordsm pointed out today that you do use NDData extensively. Additional discussion of NDData is in #13

Cadair commented 9 years ago

@mwcraig thanks a lot, I will try to catch up on all this before Thursday. I didn't even know imageutils existed! I have invited a couple of other SunPy folks to the hangout, I hope that is OK.

mwcraig commented 9 years ago

The more the merrier! I'll send a note on astropy-dev too.

ejeschke commented 9 years ago

I'd like to suggest a third approach, which is essentially a two-level scheme. On the first level, in the grand numpy tradition, you have a module-level function that takes parameters and optional keyword args, something like @larrybradley's function signature:

def downsample(image_np, factor, wcs=None, uncertainty=None, mask=None):
    ...

except that the image parameter in particular is a straight numpy array. All other inputs of interest are broken out into args or optional keyword args. At this level the fundamental algorithm is implemented, the unit tests are straightforward, there are not a lot of special tests for attributes, etc., and it provides the building block for all the second-level interfaces. It is also the fastest.
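
To make this concrete, here is a minimal sketch of what such a level-one function might look like (the name `downsample_array` and the details are placeholders for illustration, not a proposed API; it reuses `skimage.measure.block_reduce` as in the examples above):

import numpy as np
from skimage.measure import block_reduce

def downsample_array(image_np, factor):
    """Downsample a plain 2-d numpy array by an integer factor.

    No NDData handling and no attribute checks -- just the core algorithm,
    which keeps the unit tests simple and fast.
    """
    image_np = np.asarray(image_np)
    return block_reduce(image_np, (factor, factor))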

At the second level, you have your OO interface, which simply sets up a call to the first-level function with the appropriate parameters. This is where any object matching the duck-type interface can be used: an nddata object, a ccdproc object, a ginga astroimage object, etc. This can also be a module-level function, but I much prefer an approach based on the factory method pattern (http://en.wikipedia.org/wiki/Factory_method_pattern) -- in other words, you create a factory object that can downsample images (and perform other imageutils operations) for you: if a ccdproc object comes in, you get a downsampled ccdproc object out; if nddata comes in, nddata comes out; etc. The exact details of how this is accomplished are not so important (check the link for ideas); suffice it to say that there are a number of ways to do it. A rough sketch is given below.
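
As a rough, hypothetical illustration of that second level (none of these names are a proposed API, and the unwrap/rewrap details would need real design work), a factory could keep a registry of handlers and dispatch on the input type, building on the `downsample_array` sketch above:

class ImageOpFactory(object):
    """Sketch of a second-level factory: same kind of object out as in."""

    def __init__(self):
        # maps a class to a pair of callables: (unwrap, rewrap)
        self._handlers = {}

    def register(self, cls, unwrap, rewrap):
        """Teach the factory how to get a numpy array out of `cls`
        and how to build a new `cls` instance from a result array."""
        self._handlers[cls] = (unwrap, rewrap)

    def downsample(self, image, factor):
        for cls, (unwrap, rewrap) in self._handlers.items():
            if isinstance(image, cls):
                out_array = downsample_array(unwrap(image), factor)
                return rewrap(image, out_array)
        # no handler registered: treat the input as a plain array
        return downsample_array(image, factor)

# Hypothetical registration for an NDData-like class, e.g. something a ccdproc
# or ginga maintainer could contribute (WCS scaling omitted for brevity):
#
#     factory = ImageOpFactory()
#     factory.register(CCDData,
#                      unwrap=lambda im: im.data,
#                      rewrap=lambda im, arr: CCDData(arr, wcs=im.wcs,
#                                                     meta=im.meta, unit=im.unit))
#     new_ccd = factory.downsample(ccd, 2)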

This two-level approach allows a lot of freedom going forward. First, it allows the algorithm to be implemented and tested immediately, in its most straightforward form, at level one, to get it out there. Second, it allows the second-level implementation to proceed as a more fluid interface that can be refined going forward, possibly in a versioned way -- so that "older" versions of ccdproc objects can be supported along with newer ones.

cdeil commented 9 years ago

I've updated the API spec RST file in this pull request ... you can read the rendered html version here.

The main changes are:

  1. Updated the original proposal
  2. Added @eteq's proposal from the comment above as proposal 4
  3. Added @larrybradley's proposal from the comment above as proposal 5
  4. Added some of the questions I have to the "to be discussed" section at the end.

I'll try to join the hangout later today.

mwcraig commented 9 years ago

One thing that seems appealing about @ejeschke's approach is that it factors out effort -- if a package provider has a specific image class that they want to interface with imageutils, they can write a short factory method to do that and then open a PR.

Caveat: I've never implemented a factory method so don't really know how much effort it is or whether it really factors in this way...

mwcraig commented 9 years ago

We are currently using an approach like @larrybradley's in ccdproc for its rebin: https://github.com/astropy/ccdproc/blob/master/ccdproc/core.py#L701

mwcraig commented 9 years ago

A rather late copy of the notes from last week's hangout -- https://docs.google.com/document/d/1ctjhxQdKvlha0yIte45C4SUCV9OqbAvK_IDW9vg9DYU/edit

especially for @eteq and @ejeschke who were not able to be there, but also @crawfordsm @cdeil @astrofrog @larrybradley @keflavich @perrygreenfield

cdeil commented 9 years ago

Thanks @mwcraig for writing the minutes!

Just for the record, since my WiFi connection was too bad to participate in the hangout, my suggestion would have been to remove the NDData handling entirely. Or at least to split the NDData handling code (even if via decorators) into separate functions (see my comments at https://github.com/astropy/astropy/pull/2855#issuecomment-52602931 and https://github.com/astropy/astropy/pull/2855#issuecomment-52610106 for why). A sketch of the decorator idea is below.
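
For illustration only, such a decorator might look roughly like this (the name `support_nddata` and the attribute handling are placeholder assumptions, not an existing astropy or imageutils API; only the data attribute is handled here):

from copy import deepcopy
import functools

import numpy as np

def support_nddata(func):
    """Let a plain-array function accept NDData-like objects too.

    The wrapped function only ever sees a numpy array; the unwrapping and
    rewrapping of NDData-like inputs lives in one place instead of being
    repeated in every imageutils function.
    """
    @functools.wraps(func)
    def wrapper(image, *args, **kwargs):
        # ndarray also has a `data` attribute, so exclude it explicitly
        if hasattr(image, 'data') and not isinstance(image, np.ndarray):
            result = func(image.data, *args, **kwargs)
            out = image.copy() if hasattr(image, 'copy') else deepcopy(image)
            out.data = result
            return out
        return func(image, *args, **kwargs)
    return wrapper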

PS: I think the Google doc is still set to "everyone with the link can edit", so anyone could change the summary to "NDData should be removed". :-) ... maybe you want to change that?

mwcraig commented 9 years ago

@cdeil -- changed so that anyone can comment; I actually think it would be helpful to add your opinion either as a comment or I can edit the text.

I'll get a chance to write a substantive response to your comments later tonight, I hope...