astropy / reproject

Python-based Astronomical image reprojection :milky_way: - maintainer @astrofrog
https://reproject.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Make sure reprojection routines work with large images #37

Open astrofrog opened 10 years ago

astrofrog commented 10 years ago

When reprojecting large images, we may run into memory issues storing the corners or centers of all pixels, so we should have a way to reproject chunk by chunk (for all reprojection methods).
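
For scale, a rough back-of-the-envelope sketch of the problem (the image size here is hypothetical): just materializing the pixel-center coordinate grids for a large image takes two float64 arrays the size of the image, before any reprojection work happens.

import numpy as np

# Hypothetical 100000 x 100000 image
ny, nx = 100_000, 100_000

# Pixel-center coordinate grids alone: two float64 arrays of shape (ny, nx)
bytes_needed = 2 * ny * nx * np.dtype(float).itemsize
print(bytes_needed / 1e9, 'GB')  # 160.0 GB, before any interpolation buffers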

cdeil commented 10 years ago

Processing arrays in chunks is a common issue ... I wonder if view_as_blocks would work efficiently here, or, if not, whether we could put together a utility that can be reused elsewhere.
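
For reference, a minimal sketch of the scikit-image helper this presumably refers to (skimage.util.view_as_blocks; the array and block sizes below are made up). Note that it requires the block shape to evenly divide the array shape, which is a real limitation for arbitrary image sizes:

import numpy as np
from skimage.util import view_as_blocks

# Toy 4000 x 6000 array split into 1000 x 1000 blocks (a view, not a copy)
image = np.zeros((4000, 6000))
blocks = view_as_blocks(image, block_shape=(1000, 1000))
print(blocks.shape)  # (4, 6, 1000, 1000)

# blocks[i, j] is a 1000 x 1000 view into the original array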

astrofrog commented 10 years ago

@cdeil - that looks like it could indeed help!

astrofrog commented 7 years ago

I think we should figure out a solution for this. One easy way would be to implement it in the high-level interpolation reprojection routine: produce a different WCS object (with updated CRPIX values) and a separate shape_out for each chunk, then call _reproject_celestial and _reproject_full multiple times. We could even do this in parallel if it looks like it would speed things up.

astrofrog commented 7 years ago

Here's a very basic form of chunking:

import numpy as np

from astropy.extern import six

from reproject.utils import parse_input_data, parse_output_projection
from reproject.interpolation.core_celestial import _reproject_celestial
from reproject.interpolation.high_level import ORDER

def reproject_interp_chunk_2d(input_data, output_projection, shape_out=None, hdu_in=0,
                              order='bilinear', blocks=(1000, 1000)):
    """
    For a 2D image, reproject in chunks
    """

    array_in, wcs_in = parse_input_data(input_data, hdu_in=hdu_in)
    wcs_out, shape_out = parse_output_projection(output_projection, shape_out=shape_out)

    if isinstance(order, six.string_types):
        order = ORDER[order]

    # Create output arrays
    array = np.zeros(shape_out, dtype=float)
    footprint = np.zeros(shape_out, dtype=float)

    # Iterate over the output image in blocks, reprojecting each separately
    for imin in range(0, array.shape[0], blocks[0]):
        imax = min(imin + blocks[0], array.shape[0])
        for jmin in range(0, array.shape[1], blocks[1]):
            jmax = min(jmin + blocks[1], array.shape[1])
            shape_out_sub = (imax - imin, jmax - jmin)
            # Shift CRPIX so the copied WCS describes just this block;
            # CRPIX is (x, y) ordered, i.e. (column, row)
            wcs_out_sub = wcs_out.deepcopy()
            wcs_out_sub.wcs.crpix[0] -= jmin
            wcs_out_sub.wcs.crpix[1] -= imin
            array_sub, footprint_sub = _reproject_celestial(array_in, wcs_in, wcs_out_sub,
                                                            shape_out=shape_out_sub, order=order)
            array[imin:imax, jmin:jmax] = array_sub
            footprint[imin:imax, jmin:jmax] = footprint_sub

    return array, footprint

if __name__ == '__main__':

    from astropy.io import fits

    hdu_in = fits.open('/Users/tom/Data/Images/2MASS_j.fits')[0]
    hdu_out = fits.open('/Users/tom/Data/Images/MSX_E.fits')[0]

    array, footprint = reproject_interp_chunk_2d(hdu_in, hdu_out.header, blocks=(1000, 1000))

    fits.writeto('array.fits', array, overwrite=True)
    fits.writeto('footprint.fits', footprint, overwrite=True)

@mhardcastle - could you try this out to see if it helps alleviate your issues with the very large image?

If this works, we can then implement this as an option in the proper reproject_interp, generalize to n-dimensions, and optionally parallelize.

mhardcastle commented 7 years ago

Sorry for the delay in checking this. I can confirm that it does work and essentially solves the problem with the single large image. Parallelizing this is very easy with multiprocessing.Pool:

from multiprocessing import Pool

def reproject_worker(a):
    # Unpack one chunk's arguments, reproject it, and return it with its indices
    imin, imax, jmin, jmax, array_in, wcs_in, wcs_out_sub, shape_out_sub, order = a
    print('worker:', imin, jmin)
    array_sub, footprint_sub = _reproject_celestial(array_in, wcs_in, wcs_out_sub,
                                                    shape_out=shape_out_sub, order=order)
    return imin, imax, jmin, jmax, array_sub, footprint_sub

def reproject_interp_chunk_2d_multi(input_data, output_projection, shape_out=None, hdu_in=0,
                                    order='bilinear', blocks=(1000, 1000)):
    """
    For a 2D image, reproject in chunks using a process pool
    """

    array_in, wcs_in = parse_input_data(input_data, hdu_in=hdu_in)
    wcs_out, shape_out = parse_output_projection(output_projection, shape_out=shape_out)

    if isinstance(order, six.string_types):
        order = ORDER[order]

    # Create output arrays
    array = np.zeros(shape_out, dtype=float)
    footprint = np.zeros(shape_out, dtype=float)

    # Build the argument list for the pool, one entry per chunk
    args = []
    print('chunking for pool')
    for imin in range(0, array.shape[0], blocks[0]):
        imax = min(imin + blocks[0], array.shape[0])
        for jmin in range(0, array.shape[1], blocks[1]):
            print(imin, jmin)
            jmax = min(jmin + blocks[1], array.shape[1])
            shape_out_sub = (imax - imin, jmax - jmin)
            wcs_out_sub = wcs_out.deepcopy()
            wcs_out_sub.wcs.crpix[0] -= jmin
            wcs_out_sub.wcs.crpix[1] -= imin
            args.append((imin, imax, jmin, jmax, array_in, wcs_in, wcs_out_sub,
                         shape_out_sub, order))

    print('pooling')
    p = Pool(16)
    results = p.map(reproject_worker, args)
    print('assembling final image')
    for r in results:
        imin, imax, jmin, jmax, array_sub, footprint_sub = r
        array[imin:imax, jmin:jmax] = array_sub
        footprint[imin:imax, jmin:jmax] = footprint_sub

    return array, footprint

astrofrog commented 7 years ago

@mhardcastle - thanks for confirming this! I'll try and get a PR open with something like this, and maybe automatically chunk things above a certain size.

mhardcastle commented 7 years ago

Great!

Just to note that while the multiprocessing code I posted works, it doesn't give a significant speedup -- I guess because of some locking issue. I've seen this sort of thing before and will see if I can find a workaround. But for the moment, the specific problem I was dealing with is unblocked again.
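
One hedged guess at where the time goes, with a possible workaround (the names below are hypothetical, and this assumes the cost is pickling the full input array for every task rather than an actual lock): a Pool initializer can hand each worker the input once, so only the small per-chunk arguments get serialized on each map call.

from multiprocessing import Pool

# Hypothetical shared state, populated once per worker via the initializer
_shared = {}

def _init_worker(array_in, wcs_in):
    _shared['array_in'] = array_in
    _shared['wcs_in'] = wcs_in

def reproject_worker_shared(a):
    # Per-chunk arguments no longer include array_in / wcs_in;
    # _reproject_celestial is imported as in the snippets above
    imin, imax, jmin, jmax, wcs_out_sub, shape_out_sub, order = a
    array_sub, footprint_sub = _reproject_celestial(
        _shared['array_in'], _shared['wcs_in'], wcs_out_sub,
        shape_out=shape_out_sub, order=order)
    return imin, imax, jmin, jmax, array_sub, footprint_sub

# Usage, replacing the Pool(16) call above (args built without array_in/wcs_in):
# p = Pool(16, initializer=_init_worker, initargs=(array_in, wcs_in))
# results = p.map(reproject_worker_shared, args)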

AlistairSymonds commented 4 years ago

Just came across this issue when trying to put together a silly large widefield mosaic, on which even memory-mapped arrays failed halfway through the pix2pix round trip.

I basically just shimmed a slightly modified reproject_chunking in between the two functions currently used for reproject_interp. If there's still interest I can clean it up a bit more and do a proper pull request. See: https://github.com/AlistairSymonds/reproject/commit/d82b56b0bebc568caf682f510e67d3f933aad3da

I think I would move all the memory-mapped array and footprint generation code out of _reproject_full and into the chunking function. Then, if the output can be chunked to arbitrarily small sizes, the actual reprojection could happen entirely in memory. Critiques/guidance welcome; my experience is in general image processing rather than astronomy, so I don't want to miss any gotchas.
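
For illustration, a minimal sketch of what keeping only the outputs memory-mapped might look like (file names and the output shape are made up); each chunk is then reprojected entirely in memory and just written into the mapped outputs:

import numpy as np

shape_out = (100_000, 200_000)  # hypothetical very large output

# Outputs live on disk; only the chunk currently being written is touched
array = np.memmap('array.dat', dtype=float, mode='w+', shape=shape_out)
footprint = np.memmap('footprint.dat', dtype=float, mode='w+', shape=shape_out)

# Inside the chunk loop, assignments like
#     array[imin:imax, jmin:jmax] = array_sub
# work unchanged, since memmaps support the same slicing as ndarrays.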

(@astrofrog I hope you don't mind this ping; I noticed this issue is a little old and don't want it to slip by)

quick edit: would it be better to create a generic chunking function that can wrap all the current reprojection methods?

astrofrog commented 4 years ago

Thanks for the ping!

quick edit: would it be better to create a generic chunking function that can wrap all the current reprojection methods?

Yes I think that might be the best approach - would you have any time to investigate this?

AlistairSymonds commented 4 years ago

Definitely going to give it a shot. I'm just trying to think about how to handle memory-mapped arrays, since that way memory-mapped and multiprocessing support could be added to all the functions.

array_out, footprint_out = reproject_blocks(reproject_func, block_size=(4000, 4000), array_out=None, footprint_out=None, parallel=0, **kwargs)

where kwargs are passed straight through to the given reproject_func, footprint_out and array_out are used for passing memory-mapped arrays in, the default block_size would probably be determined by running a few benchmarks, and parallel is the number of processes for a ProcessPoolExecutor or similar.
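
A minimal sketch of what the serial core of that wrapper could look like (the body below is my own sketch under the assumptions above, not the actual implementation; the parallel path is omitted, and reproject_func stands in for any of the existing per-algorithm functions):

import numpy as np

def reproject_blocks(reproject_func, input_data, wcs_out, shape_out,
                     block_size=(4000, 4000), array_out=None,
                     footprint_out=None, **kwargs):
    # Callers can pass pre-allocated (possibly memory-mapped) outputs
    if array_out is None:
        array_out = np.zeros(shape_out, dtype=float)
    if footprint_out is None:
        footprint_out = np.zeros(shape_out, dtype=float)

    for imin in range(0, shape_out[0], block_size[0]):
        imax = min(imin + block_size[0], shape_out[0])
        for jmin in range(0, shape_out[1], block_size[1]):
            jmax = min(jmin + block_size[1], shape_out[1])
            # Shift CRPIX so the sub-WCS describes just this block
            wcs_sub = wcs_out.deepcopy()
            wcs_sub.wcs.crpix[0] -= jmin
            wcs_sub.wcs.crpix[1] -= imin
            a_sub, f_sub = reproject_func(input_data, wcs_sub,
                                          shape_out=(imax - imin, jmax - jmin),
                                          **kwargs)
            array_out[imin:imax, jmin:jmax] = a_sub
            footprint_out[imin:imax, jmin:jmax] = f_sub

    return array_out, footprint_out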

Another edit: it's pretty easy to wrap the non-healpix functions in that, but healpix may need a bit more work to get done cleanly.

AlistairSymonds commented 4 years ago

I've done a version for the non-healpix functions in the utils package; first off, I'm not sure if that's the correct place, but it makes sense to me:

https://github.com/AlistairSymonds/reproject/blob/dev_reproject_block/reproject/utils.py

It takes the selected function as an argument and then unpacks the other arguments as needed; critiques are very welcome, as most of my Python code has been 'research quality' rather than 'library quality'. Additionally, I haven't written any formal tests (partly because the utils package doesn't seem to have an area for them), only quick and dirty stuff with the file I accidentally committed to my branch and the larger mosaicking project I linked earlier.

Multiprocessing is also something I've only implemented and tested with regard to correctness: I can confirm it works with memmapped numpy arrays and produces the right output, but I'm not sure about the impact of pickling and unpickling the arrays, or any contention issues (I only have a quad core to test with, too).

AlistairSymonds commented 4 years ago

@astrofrog I'll give you another ping because this is pretty much done functionality-wise: I didn't run out of memory when dealing with 100k * 200k output projections, and the multiprocessing code scales roughly in line with the number of processors.

Another idea I've had is rasterising to detect empty blocks, so we don't waste time reprojecting those areas. This is especially relevant for me since I'm trying to reproject many small images onto one large output plane, so a lot of time is wasted on empty blocks and pixels in the output. The steps (a rough code sketch follows the list):

  1. Transform the corner pixels of the input image from input to output pixel locations (astropy's pixel_to_skycoord and skycoord_to_pixel)
  2. Get the XY bounding box of those locations
  3. Iterate over the blocks which lie in that bounding box, adding any which contain output pixels in any of their four corners to a list of blocks to process
  4. Perform the reprojection for all of said blocks.
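
A rough sketch of steps 1-3 (the function name and structure are my own; this assumes the input footprint is well approximated by the bounding box of its corners, which breaks down for projections that wrap around or distort strongly near the poles):

import numpy as np
from astropy.wcs.utils import pixel_to_skycoord, skycoord_to_pixel

def blocks_touched_by_input(wcs_in, shape_in, wcs_out, shape_out, block_size):
    # Step 1: input corner pixels -> sky -> output pixel coordinates
    ny, nx = shape_in
    corners_x = np.array([0, nx - 1, 0, nx - 1])
    corners_y = np.array([0, 0, ny - 1, ny - 1])
    sky = pixel_to_skycoord(corners_x, corners_y, wcs_in)
    x_out, y_out = skycoord_to_pixel(sky, wcs_out)

    # Step 2: bounding box in output pixel space, clipped to the output grid
    xmin = max(int(np.floor(x_out.min())), 0)
    xmax = min(int(np.ceil(x_out.max())), shape_out[1] - 1)
    ymin = max(int(np.floor(y_out.min())), 0)
    ymax = min(int(np.ceil(y_out.max())), shape_out[0] - 1)

    # Step 3: keep only blocks that overlap the bounding box
    blocks = []
    for imin in range(0, shape_out[0], block_size[0]):
        for jmin in range(0, shape_out[1], block_size[1]):
            imax = min(imin + block_size[0], shape_out[0])
            jmax = min(jmin + block_size[1], shape_out[1])
            if imax > ymin and imin <= ymax and jmax > xmin and jmin <= xmax:
                blocks.append((imin, imax, jmin, jmax))
    return blocks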

Do you see any blatant issues with that? I'm not super versed in astropy.

Dealing with the slightly different args used by each of the reproject functions is the tough part imo. I've got them being passed in as kwargs, but the functions themselves don't accept **kwargs, so a big manual unpacking statement might be unavoidable.

Projecting from healpix worked, but I can't think of a good way to project to healpix without special-case unpacking depending on which function was passed in, which kind of ruins some of the genericness of the wrapper.

Then, in terms of tests/docs, should it stay in utils or be moved to its own file (possibly under a reproject_blocked folder)?

benchmark script: https://github.com/AlistairSymonds/astro-project-patchwork/blob/master/reproject_benchmarking.py

Once again, any criticism is more than welcome :)

edit: I also just noticed the current code still has some dodgy debug print statements in it, so if you're looking at something in confusion, that's probably why.

astrofrog commented 4 years ago

@AlistairSymonds - thanks for working on this! I'll try and take a look very shortly.

astrofrog commented 4 years ago

@AlistairSymonds - could you open a pull request from your branch to this repository as it will make it easier to review the code and leave comments? Thanks!