esheldon / fitsio

A python package for FITS input/output wrapping cfitsio
GNU General Public License v2.0
133 stars 57 forks source link

Overwrite subset of image #284

Open PhilippaHartley opened 4 years ago

PhilippaHartley commented 4 years ago

Hi,

I would like to overwrite a subset of an existing image so that the new, smaller, image appears as a rectangle of new values. Instead, I find that the new values appear in rows wrapped around the original image, starting at the place of the 'start' pixel.

Example code:

  img=np.zeros((100,100),dtype='i4')
  fitsf = FITS('test.fits','rw')
  fitsf.write(img)
  img2=np.arange(3*3,dtype='i4').reshape(3,3)
  fitsf[0].write(img2, start=[2,2])
  fitsf.close()

Thanks,

Philippa

PhilippaHartley commented 4 years ago

I'm using Python3 and the same behaviour happens on either a scientific linux OS or a mac.

PhilippaHartley commented 4 years ago

I dug into the code and found that, inside image.py, the write function calls self._FITS.write_image(self._ext+1, img_send, offset+1), which in fitsio_pywrap.py is wrapping around the C function 'fits_write_image' which does not have the 'rectangle' functionality. The C function that does have this is called 'fits_write_subset'. So I've written a very hacky extra wrapper function to exploit this and it seems to work. I haven't implemented the expand_if_needed functionality inside it yet. Here are the relevant bits of code that I changed:

in image.py:

def write(self, img,  blc0=5, blc1 = 5, trc0 = 15, trc1 = 15,**keys):
        """
        Write the image into this HDU

        If data already exist in this HDU, they will be overwritten.  If the
        image to write is larger than the image on disk, or if the start
        position is such that the write would extend beyond the existing
        dimensions, the on-disk image is expanded.

        parameters
        ----------
        img: ndarray
            A simple numpy ndarray
        start: integer or sequence
            Where to start writing data.  Can be an integer offset
            into the entire array, or a sequence determining where
            in N-dimensional space to start.
        """

        if not keys:
            import warnings
            warnings.warn(
                "The keyword arguments '%s' are being ignored! This warning "
                "will be an error in a future version of `fitsio`!",
                DeprecationWarning)

        dims = self.get_dims()

        if img.dtype.fields is not None:
            raise ValueError("got recarray, expected regular ndarray")
        if img.size == 0:
            raise ValueError("data must have at least 1 row")

        # data must be c-contiguous and native byte order
        if not img.flags['C_CONTIGUOUS']:
            # this always makes a copy
            img_send = numpy.ascontiguousarray(img)
            array_to_native(img_send, inplace=True)
        else:
            img_send = array_to_native(img, inplace=False)

        if IS_PY3 and img_send.dtype.char == 'U':
            # for python3, we convert unicode to ascii
            # this will error if the character is not in ascii
            img_send = img_send.astype('S', copy=False)

        # see if we need to resize the image
      #  if self.has_data():
      #      self._expand_if_needed(dims, img.shape, start, offset)

        self._FITS.write_subset(self._ext+1, img_send, blc0, blc1, trc0, trc1)

        self._update_info()

in fitsio_pywrap.c:

// write the image to an existing HDU created using create_image_hdu
// dims are not checked
static PyObject *
PyFITSObject_write_subset(struct PyFITSObject* self, PyObject* args) {
    int hdunum=0;
    int hdutype=0;

    PY_LONG_LONG blc0_py=0;
    LONGLONG blc0=0;
    PY_LONG_LONG blc1_py=0;
    LONGLONG blc1=0;
    PY_LONG_LONG trc0_py=0;
    LONGLONG trc0=0;
    PY_LONG_LONG trc1_py=0;
    LONGLONG trc1=0;

    int image_datatype=0; // fits type for image, AKA bitpix
    int datatype=0; // type for the data we entered

    PyObject* array;
    void* data=NULL;
    int npy_dtype=0;
    int status=0;

    fprintf(stderr, "hello1\n");

    if (self->fits == NULL) {
        PyErr_SetString(PyExc_ValueError, "fits file is NULL");
        return NULL;
    }
    fprintf(stderr, "hello\n");
    if (!PyArg_ParseTuple(args, (char*)"iOLLLL", &hdunum, &array, &blc0_py, &blc1_py,  &trc0_py, &trc1_py)) {
        fprintf(stderr, "here\n" );
        return NULL;
    }
    fprintf(stderr, "hello\n");
    if (fits_movabs_hdu(self->fits, hdunum, &hdutype, &status)) {
        set_ioerr_string_from_status(status);
        return NULL;
    }

    if (!PyArray_Check(array)) {
        PyErr_SetString(PyExc_TypeError, "input must be an array.");
        return NULL;
    }
    fprintf(stderr, "hello\n");
    npy_dtype = PyArray_TYPE(array);
    if (npy_to_fits_image_types(npy_dtype, &image_datatype, &datatype)) {
        return NULL;
    }

    data = PyArray_DATA(array);

    blc0 = (LONGLONG) blc0_py;
    blc1 = (LONGLONG) blc1_py;
    trc0 = (LONGLONG) trc0_py;
    trc1 = (LONGLONG) trc1_py;
    if (fits_write_subset(self->fits, datatype, blc0, blc1, trc0, trc1, data, &status)) {
        set_ioerr_string_from_status(status);
        return NULL;
    }
    fprintf(stderr, "hello\n");
    // this is a full file close and reopen
    if (fits_flush_file(self->fits, &status)) {
        set_ioerr_string_from_status(status);
        return NULL;
    }

    Py_RETURN_NONE;
}

in longnam.h, fits_write_subset is already defined as ffpss

in putcol.c:

int ffpss(  fitsfile *fptr,   /* I - FITS file pointer                       */
            int  datatype,    /* I - datatype of the value                   */
            LONGLONG  blc0,     /* I - number of values to write               */
            LONGLONG  blc1,     /* I - number of values to write               */
            LONGLONG  trc0,     /* I - number of values to write               */
            LONGLONG  trc1,     /* I - number of values to write               */
            void *array,      /* I - array of values that are written        */
            int  *status)     /* IO - error status                           */
/*
  Write a section of values to the primary array. The datatype of the
  input array is defined by the 2nd argument.  Data conversion
  and scaling will be performed if necessary (e.g, if the datatype of
  the FITS array is not the same as the array being written).

  This routine supports writing to large images with
  more than 2**31 pixels.
*/

{
    long blc[2];
    long trc[2];

    blc[0] = blc0;
    blc[1] = blc1;
    trc[0] = trc0;
    trc[1] = trc1;

    int naxis;
    long naxes[9];

    if (*status > 0)   /* inherit input status value if > 0 */
        return(*status);

    /* get the size of the image */
    ffgidm(fptr, &naxis, status);
    ffgisz(fptr, 9, naxes, status);

    if (datatype == TBYTE)
    {
        ffpssb(fptr, 1, naxis, naxes, blc, trc,
               (unsigned char *) array, status);
    }
    else if (datatype == TSBYTE)
    {
        ffpsssb(fptr, 1, naxis, naxes, blc, trc,
               (signed char *) array, status);
    }
    else if (datatype == TUSHORT)
    {
        ffpssui(fptr, 1, naxis, naxes, blc, trc,
               (unsigned short *) array, status);
    }
    else if (datatype == TSHORT)
    {
        ffpssi(fptr, 1, naxis, naxes, blc, trc,
               (short *) array, status);
    }
    else if (datatype == TUINT)
    {
        ffpssuk(fptr, 1, naxis, naxes, blc, trc,
               (unsigned int *) array, status);
    }
    else if (datatype == TINT)
    {
        ffpssk(fptr, 1, naxis, naxes, blc, trc,
               (int *) array, status);
    }
    else if (datatype == TULONG)
    {
        ffpssuj(fptr, 1, naxis, naxes, blc, trc,
               (unsigned long *) array, status);
    }
    else if (datatype == TLONG)
    {
        ffpssj(fptr, 1, naxis, naxes, blc, trc,
               (long *) array, status);
    }
    else if (datatype == TULONGLONG)
    {
        ffpssujj(fptr, 1, naxis, naxes, blc, trc,
               (ULONGLONG *) array, status);
    }
    else if (datatype == TLONGLONG)
    {
        ffpssjj(fptr, 1, naxis, naxes, blc, trc,
               (LONGLONG *) array, status);
    }    
    else if (datatype == TFLOAT)
    {
        ffpsse(fptr, 1, naxis, naxes, blc, trc,
               (float *) array, status);
    }
    else if (datatype == TDOUBLE)
    {
        ffpssd(fptr, 1, naxis, naxes, blc, trc,
               (double *) array, status);
    }
    else
      *status = BAD_DATATYPE;

    return(*status);
}

and then this can be used like this example:

  img=np.arange(10*10,dtype='i8').reshape(10,10) 
  fitsf = FITS('test.fits','rw') 
  fitsf.write(img)
  img2 =np.arange(3*3,dtype='i8').reshape(3,3)
  fitsf[0].write(img2, blc0=5, blc1 = 5, trc0 = 7, trc1 = 7)
  print (fitsf[0]) 
  fitsf.close()

Apologies for it being hacky - my first foray into C!

beckermr commented 4 years ago

Should we be adding this patch to the main source code?

PhilippaHartley commented 4 years ago

I think it would be good to get the expand_if_needed functionality working first (since the reason for writing rectangles to file would likely usually be to avoid keeping a large numpy array in memory - so the expand function is really nice since it means you don't have to create the huge array in memory. At least I hope that's how it works!), so it might be best to wait for that? I was going to try to get it implemented, so I could send it all over if I do.

So, maybe keep it open for now, thanks

esheldon commented 3 years ago

@PhilippaHartley and @beckermr Where do we stand on this one ?

beckermr commented 3 years ago

I'm not sure. I cannot tell if we need simply update the wrapper or if we need changes to the C code. It seems like if we could convert the smaller image to a set of 1d ranges, then we could handle this in python by calling the write function we have more than once.

PhilippaHartley commented 3 years ago

I did get the expand_if_needed functionality working for 3d data. I can't remember how now but I have the source code if you would like to compare versions? I can look into this more but not for a few weeks.

esheldon commented 3 years ago

cfitsio does expand if needed, by default. Is this something special for the cases you are thinking of?

PhilippaHartley commented 3 years ago

I think it was that I needed to implement expand_if_needed inside this subset-overwriting feature.

PhilippaHartley commented 3 years ago

Maybe I had just commented it out while working on the subset writing bit. but it's back in now in my version.