kevin218 / Eureka

Eureka! is a data reduction and analysis pipeline intended for time-series observations with JWST.
https://eurekadocs.readthedocs.io/
MIT License
60 stars 47 forks source link

Add Fig 3106, 3304, 3401 to docs/media + minor changes #450

Closed kevin218 closed 2 years ago

kevin218 commented 2 years ago

In addition to adding/replacing figures in docs/median, this PR includes minor updates based on feedback from #441.

taylorbell57 commented 2 years ago

I'm working on Sebastian's PR right now so it'll be a little bit (~1 day maybe) before I can review this, but some other things I've noticed while trying to use the new clean_median_flux function:

  1. I want to be able to set in the median_thresh parameter in the ECF
  2. I want to be able to try without smoothing, so I suggest skipping the smoothing related steps if window_len <= 1 or window_len is None
  3. I find it is important to smooth over masked pixels from the DQ array before smoothing and interpolating because some part of the smoothing/interpolating code does not appear to be aware of masked arrays and ignores the mask.

The local version of the code that I have come up with so far is:

def clean_median_flux(data, meta, log):
    """Computes a median flux frame that is free of bad pixels.

    Parameters
    ----------
    data : Xarray Dataset
        The Dataset object.
    meta : eureka.lib.readECF.MetaClass
        The metadata object.
    log : logedit.Logedit
        The current log.

    Returns
    -------
    data : Xarray Dataset
        The updated Dataset object.

    Notes
    -----
    History:

    - 2022-08-03, KBS
        Inital version
    """
    log.writelog('  Computing clean median frame...', mute=(not meta.verbose))

    # Create mask using all data quality flags
    mask = np.copy(data.mask.values)
    mask[data.dq.values > 0] = 0

    # Compute median flux using masked arrays
    flux_ma = np.ma.masked_where(mask == 0, data.flux.values)
    medflux = np.ma.median(flux_ma, axis=0)
    ny, nx = medflux.shape

    # Interpolate over masked pixels
    clean_med = np.zeros((ny, nx))
    xx = np.arange(nx)
    for j in range(ny):
        x1 = xx[~medflux.mask[j]]
        goodmed = medflux[j][~medflux.mask[j]]
        f = spi.interp1d(x1, goodmed, 'linear', fill_value='extrapolate')
        # f = spi.UnivariateSpline(x1, goodmed, k=1, s=None)
        clean_med[j] = f(xx)
    medflux = np.ma.masked_array(clean_med)

    # Apply smoothing filter
    if meta.window_len > 1:
        smoothflux = np.zeros((ny, nx))
        for j in range(ny):
            smoothflux[j] = smooth.medfilt(medflux.data[j], meta.window_len)

        # Compute residuals
        residuals = medflux - smoothflux

        # Flag outliers
        if not hasattr(meta, 'median_thresh'):
            meta.median_thresh = 5
        outliers = sigma_clip(residuals, sigma=meta.median_thresh, maxiters=5,
                              axis=1, cenfunc=np.ma.median, 
                              stdfunc=np.ma.std).mask

        # Interpolate over bad pixels
        clean_med = np.zeros((ny, nx))
        xx = np.arange(nx)
        for j in range(ny):
            x1 = xx[~outliers[j]]
            goodmed = medflux[j][~outliers[j]]
            f = spi.interp1d(x1, goodmed, 'linear', fill_value='extrapolate')
            # f = spi.UnivariateSpline(x1, goodmed, k=1, s=None)
            clean_med[j] = f(xx)

    data['medflux'] = (['y', 'x'], clean_med)
    data['medflux'].attrs['flux_units'] = data.flux.attrs['flux_units']

    if meta.isplots_S3 >= 4:
        plots_s3.median_frame(data, meta)

    return data
codecov-commenter commented 2 years ago

Codecov Report

Merging #450 (45eb96d) into main (ee098d0) will increase coverage by 0.03%. The diff coverage is 97.22%.

@@            Coverage Diff             @@
##             main     #450      +/-   ##
==========================================
+ Coverage   59.91%   59.94%   +0.03%     
==========================================
  Files          75       75              
  Lines        8218     8227       +9     
==========================================
+ Hits         4924     4932       +8     
- Misses       3294     3295       +1     
Impacted Files Coverage Δ
src/eureka/S3_data_reduction/optspex.py 46.49% <95.45%> (+0.71%) :arrow_up:
src/eureka/S3_data_reduction/plots_s3.py 73.78% <100.00%> (+0.16%) :arrow_up:
src/eureka/S3_data_reduction/s3_reduce.py 93.10% <100.00%> (ø)
src/eureka/S3_data_reduction/straighten.py 100.00% <100.00%> (ø)

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

kevin218 commented 2 years ago

@taylorbell57 I think I've got it working with a new median filter via scipy.ndimage. Thanks for the great suggestions!