caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0
29 stars 6 forks source link

Implement a home-made uvlin/imlin #141

Open gigjozsa opened 6 years ago

gigjozsa commented 6 years ago

For each integration interval, perform a polynomial fit with variable order to xx and yy in frequency, then subtract. Main issue is parallelization.

Potential addenda:

Challenge: Not really feasible for wideband. Potential solution: Fit polynomial coefficients at nodes x MHz apart, then for each frequency interpolate polynomial coefficients linearly.

paoloserra commented 6 years ago

More specifically, for each visibility spectrum in the data (i.e., each time interval and baseline) we must fit a polynomial to Re(XX), Im(XX), Re(YY), Im(YY), and then subtract it.

I suggest to write the code in such a way that it can be run both on visibilities (from .MS) and on image cubes (from .FITS). In the latter case, for each (x,y) on the sky we fit a polynomial to the spectrum along that line of sight.

I would start it simple and not worry about bandwidth. To begin with, I would also not worry about outlier rejection (which we discussed today). The crucial bit is the parallelisation.

Once we have the basic code in place we can make the fitting more sophisticated and include outlier rejection.

gigjozsa commented 6 years ago

Just wondering: is a subtraction of Stokes I potentially more stable? Calculate Stokes I, then make a polynomial fit to Re and Im?

paoloserra commented 6 years ago

So, @o-smirnov , the code would be something like:

for each row in DATA column of MS: fit and subtract poly to Re(XX) fit and subtract poly to Im(XX) fit and subtract poly to Re(YY) fit and subtract poly to Im(YY) write continuum-subtracted MS

The goal is to be able to process rows in parallel. The order of the polynomial should be an input parameter.

@gigjozsa If you were to fit stokes I and subtract the result from XX and YY you'd be assuming zero polarization. I don't know whether that's a good thing to do. Let's keep it to the standard, well tested approach for the moment, and experiment with your idea once the basic code is in place.

gigjozsa commented 4 years ago

Also implement a running median filter.

paoloserra commented 3 years ago

@gigjozsa it seems that the IMLIN part of this issue is now available as a stand-alone Python script at https://github.com/gigjozsa/jscripts/blob/main/jscripts/image_contsub.py . Is there any reason not to implement this in CARACal?

As far as I can see all dependencies of this script are in CARACal already. So this could be included in CARACal by simply copy-pasting (more or less) the lines of code into the line worker.

What do you think?

gigjozsa commented 3 years ago

@paoloserra yes, my sincere apologies for not having done so yet. I was thinking on the lines of having a central repository for scripts. But for time being copy-pasting is also an option. Notice the Savitzky-Golay option. It's a median filter only with a polynomial instead of a constant.

paoloserra commented 3 years ago

I was impressed by the results of the median filter (@mpatir showed it to me yesterday). I haven't see the rest but would love to play with it.

Do you have time to include it as a new feature of the line worker soon?

gigjozsa commented 3 years ago

Savitzky-Golay's even better if you're careful. I'll include the option by the end of June.

gigjozsa commented 3 years ago

median filter is a 2-liner, but I would tend to also include polynomial and Savitzky-Golay as an option.

paoloserra commented 3 years ago

Ideally all the options available in your current script should be there :)