Open gigjozsa opened 6 years ago
More specifically, for each visibility spectrum in the data (i.e., each time interval and baseline) we must fit a polynomial to Re(XX), Im(XX), Re(YY), Im(YY), and then subtract it.
I suggest to write the code in such a way that it can be run both on visibilities (from .MS) and on image cubes (from .FITS). In the latter case, for each (x,y) on the sky we fit a polynomial to the spectrum along that line of sight.
I would start it simple and not worry about bandwidth. To begin with, I would also not worry about outlier rejection (which we discussed today). The crucial bit is the parallelisation.
Once we have the basic code in place we can make the fitting more sophisticated and include outlier rejection.
Just wondering: is a subtraction of Stokes I potentially more stable? Calculate Stokes I, then make a polynomial fit to Re and Im?
So, @o-smirnov , the code would be something like:
for each row in DATA column of MS: fit and subtract poly to Re(XX) fit and subtract poly to Im(XX) fit and subtract poly to Re(YY) fit and subtract poly to Im(YY) write continuum-subtracted MS
The goal is to be able to process rows in parallel. The order of the polynomial should be an input parameter.
@gigjozsa If you were to fit stokes I and subtract the result from XX and YY you'd be assuming zero polarization. I don't know whether that's a good thing to do. Let's keep it to the standard, well tested approach for the moment, and experiment with your idea once the basic code is in place.
Also implement a running median filter.
@gigjozsa it seems that the IMLIN part of this issue is now available as a stand-alone Python script at https://github.com/gigjozsa/jscripts/blob/main/jscripts/image_contsub.py . Is there any reason not to implement this in CARACal?
As far as I can see all dependencies of this script are in CARACal already. So this could be included in CARACal by simply copy-pasting (more or less) the lines of code into the line worker.
What do you think?
@paoloserra yes, my sincere apologies for not having done so yet. I was thinking on the lines of having a central repository for scripts. But for time being copy-pasting is also an option. Notice the Savitzky-Golay option. It's a median filter only with a polynomial instead of a constant.
I was impressed by the results of the median filter (@mpatir showed it to me yesterday). I haven't see the rest but would love to play with it.
Do you have time to include it as a new feature of the line worker soon?
Savitzky-Golay's even better if you're careful. I'll include the option by the end of June.
median filter is a 2-liner, but I would tend to also include polynomial and Savitzky-Golay as an option.
Ideally all the options available in your current script should be there :)
For each integration interval, perform a polynomial fit with variable order to xx and yy in frequency, then subtract. Main issue is parallelization.
Potential addenda:
Challenge: Not really feasible for wideband. Potential solution: Fit polynomial coefficients at nodes x MHz apart, then for each frequency interpolate polynomial coefficients linearly.