caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0
29 stars 6 forks source link

Step converting PB models to casa models should use log10 not log #1354

Open bennahugo opened 3 years ago

bennahugo commented 3 years ago

@landmanbester raised this one with me Ok - blaming this one on poor choice of documentation in setjy: It says

example:  fluxdensity=[2.63,0.21,-0.33,0.02]  
          will put in I,Q,U,V flux densities of 2.63,0.21,-0.33,  
  and 0.02, respectively, in the model column.  

    spix -- Spectral index for I flux density (a float or a list of float values):  
       where S = fluxdensity * (freq/reffreq)**(spix[0]+spix[1]*log(freq/reffreq)+..)  

where they mean log10 not log!

Consequently any codepath options using https://github.com/caracal-pipeline/caracal/blob/master/caracal/dispatch_crew/catalog_parser.py#L203 will return spectra that are way off - I think the ancient point source coefficient model for PKS B 0407 will be affected, but not that returned in the lsm or the wsclean crystalball catalogs I provided to the observatory (which I think @KshitijT now uses in the pipeline right?)

Any case please take note - your flux scales may be way off what they should be in that codepath!

KshitijT commented 3 years ago

Thanks for the warning @bennahugo and @landmanbester; I'll check this out. I haven't yet added the UHF models to master - but that's anyway not going to work for L band, right? The default for MeerKAT is indeed the lsm for now from what I remember offhand.

bennahugo commented 3 years ago

This is just 1934 which I quickly plotted the coefficients from the reynolds 94 paper for: image and this is what you get if you convert with the code as it stands (just pulled the method out for ease of comparison with a small sim'd db): image

bennahugo commented 3 years ago

ok @KshitijT just note those lsms are old they have to be replaced with the crystalball models I provided you - the spectra off axis are not great in those models.

The crystalball models are already in use in the sdp pipeline and the reductions I do offline.

KshitijT commented 3 years ago

ok @KshitijT just note those lsms are old they have to be replaced with the crystalball models I provided you - the spectra off axis are not great in those models.

Cool, I'll try modifying the crosscal worker (that nest of vipers) accordingly.

bennahugo commented 3 years ago

Also note the meqtrees convention is log (ie. np.ln) not log10 so there is a slight difference between that and casa setjy. Fuck all these small differences in convention is maddening

KshitijT commented 3 years ago

Also note the meqtrees convention is log (ie. np.ln) not log10 so there is a slight difference between that and casa setjy. Fuck all these small differences in convention is maddening

Well but the plan is to put those lsms in the dust bin, right? So meqtrees convention shouldn't matter?

bennahugo commented 3 years ago

yup please do - just noted here for reference

o-smirnov commented 3 years ago

Hmmm, this one was pretty nasty, eh? So anyone that wasn't using meerkat_skymodels: true in Caracal until a month ago (i.e. me) was using the blue bandpass model instead of the orange one. In-band spectral indices, you say? :man_facepalming:

image

bennahugo commented 3 years ago

Yup.... please use the wsclean component catalogs - not the lsms - these are the ones used in the SP pipeline. Wonder how many papers have an incorrect flux scale and inband SPI as a result???

On Tue, Sep 7, 2021 at 4:33 PM Oleg Smirnov @.***> wrote:

Hmmm, this one was pretty nasty, eh? So anyone that wasn't using meerkat_skymodels: true in Caracal until a month ago (i.e. me) was using the blue bandpass model instead of the orange one. In-band spectral indices, you say? 🤦‍♂️

[image: image] https://user-images.githubusercontent.com/6470079/132362643-4bd77e86-31c4-439e-9666-c4e526a6a29e.png

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/caracal-pipeline/caracal/issues/1354#issuecomment-914357749, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB4RE6QIG6ME2JWIN6SG3UTUAYPEDANCNFSM47VGEGAA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

--

Benjamin Hugo

PhD. student, Centre for Radio Astronomy Techniques and Technologies Department of Physics and Electronics Rhodes University

Junior software developer Radio Astronomy Research Group South African Radio Astronomy Observatory Black River Business Park Observatory Cape Town

o-smirnov commented 3 years ago

All in-band SPIs are a deperate bluff. Some are just more of a bluff than others.

landmanbester commented 3 years ago

Hold on, something is not right here. If I use the parameters reported by setJy with the natural log I get something that resembles Oleg's orange curve (or what Ben called Reynolds 94 above). If I use it with log10 I get something closer to the blue curve. CASA setJy seems to populate MODEL_DATA with the latter but, if I understand the discussion correctly, we should be using the former. So, am I to conclude that the issue has a confusing name and we should never have trusted CASA to start with? Eyeballing Ben's spectra for B1934 here seems to suggest so. To me it seems that, while the CASA docs are correctly stating that we should use the natural log, they seem to be using log10 for some reason. Is this correct? For reference, the script I used to arrive at this conclusion

import numpy as np

source_dict = {}
source_dict['J1939_6342'] = {}
source_dict['J1939_6342']['I0'] = 14.907688923070978
source_dict['J1939_6342']['a'] = -0.19473726758824528
source_dict['J1939_6342']['b'] = -0.6012047948505569
source_dict['J1939_6342']['c'] = 0.11417312472848717
source_dict['J1939_6342']['d'] = 9.069178705870204e-08
source_dict['J1939_6342']['nu0'] = 1.4e9

def freq_profile(nu, src):
    """
    Returns spectrum of a point source parametrised as

    I(nu) = I(nu0) (nu/nu0) ** (a + b * log10(nu/nu0)) +
             c * log10(nu/nu0)**2 + d * log10(nu/nu0)**3)

    for v specified in Hz.
    """
    w = nu/src['nu0']
    logw = np.log10(w)
    expon = (src['a'] +
             src['b'] * logw +
             src['c'] * logw**2 +
             src['d'] * logw**3)
    return src['I0'] * w **(expon)

if __name__=="__main__":
    freq = np.linspace(0.85e9, 1.75e9, 1000)
    J1939_6342_profile = freq_profile(freq, source_dict['J1939_6342'])

    import matplotlib.pyplot as plt
    plt.plot(freq, J1939_6342_profile)
    plt.show()
landmanbester commented 3 years ago

Ah, I might be getting slightly less confused. I took those coefficients out of the caracal log i.e.

# 2021-06-30 20:56:41   INFO    setjy::::
# 2021-06-30 20:56:41   INFO    setjy::::+      ##########################################
# 2021-06-30 20:56:41   INFO    setjy::::+      ##### Begin Task: setjy              #####
# 2021-06-30 20:56:41   INFO    setjy::::       setjy(vis="/stimela_mount/msdir/1557433849_sdp_l0-cal.ms",field="J1939-6342",spw="",selectdata=True,timerange="",
# 2021-06-30 20:56:41   INFO    setjy::::+              scan="",intent="",observation="",scalebychan=True,standard="manual",
# 2021-06-30 20:56:41   INFO    setjy::::+              model="",modimage="",listmodels=False,fluxdensity=[14.907688923070978],spix=[-0.19473726758824528, -0.6012047948505569, 0.11417312472848717, 9.069178705870204e-08],
# 2021-06-30 20:56:41   INFO    setjy::::+              reffreq="1.400000GHz",polindex=[],polangle=[],rotmeas=0.0,fluxdict={},
# 2021-06-30 20:56:41   INFO    setjy::::+              useephemdir=False,interpolation="linear",usescratch=True,ismms=False)
# 2021-06-30 20:56:41   INFO    setjy::::       {'field': 'J1939-6342'}
# 2021-06-30 20:56:41   INFO    setjy::::       CASA Version 5.6.1-8
# 2021-06-30 20:56:41   INFO    setjy::::
# 2021-06-30 20:56:41   INFO    imager::setjy() Using channel dependent flux densities
# 2021-06-30 20:56:42   INFO    imager::createSkyEquation()     Processing after subtracting componentlist /stimela_mount/msdir/1557433849_sdp_l0-cal.ms_setjy_spw0_J1939-6342_0.856GHz58612.9d.cl
# 2021-06-30 21:04:32   INFO    imager::setjy() Flux density as a function of frequency (channel 0 of each spw):
# 2021-06-30 21:04:32   INFO    imager::setjy()+          Frequency (GHz)    Flux Density (Jy, Stokes I)
# 2021-06-30 21:04:32   INFO    imager::setjy()      0.856         15.3624
# 2021-06-30 21:04:32   INFO    setjy::::       ##### End Task: setjy                #####
# 2021-06-30 21:04:32   INFO    setjy::::+      ##########################################
# Running CASA task 'setjy'

If you run that through setjy you get the blue curve but I didn't realise I was setting the flux scale in manual mode (I assumed I was using one of the many standards CASA knows about). So, to summarise, casa does use log10 and the manually provided coefficients are wrong

o-smirnov commented 2 years ago

Just had @SinahManaka trip over this one again. We are still using incorrect calibrator models when setjy() is used instead of the model predict. :(