Open cyriltasse opened 1 year ago
We need to do tests on one of the nodes the data would be reduced on, but if DynSpecMS turns out to be too slow, I can probably accelerate it...
Impressive. If I can help with the input catalog or anything...
Nice Cyril. What is the radius of the large green circle? 2.2 deg? We might want to make it a little larger since some stars are detected ~3 deg from the pointing centre but will not have their dynamic spectrum produced.
As I said in email, I will get you a modified target list and proper motion code tonight/tomorrow
Yeah - 2 degrees is what we have previously used. We could go to 3, but the computing might be 2-3x more expensive. Let's see what we can afford after we have tested and quantified the computing time?
Hey,
The left panel here is the right-hand image from above; when you apply the Jones clipping, you get the right panel below
It looks pretty good, it could really improve things...
Promising...
This is looking fantastic Cyril! This kind of "scruff" really limits some analyses and was producing false positives. However, I still don't understand what produces the "scruff" at the top right then? If the Jones clipping got rid of the stuff in the middle, why not that? Maybe it is lower SNR?
> However, I still don't understand what produces the "scruff" at the top right then? If the Jones clipping got rid of the stuff in the middle, why not that? Maybe it is lower SNR?
Good question, I don't really know. But clipping doesn't mean you're left with good solutions - just less extreme ones. There are also the visibilities themselves, which can go crazy (we have applied the DI Jones as the inverse - like the whole rest of the world - and not as the Hermitian transpose, so low amplitudes blow up the noise). And there is the fact that some time-freq cells have fewer visibilities stacked (because of flagging), and are therefore a noisier average. My process takes this into account: I reduce the spectra using the weights.
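A toy numpy sketch (not the actual kMS/DDFacet code) of the two effects described above: applying the DI Jones as an inverse blows up the noise where the gain amplitude is low, and weighting by the number of stacked visibilities keeps noisy cells from dominating an average:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar-gain model: the corrected visibility is v_obs / g, so where
# |g| is small, dividing amplifies the noise on v_obs.
g = np.array([1.0, 0.5, 0.05])              # gain amplitudes, one very low
noise = rng.normal(0.0, 0.1, (3, 100000))   # thermal noise on the visibilities
v_corr = noise / g[:, None]                 # DI Jones applied as an inverse

# The per-gain noise now scales as 0.1 / |g|: the low-amplitude solution
# dominates an unweighted average.
print(v_corr.std(axis=1))

# Weighted averaging (weights ~ number of unflagged visibilities stacked
# in each time-freq cell) down-weights the noisier cells.
w = np.array([100.0, 100.0, 1.0])
mean_weighted = np.sum(w * v_corr.mean(axis=1)) / np.sum(w)
```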
It will be important to test on other fields before we decide whether we want to use Jones clipping (but I think we will want to). I want to test on fields that have very bad facets, for instance.
I'm now editing this reprocessing script as suggested by Tim. I hope we can run a first test with a few tens of fields by next week. I'd like to launch the reprocessing before Christmas.
Hey mate,
Working on it. Probably will not get it to you today but will definitely get it to you Monday afternoon.
@jrcallingham would be cool - then the first batch of tests can be run using the pm correction.
Hey Cyril - as I said in the general email to the group, here is the proper motion code. I don't want to implement it directly as you know your code best, but I have assumed you have already read in the table at some point (see comments etc):

```python
# Proper motion correction
from astropy.io import fits
from astropy.coordinates import SkyCoord, Distance
from astropy.time import Time
import astropy.units as u

# Assuming two things: you have read in the table and made the variables
# ra, dec, pmra etc., and you have the date of the observation (date_obs)
# in YYYY-MM-DD format.
# In terms of a pipeline I would have an if statement here: if pmra is a
# real value, then do the following, otherwise just extract at the
# reported ra,dec position.
c = SkyCoord(ra=ra * u.deg,
             dec=dec * u.deg,
             distance=Distance(parallax=parallax * u.mas, allow_negative=True),
             pm_ra_cosdec=pmra * u.mas/u.yr,
             pm_dec=pmdec * u.mas/u.yr,
             obstime=Time(ref_epoch, format='decimalyear'))
c_lotss = SkyCoord(ra=ra_lotss * u.deg,
                   dec=dec_lotss * u.deg,
                   obstime=Time(date_obs, format='iso'))
epoch_lotss = Time(date_obs, format='iso')
print('Performing proper motion corrections for ' + field)
c_gaia_to_lotss_epoch = c.apply_space_motion(epoch_lotss)
# Then to extract the correct ra and dec just use c_gaia_to_lotss_epoch.ra
# and c_gaia_to_lotss_epoch.dec
```
Note that `ra_lotss` and `dec_lotss` are just the centre of the field. You can probably also not set them.
@jrcallingham Cool - thanks!
@mhardcastle @tim
I see some weird things happening with DDF/kMS reprocessing the data... I'm working with P236+53/L470106
Looks like the DicoModel was made with `--Beam-At facet`, but when I do that the residuals look ugly, so I tried using `--Beam-At tessel` and now they look good. Also the option was not present in the kMS parset at the time (it was `tessel` by default)... The DDFacet log says version "dev-c563ecb". I tried to `git checkout` to investigate but it doesn't match a tag/hash.
Do you have any clues what's going on? How I could dig in the issue?
Perhaps remove the "dev-" bit. I see it's from 11th June 2019 - https://github.com/cyriltasse/DDFacet/commit/c563ecb0eafbf16d18b74057f8c040a43bc75136
Yeah, perfect, that works! Great inspiration!!! Thanks!
Ok I think I found the issue. In the version that was used for the pointing I'm doing my tests on, `--Beam-At facet` actually triggers the same behaviour as `tessel`:
https://github.com/cyriltasse/DDFacet/blob/c563ecb0eafbf16d18b74057f8c040a43bc75136/DDFacet/Data/ClassJones.py#L190
I was fine because the beam was computed at the tessel by default in the kMS used at the time. But since the bug has since been fixed, we now have to specify `tessel`.
Now the problem is that we need to be able to know which DicoModel was processed using which beam mesh, in both kMS and DDF.
@mhardcastle @twshimwell Is there a simple way to get the versions of DDF/kMS that were used for the processing? With a bit of luck there are just a few of them and it would be easy to figure out which beam mesh we should use to compute the residuals.
I could automatically scan the logs and build a database of software versions... Maybe something already exists?
For each archived dataset there should be a summary.txt file in misc.tar that has the version numbers of all the software. But it's not in the database, unfortunately...
Fantastic... I'll batch read this!
Ok - I'm downloading all the misc.tar files and extracting the summaries... It might take a few days before I get a representative sample and an understanding of the various DDF/kMS versions that have been used.
A few days because, from my rough calculation, getting a misc.tar takes a while, and downloading all of them will take 6 days. I understand they are stored on tape, so the copy is pretty slow...
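Batch-reading the summaries could look like this sketch; the archive layout (one `<field>/misc.tar` containing a summary.txt) and the line format inside summary.txt are assumptions here:

```python
import re
import tarfile
from pathlib import Path

# Assumed line format inside summary.txt, e.g. "DDFacet version: 0.6.0"
# or "killMS: dev-c563ecb"; adjust the pattern to the real files.
VERSION_RE = re.compile(
    r'(DDFacet|killMS|ddf-pipeline|DynSpecMS)'
    r'(?:\s+version)?\s*[:=]?\s*'
    r'((?:dev-)?\d[\w.\-]*|dev-[0-9a-f]+)', re.I)

def versions_from_summary(text):
    """Extract {software: version} pairs from the text of a summary.txt."""
    return {m.group(1): m.group(2) for m in VERSION_RE.finditer(text)}

def scan_misc_tars(root):
    """Build a per-field database of software versions from the misc.tar
    archives found under root."""
    db = {}
    for tar_path in Path(root).glob('*/misc.tar'):
        with tarfile.open(tar_path) as tar:
            for member in tar.getmembers():
                if member.isfile() and member.name.endswith('summary.txt'):
                    text = tar.extractfile(member).read().decode(errors='replace')
                    db[tar_path.parent.name] = versions_from_summary(text)
    return db
```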
Hi, this is a first look at the various versions of ddf-pipeline, DDFacet, kMS, and DynSpecMS that have been used:
As far as I've analysed, only DDFacet version 0.6.0 computes the beam with the actual `--Beam-At facet` option. The other versions either don't have the option, or have it but it's bugged (and actually do `--Beam-At tessel`).
This is very cool to see. The solid lines in the last ~ 1 yr correspond to me using a static singularity image for processing, post-DR2. All DR2 data will have been processed with older DDF versions.
4 years of processing... :)
@twshimwell @mhardcastle this probably means that the reprocessing script, when used with a more recent version of DDFacet (version > 0.6.0) on a DicoModel that was produced with an older version (< 0.6.0), will give wrong results. This is a comparison of how wrong the results are in the image plane (`--Beam-At facet` on the left vs `--Beam-At tessel` on the right), and in the dynamic spectra at the location of the blue circle.
So it's quite critical to actually take this into account...
This probably mostly affects the source subtraction that's done in the reprocessing scripts. But also will affect us potentially if we try to re-image non-DR2 data... we can probably pretty easily build in a check for the original DDFacet version.
> we can probably pretty easily build in a check for the original DDFacet version.
Yes - for the DynSpecMS reprocessing I'm editing `reprocessing_utils.py`; I will add a version check there and modify the call to `DDF.py ... --Beam-At` accordingly...
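A minimal sketch of what that check could look like (hypothetical helper names, not the actual `reprocessing_utils.py` code): versions before 0.6.0 effectively computed the beam per tessel, so the predict has to use `tessel` to match the DicoModels they produced.

```python
def parse_version(v):
    """Turn a DDFacet version string like '0.5.3' into a comparable tuple;
    treat non-numeric dev versions (e.g. 'dev-c563ecb') as pre-0.6.0."""
    try:
        return tuple(int(x) for x in v.split('.')[:3])
    except ValueError:
        return (0,)

def beam_at_for(dicomodel_ddf_version):
    """Pick the --Beam-At value matching the DDFacet version that
    produced the DicoModel."""
    if parse_version(dicomodel_ddf_version) >= (0, 6, 0):
        return 'facet'
    return 'tessel'

# The DDF.py call can then be built with the matching option, e.g.
# "DDF.py ... --Beam-At " + beam_at_for(version)
```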
@twshimwell ok it seems I have a working version, probably you can have a go on just one or two pointings?
My current branches are:
DynSpecMS: MTilde_ASync_weights
ddf-pipeline: reprocDynSpec
@mhardcastle @twshimwell it's weird: the `run_full_field_reprocessing_pipeline` that I've modified returns `RuntimeError('Requested field is not in database')` for about 50% of the fields here. I see it's testing based on this. Are you sure that checking against `full_field_reprocessing` is correct?
I'll try to accelerate DynSpecMS a bit, and include @jrcallingham's code for the pm correction.
> Hey Cyril - as I said in the general email to the group, here is the proper motion code. I don't want to implement it directly as you know your code best but I have assumed you have already read in the table at some point (see comments etc)
@jrcallingham I was reading your code, and I can see in your comment that I need to read a table to get the pmra/pmdec. I'm not sure what you mean - could you include and test that part in your code?
The current code is: downloading, doing the predict with the right `--Beam-At` depending on the DDF version number of the DicoModel, using Jones clipping, saving the facet/tessel information in the fits header, and doing 3 random offs per facet.
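Saving the facet/tessel information in the fits header could be done along these lines (a sketch with illustrative keyword names, not the actual DynSpecMS code):

```python
import os
import tempfile
import numpy as np
from astropy.io import fits

# A dummy time-frequency dynamic spectrum standing in for a real one.
dynspec = np.zeros((10, 20), dtype=np.float32)

hdu = fits.PrimaryHDU(dynspec)
hdu.header['FACET'] = (12, 'Facet index the target falls in')        # illustrative keyword
hdu.header['BEAMAT'] = ('tessel', 'Beam mesh used for the predict')  # illustrative keyword

path = os.path.join(tempfile.gettempdir(), 'dynspec_example.fits')
hdu.writeto(path, overwrite=True)

# Downstream code can then recover which beam mesh produced the spectrum.
h = fits.getheader(path)
```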
Database now exists for all fields.
We do need to create a new column in the database though. @mhardcastle is it possible to add a column called "DynSpecMS" to the full_field_reprocessing database?
Ah sorry I could do that. So now the full_field_reprocessing has a DynSpecMS column that we can populate.
Ok I've got a working singularity image! It was quite an adventure... I'm now in touch with @mhardcastle and @twshimwell to start running; I will let you know.
> @jrcallingham I was reading your code, and I can see also in your comment I need to read a table to get the pmra pmdec. Not sure what you mean could you include and test that in your code?
Sorry Cyril. Just saw this. I just mean you have to read in the DynSpec catalogue table I sent. I don’t know how you are doing that in the pipeline, but you can read it into variable names via astropy.io
The table you sent on Monday at 16:12 CET? There have been discussions with Philippe since - is everyone happy with this target list?
It’s the updated table I sent on Wednesday which had some additions from Philippe (can find here https://www.dropbox.com/s/vgv30hjd1zzi65c/dyn_spec_catalogue_addedexo.fits?dl=0)
@jrcallingham thanks I'm reading the fits table and I see 30 NaNs in ra/dec and 23000 in pmra/pmdec. Is that all normal/understood?
Nope. That does not sound normal. I will investigate in an hour
Also, in the catalog I only see a `DESIGNATION` column with the name of the object - but the former catalog also had a `Type` key (exoplanet, gaia, brown dwarf, etc.) tracing the source/selection criteria - could you add this if you have the time?
Might be hard to add now. Still in meetings. Will see what I can do.
Hi Cyril,
The ~30 NaNs in ra/dec were due to a bad/rushed concatenation of V-LoTSS. I'm fixing that now and adding a Type column. However, I do not see the 23000 in pmra/pmdec - I see pmra/pmdec of ~3000-10000 mas/yr (keep in mind this is in milliarcsec).
Hi Cyril,
The most up-to-date version of the catalogue can be downloaded here: https://www.dropbox.com/s/y9yx9rnqs44yqiv/dyn_spec_catalogue_addedexo_addvlotss.fits?dl=0. The main difference from the previous catalogue is that I fixed the NaN ra/dec for the 30-odd sources and added a Type column. Also, this time I only removed duplicates if they were in a stellar catalogue (e.g. Gaia and Active stars). If a source is in both the exoplanet and Gaia catalogues I kept both entries (to be safe and clear about where the extraction is occurring).
Ok cool, thanks
> However, I do not see the 23000 in pmra/pmdec? I see pmra/pmdec of ~3000-10000 mas/yr (keep in mind this is milliarcsec).
I was saying that the fields `pmra`/`pmdec` have many NaNs:

```
In [8]: np.count_nonzero(np.isnan(c["pmra"]))
Out[8]: 23563
```

(the new catalog still has NaNs for these fields)
Yes - that is fine. Not everything has a measured proper motion (e.g. ultracool dwarfs that are too faint in Gaia, or Active stars that do not have it reported. I have tried to supplement with Gaia where possible, but it was not possible for all).
Ok cool thanks!
We've just talked with @twshimwell and @mhardcastle about the details of how we do things in practice on the herts cluster. We'll let you know when we have something running...
> Hey Cyril - as I said in the general email to the group, here is the proper motion code. [...]
@jrcallingham in your code, are `ref_epoch` and `date_obs` the same thing?
I'm stupid - sorry, I didn't see this was in the table...
For info, these are the pm corrections for the objects in the catalog (log scale)
Some targets really move! :)
Hi-
Ok - so I've been implementing various changes to DynSpecMS, in branch `MTilde_ASync`. Among them is a new `--nMinOffPerFacet` option (default 5). The first two changes will greatly improve the robustness of the statistics, and the potential to detect and reject stuff. The last step should improve the results and give better spectra. In the few tests I did I see some improvement, but nothing dramatic.
This is what the off/on (blue/green) positions now look like. Only the facets covered by the search radius are concerned, with the offs being scattered across them.
This is the comparison between before (left) and now (right):
The noise is lower by 2% in the newer version, with fewer artifacts (in a few places it's higher; I'll experiment quickly with clipping the data that have high gains, but I think it's not too bad given we will now have good off statistics).
Joe - time to add your proper motion correction? Give me a module and I can update the code.
The command line I've used was:

```
ms2dynspec.py --ms big-mslist.txt --data DATA --model DDF_PREDICT --sols [DDS3_full_smoothed,DDS3_full_slow_merged] --rad 2. --SolsDir SOLSDIR --BeamModel LOFAR --BeamNBand 1 --DicoFacet image_full_ampphase_di_m.NS3.DicoFacet --noff 100
```

The facet layout is given by the `--DicoFacet` argument.