A module to perform transient searching between SUMSS/NVSS and RACS. It creates a crossmatch catalogue of SUMSS/NVSS -> ASKAP sources, produces diagnostic plots and can also create postage stamps of each crossmatch. A Django web server is included that allows you to explore the results and mark candidates for further investigation.
This module was born out of an initial diagnostic script and has grown and grown. As the author, I can only describe its state as 'terrible', so if you are trying to decipher it or reuse a part of it, I can only apologise; please contact me if you do find yourself here. At this point I would scrap most of it and start again. I had also not quite let go of traditional for loops when using pandas, so the performance could be improved. It's a similar story for the `web_server`: while it works, there is some duplication in the code and some hacky workarounds to make it do what I wanted. At least I made it to Python 3, so there's that.
The module is for Python 3 only (I have tested up to v3.8.1).
The only non-Python dependencies (everything else should be installed via pip) are postgres and the postgres extension Q3C: https://github.com/segasai/q3c. These are only required for the website. Note that you only need to install Q3C into postgres, as the Django migration will do the Q3C setup for you.
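As a rough sketch (consult the Q3C README for the authoritative steps, and note that make install may need elevated permissions), installing the extension into postgres looks something like:
git clone https://github.com/segasai/q3c.git
cd q3c
make
make install
The remaining Q3C setup in the database itself is then handled by the Django migration, as noted above.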
CASA is required if you wish the pipeline to produce the convolved image.
On first use django-keyboard-shortcuts will fail with Python 3, though the fix is quite simple: there is a rogue comma in its `__init__.py`. Edit this as mentioned in this Stack Overflow post and it will work.
Currently you also need a local copy of the SUMSS and NVSS mosaic images. I never got around to pulling a large image directly from SkyView, but testing suggests this should be possible.
I recommend installing the module into a fresh Python environment using, for example, conda or virtualenv.
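For example, with conda (the environment name here is arbitrary):
conda create -n askapdiag python=3.8
conda activate askapdiag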
To install using pip:
pip install git+https://github.com/ajstewart/askap-image-diagnostic.git
Or you can clone the git repository and install using `python setup.py install` or `pip install .`.
This script was intended to be run on the ada machine, which has an installation of PostgreSQL available. To create a database run:
createdb <db name>
e.g. createdb racs
(if you get a permission denied message, contact the system administrator)
This will create an empty database with the chosen name. Make sure to note down the database settings (port, user, name) for use with the pipeline options. If the pipeline is run with the db inject option turned on without first initialising the tables, then the tables will be newly created. The easiest way to initialise the tables is by setting up the [website](#Installation of the Website).
Included in the repository is `web_server`, a basic website built using Django that allows the user to explore the results in a convenient way and other users to give feedback on the crossmatching.
To install, copy the `web_server` directory to the location from which you wish to host the website and cd into the `web_server` directory.
From here rename `web_server/settings.py.template` to `web_server/settings.py` and edit the file with the correct database information as above.
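The database section of settings.py is the standard Django DATABASES setting; a minimal sketch is shown below, where every value is a placeholder that should be replaced with the details of the database created earlier (on older Django versions the engine string may instead be 'django.db.backends.postgresql_psycopg2'):
# Sketch only -- all values below are placeholders.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'racs',          # database created with createdb
        'USER': 'postgres',      # placeholder username
        'PASSWORD': 'changeme',  # placeholder password
        'HOST': 'localhost',
        'PORT': '5432',
    }
}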
Now run the migrations as follows; this will create the tables in the database:
python manage.py makemigrations
python manage.py migrate
Also make a note of the install directory, specifically the `/static/media/` directory, as this is used in the pipeline (option `--website-media-dir`).
Now the server can be launched (in the example below port 8005 is used):
python manage.py runserver 0.0.0.0:8005
The website has the ability to send messages to Slack. To set this up you need a Bot API token from Slack for your app and the ID of the channel to send messages to. See here for more information.
See PIPELINE.md.
The built pipeline script, available from the command line, is `processASKAPimage.py`.
By default, i.e. when no ASKAP or SUMSS csv files are provided, aegean will be run on the ASKAP image to extract a source catalogue and the SUMSS catalogue will be automatically fetched from Vizier. The SUMSS catalogue will be trimmed to only those sources that fall within the image area.
More than one image can be passed through the processing script at once; however, the manual csv inputs do not currently support multiple entries. Hence, let the script automatically perform the source finding and SUMSS fetching if you want to run more than one image through.
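For example (the image names are purely illustrative), several images can be supplied in a single call:
processASKAPimage.py image1.fits image2.fits image3.fits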
A range of options exist to influence processing:
usage: processASKAPimage.py [-h] [-c FILE] [--output-tag OUTPUT_TAG] [--log-level {WARNING,INFO,DEBUG}]
[--nice NICE] [--clobber CLOBBER] [--sumss-only SUMSS_ONLY] [--nvss-only NVSS_ONLY]
[--weight-crop WEIGHT_CROP] [--weight-crop-value WEIGHT_CROP_VALUE]
[--weight-crop-image WEIGHT_CROP_IMAGE] [--convolve CONVOLVE]
[--convolved-image CONVOLVED_IMAGE]
[--convolved-non-conv-askap-csv CONVOLVED_NON_CONV_ASKAP_CSV]
[--convolved-non-conv-askap-islands-csv CONVOLVED_NON_CONV_ASKAP_ISLANDS_CSV]
[--sourcefinder {aegean,pybdsf,selavy}] [--frequency FREQUENCY] [--askap-csv ASKAP_CSV]
[--askap-islands-csv ASKAP_ISLANDS_CSV] [--sumss-csv SUMSS_CSV] [--nvss-csv NVSS_CSV]
[--askap-csv-format {aegean,selavy}] [--remove-extended REMOVE_EXTENDED]
[--askap-ext-thresh ASKAP_EXT_THRESH] [--sumss-ext-thresh SUMSS_EXT_THRESH]
[--nvss-ext-thresh NVSS_EXT_THRESH] [--use-all-fits USE_ALL_FITS]
[--write-ann WRITE_ANN] [--produce-overlays PRODUCE_OVERLAYS]
[--boundary-value {nan,zero}] [--askap-flux-error ASKAP_FLUX_ERROR]
[--diagnostic-max-separation DIAGNOSTIC_MAX_SEPARATION]
[--transient-max-separation TRANSIENT_MAX_SEPARATION] [--postage-stamps POSTAGE_STAMPS]
[--postage-stamp-selection {all,transients}]
[--postage-stamp-ncores POSTAGE_STAMP_NCORES]
[--postage-stamp-radius POSTAGE_STAMP_RADIUS]
[--postage-stamp-zscale-contrast POSTAGE_STAMP_ZSCALE_CONTRAST]
[--sumss-mosaic-dir SUMSS_MOSAIC_DIR] [--nvss-mosaic-dir NVSS_MOSAIC_DIR]
[--aegean-settings-config AEGEAN_SETTINGS_CONFIG]
[--pybdsf-settings-config PYBDSF_SETTINGS_CONFIG]
[--selavy-settings-config SELAVY_SETTINGS_CONFIG] [--transients TRANSIENTS]
[--transients-askap-snr-thresh TRANSIENTS_ASKAP_SNR_THRESH]
[--transients-large-flux-ratio-thresh TRANSIENTS_LARGE_FLUX_RATIO_THRESH]
[--db-inject DB_INJECT] [--db-engine DB_ENGINE] [--db-username DB_USERNAME]
[--db-password DB_PASSWORD] [--db-host DB_HOST] [--db-port DB_PORT]
[--db-database DB_DATABASE] [--db-tag DB_TAG] [--website-media-dir WEBSITE_MEDIA_DIR]
images [images ...]
positional arguments:
images Define the images to process
optional arguments:
-h, --help show this help message and exit
-c FILE, --conf_file FILE
Specify config file (default: None)
--output-tag OUTPUT_TAG
Add a tag to the output name. (default: )
--log-level {WARNING,INFO,DEBUG}
Set the logging level. (default: INFO)
--nice NICE Set the 'nice' level of processes. (default: 10)
--clobber CLOBBER Overwrite output if already exists. (default: False)
--sumss-only SUMSS_ONLY
Only use SUMSS in the image analysis. (default: False)
--nvss-only NVSS_ONLY
Only use NVSS in the image analysis. (default: False)
--weight-crop WEIGHT_CROP
Crop image using the weights image. (default: False)
--weight-crop-value WEIGHT_CROP_VALUE
Define the minimum normalised value from the weights image to crop to. (default: 0.04)
--weight-crop-image WEIGHT_CROP_IMAGE
Define the weights image to use. (default: weights.fits)
--convolve CONVOLVE Convolve the image using CASA to SUMSS resolution for crossmatching. (default: False)
--convolved-image CONVOLVED_IMAGE
Define a convolved image that has already been produced. (default: None)
--convolved-non-conv-askap-csv CONVOLVED_NON_CONV_ASKAP_CSV
Define the unconvolved catalogue to use when using convolved mode, otherwise it will be
generated automatically (if aegean or pybdsf) (default: None)
--convolved-non-conv-askap-islands-csv CONVOLVED_NON_CONV_ASKAP_ISLANDS_CSV
Define the unconvolved island catalogue to use when using convolved mode, otherwise it will
be generated automatically (if aegean or pybdsf) (default: None)
--sourcefinder {aegean,pybdsf,selavy}
Select which sourcefinder to use (default: aegean)
--frequency FREQUENCY
Provide the frequency of the image in Hz. Use if 'RESTFRQ' is not in the header (default:
99)
--askap-csv ASKAP_CSV
Manually define an aegean format csv file containing the extracted sources to use for the
ASKAP image. (default: None)
--askap-islands-csv ASKAP_ISLANDS_CSV
Manually define a csv file containing the extracted islands to use for the ASKAP image.
(default: None)
--sumss-csv SUMSS_CSV
Manually provide the SUMSS catalog csv. (default: None)
--nvss-csv NVSS_CSV Manually provide the NVSS catalog csv. (default: None)
--askap-csv-format {aegean,selavy}
Define which source finder provided the ASKAP catalog (currently only supports aegean).
(default: aegean)
--remove-extended REMOVE_EXTENDED
Remove perceived extended sources from the catalogues. Uses the following arguments 'askap-
ext-thresh' and 'sumss-ext-thresh' to set the threshold. (default: False)
--askap-ext-thresh ASKAP_EXT_THRESH
Define the maximum scaling threshold of the size of the ASKAP source compared to the PSF.
Used to exclude extended sources. Only 1 axis has to exceed. (default: 1.2)
--sumss-ext-thresh SUMSS_EXT_THRESH
Define the maximum scaling threshold of the size of the SUMSS source compared to the PSF.
Use to exclude extended sources. Only 1 axis has to exceed. (default: 1.2)
--nvss-ext-thresh NVSS_EXT_THRESH
Define the maximum scaling threshold of the size of the NVSS source compared to the PSF. Use
to exclude extended sources. Only 1 axis has to exceed. (default: 1.2)
--use-all-fits USE_ALL_FITS
Use all the fits from Aegean ignoring all flags. Default only those with flag '0' are used.
(default: False)
--write-ann WRITE_ANN
Create kvis annotation files of the catalogues. (default: False)
--produce-overlays PRODUCE_OVERLAYS
Create overlay figures of the sources on the ASKAP image. (default: True)
--boundary-value {nan,zero}
Define whether the out-of-bounds value in the ASKAP FITS is 'nan' or 'zero'. (default: nan)
--askap-flux-error ASKAP_FLUX_ERROR
Percentage error to apply to flux errors. (default: 0.0)
--diagnostic-max-separation DIAGNOSTIC_MAX_SEPARATION
Maximum crossmatch distance (in arcsec) to be considered when creating the diagnostic plots.
(default: 5.0)
--transient-max-separation TRANSIENT_MAX_SEPARATION
Maximum crossmatch distance (in arcsec) to be considered when searching for transients.
(default: 45.0)
--postage-stamps POSTAGE_STAMPS
Produce postage stamp plots of the cross matched sources within the max separation.
(default: False)
--postage-stamp-selection {all,transients}
Select which postage stamps to create. (default: all)
--postage-stamp-ncores POSTAGE_STAMP_NCORES
Select how many cores to use when creating the postage stamps. (default: 6)
--postage-stamp-radius POSTAGE_STAMP_RADIUS
Select the radius of the postage stamp cutouts (arcmin). (default: 13.0)
--postage-stamp-zscale-contrast POSTAGE_STAMP_ZSCALE_CONTRAST
Select the ZScale contrast to use in the postage stamps. (default: 0.2)
--sumss-mosaic-dir SUMSS_MOSAIC_DIR
Directory containing the SUMSS survey mosaic image files. (default: None)
--nvss-mosaic-dir NVSS_MOSAIC_DIR
Directory containing the NVSS survey mosaic image files. (default: None)
--aegean-settings-config AEGEAN_SETTINGS_CONFIG
Select a config file containing the Aegean settings to be used (instead of defaults if none
provided). (default: None)
--pybdsf-settings-config PYBDSF_SETTINGS_CONFIG
Select a config file containing the PyBDSF settings to be used (instead of defaults if none
provided). (default: None)
--selavy-settings-config SELAVY_SETTINGS_CONFIG
Select a config file containing the Selavy settings to be used (instead of defaults if none
provided). (default: None)
--transients TRANSIENTS
Perform a transient search analysis using the crossmatch data. Requires '--max-separation'
to be defined. (default: False)
--transients-askap-snr-thresh TRANSIENTS_ASKAP_SNR_THRESH
Define the threshold for which ASKAP sources are considered to not have a SUMSS match based
upon the estimated SUMSS SNR if the source was placed in the SUMSS image. (default: 5.0)
--transients-large-flux-ratio-thresh TRANSIENTS_LARGE_FLUX_RATIO_THRESH
Define the threshold for which sources are considered to have a large flux ratio. Median
value +/- threshold x std. (default: 3.0)
--db-inject DB_INJECT
Turn database injection on or off. (default: True)
--db-engine DB_ENGINE
Define the database engine. (default: postgresql)
--db-username DB_USERNAME
Define the username to use for the database (default: postgres)
--db-password DB_PASSWORD
Define the password to use for the database (default: postgres)
--db-host DB_HOST Define the host for the database. (default: localhost)
--db-port DB_PORT Define the port for the database. (default: 5432)
--db-database DB_DATABASE
Define the name of the database. (default: postgres)
--db-tag DB_TAG The description field in the database attached to the image. (default: RACS Analysis)
--website-media-dir WEBSITE_MEDIA_DIR
Copy the image directory directly to the static media directory of the website. (default:
none)
These options can be entered using a ConfigParser configuration file:
[GENERAL]
output_tag=askap_racs_analysis
log_level=INFO
nice=10
clobber=True
[ANALYSIS]
sumss_only=true
nvss_only=false
frequency=864e6
weight_crop=True
weight_crop_value=0.04
weight_crop_image=../path/to/weight_cropped_image.fits
convolve=True
convolved_image=../path/to/convolved_image.fits
convolved_non_conv_askap_csv=../path/to/preconvolved_catalog.csv
convolved_non_conv_askap_islands_csv=/path/to/preconvolved_islands_catalog.csv
sourcefinder=aegean
# aegean_settings_config=None
# pybdsf_settings_config=None
# selavy_settings_config=None
boundary_value=nan
askap_flux_error=0.1
[CATALOGUES]
askap_csv=/path/to/askap_catalog.csv
askap_islands_csv=/path/to/askap_islands_catalog.csv
# sumss_csv=None
# nvss_csv=None
askap_csv_format=aegean
write_ann=True
[CROSSMATCHING]
diagnostic_max_separation=5.0
transient_max_separation=45.0
remove_extended=True
askap_ext_thresh=1.3
sumss_ext_thresh=1.5
nvss_ext_thresh=1.2
use_all_fits=False
[TRANSIENTS]
transients=True
transients_askap_sumss_snr_thresh=5.0
transients_large_flux_ratio_thresh=2.0
[POSTAGESTAMPS]
postage_stamps=False
postage_stamp_selection=all
postage_stamp_ncores=6
postage_stamp_radius=13.0
postage_stamp_zscale_contrast=0.25
sumss_mosaic_dir=/directory/where/sumss/mosaics/are/kept
nvss_mosaic_dir=/directory/where/nvss/mosaics/are/kept
[DATABASE]
db_inject=true
db_engine=postgresql
db_username=user
db_host=localhost
db_port=5432
db_database=RACS
db_tag=Tag to add to pipeline run
website_media_dir=/path/to/the/website/media/dir
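The configuration file is then passed to the script with the -c / --conf_file option, for example (the filename here is arbitrary):
processASKAPimage.py -c racs_analysis.cfg image.fits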
If the weights.XX.fits file is available and supplied, it can be used to trim the image edges, leaving just the cleaner part of the image. By default the crop is made at a value of 0.04 of the maximum weight value.
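As a sketch of a weight-cropped run (the paths are placeholders, and the bare boolean flag mirrors the worked example later in this document):
processASKAPimage.py image.fits --weight-crop --weight-crop-image /path/to/weights.fits --weight-crop-value 0.04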
The pipeline is able to convolve the supplied ASKAP image to the SUMSS or NVSS resolution. If the image has already been convolved it can be supplied to the pipeline using the `convolved_image` argument (note: make sure to also set 'convolve=True'). In the case of convolving to SUMSS, the target beam size follows the 45 x 45 cosec |dec| convention. Crossmatching is then done against the convolved image.
If convolving is used then the non-convolved image will also be analysed. The sources will be extracted by the pipeline using Aegean, or a catalogue can be supplied using the `convolved_non_conv_askap_csv` argument. This is used when searching for transient sources.
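For example, to reuse a previously convolved image and a pre-convolution catalogue (all paths are placeholders):
processASKAPimage.py image.fits --convolve --convolved-image /path/to/convolved_image.fits --convolved-non-conv-askap-csv /path/to/preconvolved_catalog.csv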
These plots are produced using only sources that have a crossmatch distance <= the user-defined maximum separation, i.e. good matches. Also, if `--remove-extended` is enabled then extended sources are removed from the list of crossmatches used to produce the diagnostic plots.
Note that the source numbers plot does not make these exclusions.
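As a minimal illustration (not the pipeline's exact code) of the extended-source criterion behind the *-ext-thresh options, a source is flagged as extended if either fitted axis exceeds the threshold times the corresponding PSF axis:
def is_extended(major, minor, psf_major, psf_minor, thresh=1.2):
    # Only one axis needs to exceed the scaled PSF axis for the source
    # to be treated as extended and excluded from the crossmatches.
    return (major > thresh * psf_major) or (minor > thresh * psf_minor)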
The pipeline works by matching each SUMSS source in the image to the nearest extracted ASKAP source.
Good matches are deemed to be those that are <= the maximum separation defined by the user; above this a source is considered to have no match. This provides 3 different sub-types of crossmatches:
From here, forced extractions are performed using Aegean where a source has not been found. This enables the flux ratio to be computed for each crossmatched source, no matter the sub-type. Transient candidates are those sources which have a flux ratio >= 2.0.
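Conceptually, the nearest-neighbour matching and flux-ratio cut can be sketched with astropy as below; this is an illustration using toy values, not the pipeline's actual implementation:
import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord

# Toy catalogues standing in for the SUMSS and ASKAP extractions
# (RA/Dec in degrees, fluxes in mJy) -- purely illustrative values.
sumss_coords = SkyCoord(ra=[150.001, 150.500] * u.deg, dec=[-45.00, -45.20] * u.deg)
sumss_flux = np.array([20.0, 8.0])
askap_coords = SkyCoord(ra=[150.0015, 150.700] * u.deg, dec=[-45.0001, -45.30] * u.deg)
askap_flux = np.array([55.0, 7.5])

# Nearest ASKAP neighbour for every SUMSS source.
idx, sep2d, _ = sumss_coords.match_to_catalog_sky(askap_coords)

max_sep = 45.0 * u.arcsec      # e.g. --transient-max-separation
good = sep2d <= max_sep        # good matches; anything beyond this counts as no match

# Flux ratio for each pairing; a ratio >= 2.0 flags a transient candidate.
ratio = askap_flux[idx] / sumss_flux
candidates = good & (ratio >= 2.0)
print(candidates)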
In the top level directory will be:
Two directories may also be present:
Input: An ASKAP image called `image.askap.mosaic.restored.fits`. The pixels outside of the image area are NaNs. Our database is on the localhost with the username `user123`, on the default port, and is called `racstest`. Our website media directory is `/my/website/static/media`.
Want: To crossmatch the ASKAP image with SUMSS and use only matches that are <= 20 arcsec to perform the analysis. Allow the script to automatically build the catalogues and remove extended sources when creating the diagnostic plots; in this case these are defined as sources that have one axis that is 1.4x larger than the associated beam size axis. Also produce postage stamp images of the crossmatches along with kvis annotation files, and finally perform a transient search. We will mark the image in the database with the tag "first test".
Command:
processASKAPimage.py image.askap.mosaic.restored.fits --remove-extended --askap-ext-thresh 1.4 --sumss-ext-thresh 1.4 --transient-max-separation 20.0 --postage-stamps --sumss-mosaic-dir /path/to/sumss_mosaics_dir --write-ann --transients --db-username user123 --db-database racstest --db-tag "first test" --website-media-dir /my/website/static/media
Or we can configure the parset file to run:
[GENERAL]
output_tag=example
log_level=INFO
nice=10
clobber=True
[ANALYSIS]
frequency=864e6
weight_crop=False
weight_crop_value=0.04
weight_crop_image=../path/to/weight_cropped_image.fits
convolve=False
convolved_image=../path/to/convolved_image.fits
convolved_non_conv_askap_csv=../path/to/preconvolved_catalog.csv
sourcefinder=aegean
# aegean_settings_config=None
# pybdsf_settings_config=None
# selavy_settings_config=None
boundary_value=nan
askap_flux_error=0.1
[CATALOGUES]
askap_csv=/path/to/askap_catalog.csv
# sumss_csv=None
# nvss_csv=None
askap_csv_format=aegean
write_ann=True
[CROSSMATCHING]
diagnostic_max_separation=5.0
transient_max_separation=20.0
remove_extended=True
askap_ext_thresh=1.4
sumss_ext_thresh=1.4
nvss_ext_thresh=1.2
use_all_fits=False
[TRANSIENTS]
transients=True
transients_askap_sumss_snr_thresh=5.0
transients_large_flux_ratio_thresh=2.0
[POSTAGESTAMPS]
postage_stamps=True
postage_stamp_selection=all
postage_stamp_ncores=2
postage_stamp_radius=13.0
postage_stamp_zscale_contrast=0.25
sumss_mosaic_dir=/directory/where/sumss/mosaics/are/kept
nvss_mosaic_dir=/directory/where/nvss/mosaics/are/kept
[DATABASE]
db_engine=postgresql
db_username=user123
db_host=localhost
db_port=5432
db_database=racstest
db_tag=first test
website_media_dir=/path/to/the/website/media/dir
And then run the pipeline like so:
processASKAPimage.py -c myparset.in image.askap.mosaic.restored.fits
Output: The results will be placed in `image.askap.mosaic.restored_results`.
The default aegean settings are:
cores=1
maxsummits=5
seedclip=5
floodclip=4
nocov=True
These can be changed by providing a config file and supplying it to the argument `--aegean-settings-config`. There should be a standard ConfigParser header `[aegean]`. E.g.:
[aegean]
cores=12
maxsummits=5
seedclip=6
floodclip=4
autoload=True
To deactivate a setting remove it from the config file.
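For example, assuming the settings above are saved in a file called aegean_settings.cfg (the name is arbitrary):
processASKAPimage.py image.fits --aegean-settings-config aegean_settings.cfg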