rubyvanrooyen / data_processing

Astronomy data processing scripts

Setup

Installation

Python virtual env
Notes on setting up a python virtual environment dedicated to getting data out of the archive. The exact command depends on the python version you have installed:

python3.x -m venv venv3.x

e.g.

python3.6 -m venv venv3.6
python3.7 -m venv venv3.7

make your python virtual env version specific

python3.7 -m venv venv3.7
source venv3.7/bin/activate

upgrade venv default pip version for newer wheels

pip install -U pip setuptools wheel

exit venv

deactivate
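The venv steps above can be wrapped in a small helper. This is a minimal sketch, assuming a `python3.x` interpreter is on the PATH; `venv_dir_for` and `make_venv` are hypothetical names used here for illustration only.

```shell
# Sketch only: venv_dir_for and make_venv are hypothetical helper names.

# Map a python version such as "3.7" to the directory name used above.
venv_dir_for() {
    printf 'venv%s\n' "$1"
}

# Create the version-specific venv if it is missing, then upgrade pip,
# setuptools and wheel inside it without activating it.
make_venv() {
    venv_version="$1"
    venv_dir="$(venv_dir_for "$venv_version")"
    if ! command -v "python${venv_version}" >/dev/null 2>&1; then
        echo "python${venv_version} not found on PATH" >&2
        return 1
    fi
    if [ ! -d "$venv_dir" ]; then
        "python${venv_version}" -m venv "$venv_dir" || return 1
    fi
    "$venv_dir/bin/python" -m pip install -U pip setuptools wheel
}
```

For example, `make_venv 3.7` followed by `source venv3.7/bin/activate` should give the same result as the manual steps.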

Data

Installation

activate python virtual environment

source venv3.7/bin/activate

install katdal and python-casacore for data extraction

pip install katdal
pip install python-casacore
pip cache purge

exit venv

deactivate

Extraction

Get data from archive
Because the wideband results will be mapped onto the narrowband data, do the MS conversion of both datasets consistently. Run the extraction inside a screen/tmux session, e.g. start one with tmux new -s 3c39, or list and reattach to an existing one:

tmux ls
tmux a -t 3c39

activate python virtual env

source venv3.7/bin/activate

extract wideband data

mvftoms.py -f --flags cam <katdaltoken>

extract narrowband data

mvftoms.py -a -f --flags cam -C <chan,range> <katdaltoken>

make ms read-only

chmod -w <msfile>.ms/

[optional] create symbolic link for easy access

ln -s data/<msfile>.ms

exit venv

deactivate
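To confirm that the read-only step took effect, the mode string of the MS directory can be inspected. This is a minimal sketch: `is_readonly` is a hypothetical helper, and it only looks at the top-level directory, which is also all that a non-recursive `chmod -w` changes.

```shell
# Sketch only: is_readonly is a hypothetical helper name.
# Returns 0 if the directory has no write bits set, 1 if it has, 2 if it is
# not a directory.
is_readonly() {
    [ -d "$1" ] || return 2
    # ls -ld prints a mode string such as "dr-xr-xr-x"; characters 3, 6 and 9
    # are the user, group and other write bits.
    mode="$(ls -ld "$1" | cut -c1-10)"
    case "$mode" in
        ??w*|?????w*|????????w*) return 1 ;;
        *) return 0 ;;
    esac
}

# Demonstration on a throwaway directory standing in for <msfile>.ms
demo_root="$(mktemp -d)"
mkdir "$demo_root/demo.ms"
chmod -w "$demo_root/demo.ms"
if is_readonly "$demo_root/demo.ms"; then
    echo "demo.ms is read-only"
fi
rm -rf "$demo_root"
```

The check reads the permission bits instead of using `[ -w ... ]`, because the latter reports success for root regardless of the mode.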

Pipeline processing

Installation

Install caracal

activate python virtual env

source venv3.7/bin/activate

install processing pipeline software

# the old branch (add_polcal) has an error in the setjy models
pip install --no-cache-dir git+https://github.com/caracal-pipeline/caracal.git@add_polcal
# install the newest master to correct the setjy model error
pip install --no-cache-dir git+https://github.com/caracal-pipeline/caracal.git
# check that the caracal command is available
caracal -h
pip cache purge

exit venv

deactivate

Processing

config file

Construct the caracal pipeline config file (<whatever>.yml).
Example configuration files are in the configs folder.

activate python virtual env

source venv3.7/bin/activate

caracal

Run pipeline

ln -s <scratch_data> ms-orig

caracal -c <config_file>.yml

# sw : start-worker
# ew : end-worker
caracal -c <config_file>.yml -sw general -ew transform__calibrators
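When rerunning only part of the pipeline, it can help to print the command before executing it. This is a minimal sketch: `caracal_cmd` is a hypothetical helper that only builds the command line from the `-c`, `-sw` and `-ew` options shown above and does not run caracal itself.

```shell
# Sketch only: caracal_cmd is a hypothetical helper name.
# Usage: caracal_cmd <config> [start-worker] [end-worker]
caracal_cmd() {
    cfg="$1"
    start_worker="${2:-}"
    end_worker="${3:-}"
    cmd="caracal -c $cfg"
    if [ -n "$start_worker" ]; then
        cmd="$cmd -sw $start_worker"
    fi
    if [ -n "$end_worker" ]; then
        cmd="$cmd -ew $end_worker"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for review; run it by hand once it looks right.
caracal_cmd "<config_file>.yml" general transform__calibrators
```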

To quickly clean up the temporary output products generated by caracal

make clean

To remove all output, as well as

make clobber

exit venv

deactivate

Visualisation

The caracal pipeline produces diagnostic plots in Jupyter notebook format, as well as FITS images. The easiest way to view these is with radiopadre.

Installation

Install radiopadre on a remote system to view caracal output on the server. See the full description in the Gdoc Radiopadre and Radiovangelize.

activate python virtual env

On linux

source venv3.7/bin/activate

On mac

pyenv local 3.7.10

install radiopadre

pip install git+https://github.com/ratt-ru/radiopadre-client.git@b1.2.pre2

start server

On server

run-radiopadre -V --auto-init .

On remote/local system

run-radiopadre -V com14.science.kat.ac.za: --auto-init

This should install the server side software as well as spin up the first notebook and carta instances and open browser tabs if a browser is available.

debug/verification

check that the system is listening on the relevant ports

ss -tpl

When you exit the session, the allocated ports will be released, but as usual this takes a little time. Check which ports are still allocated with

ss -a
ss -a | grep TIME-

The number of ports should decrease as the resources are released

ss -a | grep TIME- | wc -l
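The manual check above can be turned into a short polling loop. This is a minimal sketch, assuming `ss` is available on the server; `count_time_wait` and `wait_for_ports` are hypothetical helper names, and the retry count and 5 second sleep are arbitrary choices.

```shell
# Sketch only: count_time_wait and wait_for_ports are hypothetical helpers.

# Count lines containing "TIME-" in `ss -a` style output read from stdin.
count_time_wait() {
    # grep -c exits non-zero when nothing matches, so mask that status
    grep -c 'TIME-' || true
}

# Poll until no TIME-* sockets remain, giving up after a number of tries
# (default 12). Other services on a busy server may keep the count above zero.
wait_for_ports() {
    tries="${1:-12}"
    while [ "$tries" -gt 0 ]; do
        remaining="$(ss -a | count_time_wait)"
        if [ "$remaining" -eq 0 ]; then
            return 0
        fi
        echo "$remaining sockets still in a TIME- state"
        sleep 5
        tries=$((tries - 1))
    done
    return 1
}
```

`grep -c` is used in place of `grep | wc -l` so that the count is a plain number that can be compared directly.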

Usage

On server

run-radiopadre -V .

On remote/local system

run-radiopadre -V com14.science.kat.ac.za:

Notes

If you have radiopadre running on the server side, but want to view it on a local system, use port forwarding

ssh -XY -L 5050:localhost:1024 -L 5051:localhost:1027 -L 5052:localhost:1028 com14
http://localhost:5050/tree?#notebooks
http://localhost:5052/?socketUrl=ws://localhost:5051
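The remote ports may differ between radiopadre sessions, so the forwarding command can be parameterised. This is a minimal sketch: `forward_cmd` is a hypothetical helper that only prints the ssh command, the local ports 5050 to 5052 are kept as in the example above, and the defaults 1024, 1027 and 1028 are simply the remote ports from that example.

```shell
# Sketch only: forward_cmd is a hypothetical helper name.
# Usage: forward_cmd <host> [remote-port-1] [remote-port-2] [remote-port-3]
# The three remote ports map to local ports 5050, 5051 and 5052 in that order.
forward_cmd() {
    host="$1"
    port1="${2:-1024}"
    port2="${3:-1027}"
    port3="${4:-1028}"
    printf 'ssh -XY -L 5050:localhost:%s -L 5051:localhost:%s -L 5052:localhost:%s %s\n' \
        "$port1" "$port2" "$port3" "$host"
}

# Print the command for the example server used above.
forward_cmd com14
```

The local URLs stay the same as listed above, since only the remote side of each forward changes.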

Uninstall

pip uninstall radiopadre-client
pip uninstall radiopadre

-fin-