ratt-ru / shadeMS

Rapid Measurement Set plotting with dask-ms and datashader

SPW / DATA_DESC_ID plotting doesn't seem to work #84

Open IanHeywood opened 3 years ago

IanHeywood commented 3 years ago

I'm trying to plot the model column for all sources from a MS with 8 SPWs. I'm using this container:

/software/astro/caracal/STIMELA_IMAGES_1.6.8/stimela_shadems_1.7.1.sif

which contains:

$ shadems --version
shadems version 0.4.0

This command:

$ shadems --xaxis FREQ --yaxis MODEL_DATA:amp:XX --colour-by FIELD_ID 1601142995_calibrators.ms

finished, but took about an hour, and only plotted the first SPW.

Trying to colour by SPW:

$ shadems --xaxis FREQ --yaxis MODEL_DATA:amp:XX --colour-by DATA_DESC_ID 1601142995_calibrators.ms

causes an outright crash:

2020-11-30 18:23:33 - shadems - INFO - ------------------------------------------------------
2020-11-30 18:23:33 - shadems - INFO - : Data selected for plotting:
2020-11-30 18:23:33 - shadems - INFO - Antenna(s)       : all
2020-11-30 18:23:33 - shadems - INFO - Baseline(s)      : all except autocorrelations
2020-11-30 18:23:33 - shadems - INFO - Field(s)         : all
2020-11-30 18:23:33 - shadems - INFO - SPW(s)           : all
2020-11-30 18:23:33 - shadems - INFO - Scan(s)          : all
2020-11-30 18:23:33 - shadems - INFO - Channels         : all
2020-11-30 18:23:33 - shadems - INFO - Corr/Stokes      : XX XY YX YY
2020-11-30 18:23:33 - shadems - INFO - ------------------------------------------------------
2020-11-30 18:23:33 - shadems - INFO - loading minmax cache from 1601142995_calibrators-minmax-cache.json
2020-11-30 18:23:33 - shadems - INFO - axis: FREQ, range (None, None), discretization None
2020-11-30 18:23:33 - shadems - INFO - axis: amp(MODEL_DATA), corr 0, range (None, None), discretization None
2020-11-30 18:23:33 - shadems - INFO - axis: _(DATA_DESC_ID), corr False, range (None, None), discretization 16
2020-11-30 18:23:33 - shadems - INFO -                  : you have asked for 1 plots employing 3 unique datums
2020-11-30 18:23:36 - shadems - INFO - : Indexing MS and building dataframes (2049600 rows, chunk size is 5000)
2020-11-30 19:01:50 - shadems - INFO - : complete
2020-11-30 19:01:50 - shadems - INFO - : rendering 1 dataframes with 1.05e+09 points into 1 plot types
2020-11-30 19:01:50 - shadems - INFO - : rendering plot-1601142995_calibrators-MODEL_DATA-XX-amp-FREQ-DATA_DESC_ID.png
2020-11-30 19:01:50 - shadems - INFO - : scanning axis min/max for MODEL_DATA_amp_0 DATA_DESC_ID___False

Traceback (most recent call last):
  File "/usr/local/bin/shadems", line 8, in <module>
    main.main([a for a in sys.argv[1:]])
  File "/usr/local/lib/python3.6/dist-packages/shade_ms/main.py", line 795, in main
    render_single_plot(df, subset, xdatum, ydatum, adatum, ared, cdatum, pngname, title, xlabel, ylabel)
  File "/usr/local/lib/python3.6/dist-packages/shade_ms/main.py", line 744, in render_single_plot
    options=options)
  File "/usr/local/lib/python3.6/dist-packages/shade_ms/data_plots.py", line 345, in create_plot
    active_subset = OrderedDict(enumerate(map(str, range(bounds[caxis][1]+1))))
TypeError: 'numpy.float64' object cannot be interpreted as an integer
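
For reference, the failing line passes a floating-point upper bound straight into `range()`, which only accepts integers. The snippet below is a minimal sketch of that failure and of one possible workaround (casting to `int` first). It is not the shadems code or a confirmed fix: `build_active_subset` is a hypothetical helper, and a plain Python `float` stands in for the `numpy.float64` in the traceback, which `range()` rejects in the same way.

```python
from collections import OrderedDict


def build_active_subset(upper_bound):
    """Hypothetical stand-in for the expression on data_plots.py line 345.

    Casts the upper bound to int before calling range(), and refuses
    values that are not whole numbers rather than silently truncating.
    """
    as_int = int(upper_bound)
    if as_int != upper_bound:
        raise ValueError(f"expected an integral bound, got {upper_bound!r}")
    return OrderedDict(enumerate(map(str, range(as_int + 1))))


# Assumed value: 8 SPWs, so the largest DATA_DESC_ID would be 7,
# stored as a float the way an axis min/max scan might return it.
upper = 7.0

# Reproduce the failure: range() does not accept a float argument.
try:
    range(upper + 1)
except TypeError as exc:
    print("range() failed:", exc)

# The cast version builds the mapping for IDs 0..7.
subset = build_active_subset(upper)
print(list(subset.items()))
```

Whether a cast is the right fix depends on why the bound is a float in the first place (for example, a min/max scan that returns floats for an integer column), which the traceback alone does not show.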

IanHeywood commented 3 years ago

The timestamps in the log show where the time goes: about 38 minutes pass between the indexing message (18:23:36) and the "complete" message (19:01:50), before rendering starts. This is on an IDIA node that I have (I believe) reserved in its entirety via:

$ salloc --time=72:00:00 --partition=Main --ntasks=1 --nodes=1 --cpus-per-task=32 --mem=230GB

IanHeywood commented 3 years ago

Same behaviour on a VLA P-band MS.

IanHeywood commented 3 years ago

(Except it didn't take an hour. It took about 75 seconds on my desktop machine.)