ratt-ru / shadeMS

Rapid Measurement Set plotting with dask-ms and datashader

Ensure matplotlib bounds are computed for baseline arrays and fix regression in iter-baselines (#97 #105) #104

Closed bennahugo closed 1 year ago

bennahugo commented 1 year ago

Closes #97 Closes #105

@sharmilagoedhart please verify that this is working for you

This should now work:

shadems --xaxis FREQ --yaxis DATA:amp msdir/DEEP_2.1491291289.1ghz.1.1ghz.4hrs.1gc.ms

and this as well:

shadems --xaxis WAVEL --yaxis DATA:amp msdir/DEEP_2.1491291289.1ghz.1.1ghz.4hrs.1gc.ms

SharmilaGoedhart commented 1 year ago

Yes, it now runs.

[Plots: plot-bpcal-av-noflags-DATA-YY-amp-FREQ, plot-bpcal-av-noflags-DATA-YY-amp-CHAN, plot-bpcal-av-noflags-DATA-YY-amp-WAVEL]

SharmilaGoedhart commented 1 year ago

But those plots should not be so different. All I changed was the x-axis:

shadems --xaxis CHAN --yaxis DATA:amp --corr YY bpcal-av-noflags.ms/
shadems --xaxis FREQ --yaxis DATA:amp --corr YY bpcal-av-noflags.ms/
shadems --xaxis WAVEL --yaxis DATA:amp --corr YY bpcal-av-noflags.ms/

SharmilaGoedhart commented 1 year ago

--iter-baseline does not work

$ shadems --xaxis FREQ --yaxis DATA:amp --corr YY --iter-baseline bpcal-av-noflags.ms/
2022-10-05 16:20:33 - shadems - INFO - using colourmap colorcet.bkr
2022-10-05 16:20:33 - shadems - INFO - using colourmap cmasher.pride
2022-10-05 16:20:33 - shadems - INFO - using colourmap colorcet.glasbey_dark
2022-10-05 16:20:33 - shadems - INFO - /scratch2/sharmila/venv-shadems/bin/shadems --xaxis FREQ --yaxis DATA:amp --corr YY --iter-baseline bpcal-av-noflags.ms/
2022-10-05 16:20:33 - shadems - INFO - ------------------------------------------------------
2022-10-05 16:20:33 - shadems - INFO - : MS bpcal-av-noflags.ms contains 1830 rows
2022-10-05 16:20:33 - shadems - INFO - : (1, 32768) spectral windows and channels
2022-10-05 16:20:33 - shadems - INFO - : 1 fields: J1939-6342
2022-10-05 16:20:33 - shadems - INFO - : 1 scans: 1
2022-10-05 16:20:33 - shadems - INFO - : 60/60 antennas: 0:m000 1:m001 2:m002 3:m003 4:m004 5:m005 6:m006 7:m007 8:m008 9:m009 10:m010 11:m011 12:m012 13:m013 14:m014 15:m015 16:m016 17:m017 18:m018 19:m019 20:m020 21:m021 22:m022 23:m023 24:m024 25:m025 26:m026 27:m030 28:m031 29:m032 30:m033 31:m034 32:m035 33:m036 34:m037 35:m038 36:m039 37:m040 38:m041 39:m042 40:m043 41:m044 42:m045 43:m046 44:m047 45:m048 46:m049 47:m050 48:m051 49:m052 50:m053 51:m054 52:m055 53:m057 54:m058 55:m059 56:m060 57:m061 58:m062 59:m063
2022-10-05 16:20:33 - shadems - INFO - : 1830/1830 baselines present
2022-10-05 16:20:33 - shadems - INFO - : corrs/Stokes XX YY I Q
2022-10-05 16:20:33 - shadems - INFO - ------------------------------------------------------
2022-10-05 16:20:33 - shadems - INFO - : Data selected for plotting:
2022-10-05 16:20:33 - shadems - INFO - Antenna(s) : all
2022-10-05 16:20:33 - shadems - INFO - Baseline(s) : all except autocorrelations
2022-10-05 16:20:33 - shadems - INFO - Field(s) : all
2022-10-05 16:20:33 - shadems - INFO - SPW(s) : all
2022-10-05 16:20:33 - shadems - INFO - Scan(s) : all
2022-10-05 16:20:33 - shadems - INFO - Channels : all
2022-10-05 16:20:33 - shadems - INFO - Corr/Stokes : YY
2022-10-05 16:20:33 - shadems - INFO - ------------------------------------------------------
2022-10-05 16:20:33 - shadems - INFO - loading minmax cache from bpcal-av-noflags-minmax-cache.json
2022-10-05 16:20:33 - shadems - INFO - axis: FREQ, range (None, None), discretization None
2022-10-05 16:20:33 - shadems - INFO - axis: amp(DATA), corr None, range (None, None), discretization None
2022-10-05 16:20:33 - shadems - INFO - : you have asked for 1 plots employing 2 unique datums
2022-10-05 16:20:43 - shadems - INFO - : Indexing MS and building dataframes (1770 rows, chunk size is 5000)
Traceback (most recent call last):
  File "/scratch2/sharmila/venv-shadems/bin/shadems", line 8, in
    main.main([a for a in sys.argv[1:]])
  File "/scratch2/sharmila/venv-shadems/lib/python3.8/site-packages/shade_ms/main.py", line 419, in main
    data_plots.get_plot_data(ms, group_cols, mytaql, ms.chan_freqs,
  File "/scratch2/sharmila/venv-shadems/lib/python3.8/site-packages/shade_ms/data_plots.py", line 155, in get_plot_data
    a1 = da.minimum(group.ANTENNA1.data, group.ANTENNA2.data)
AttributeError: 'int' object has no attribute 'data'
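For anyone reading the log: the two row counts (1830 in the MS, 1770 selected) are consistent with 60 antennas and one row per baseline, with autocorrelations dropped by the default selection. A quick arithmetic check; the function name is only illustrative and is not part of shadems:

```python
def baseline_counts(n_ant):
    """Return (baselines including autocorrelations, cross-correlations only)."""
    cross = n_ant * (n_ant - 1) // 2   # unordered pairs of distinct antennas
    return cross + n_ant, cross        # add one autocorrelation per antenna


with_autos, cross_only = baseline_counts(60)
print(with_autos, cross_only)  # 1830 1770
```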

bennahugo commented 1 year ago

Hmm, but the second plot you are showing is wavelength (which would be reversed and have a somewhat non-linear step, with the top part of the band being the longer/bigger wavelength). Is CHAN showing up as WAVEL? That must be a bug, let me check.
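To illustrate the "reversed and non-linear" point: with evenly spaced channels in frequency, wavelength = c / frequency runs in the opposite direction and its step changes across the band. A minimal sketch; the start frequency, channel width and channel count are made-up numbers, not taken from this MS:

```python
C = 299_792_458.0  # speed of light in m/s


def channel_axes(f0_hz, df_hz, n_chan):
    """Return (frequencies, wavelengths) for evenly spaced frequency channels."""
    freqs = [f0_hz + i * df_hz for i in range(n_chan)]
    wavels = [C / f for f in freqs]
    return freqs, wavels


# Hypothetical band: 5 channels, 1.0 GHz to 1.4 GHz in 100 MHz steps.
freqs, wavels = channel_axes(1.0e9, 0.1e9, 5)
steps = [b - a for a, b in zip(wavels, wavels[1:])]

for chan, (f, w) in enumerate(zip(freqs, wavels)):
    print(f"chan {chan}: {f / 1e9:.1f} GHz -> {w:.4f} m")
print("wavelength steps (m):", [round(s, 4) for s in steps])
```

The frequency step is constant while the wavelength steps are all negative and shrink in magnitude, so a WAVEL plot should look like a mirrored and stretched version of the FREQ plot, not like the CHAN plot.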

bennahugo commented 1 year ago

I think you are confusing your plots @SharmilaGoedhart. I can't reproduce your first point.

This is what I get for chan: [image]

This is what I get for freq: [image]

SharmilaGoedhart commented 1 year ago

@bennahugo look at the RFI, and the 1420 HI line. A lot of data points missing in the second and third plots, but differently.

This data has no flags applied, btw.

Yes, my comments have the commands the wrong way around. I'll edit the original post.

bennahugo commented 1 year ago

Looking more carefully at this, imshow is probably not what you want to use for this sort of plotting, though? It does create interpolation artefacts, which I think is what we are actually seeing, more than "unflagged" RFI. @o-smirnov will need to chip in because this is his code.
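As a toy illustration of what is meant by interpolation artefacts (whether this is what actually happens inside shadems is not established here), resampling a single-channel spike with linear interpolation produces intermediate values that were never in the data, while nearest-neighbour resampling only repeats existing values. Both helper functions are hypothetical and use plain Python, not matplotlib:

```python
def resample_linear(values, n_out):
    """Resample a 1-D sequence onto n_out points using linear interpolation."""
    n_in = len(values)
    out = []
    for j in range(n_out):
        x = j * (n_in - 1) / (n_out - 1)   # position in input coordinates
        i = min(int(x), n_in - 2)          # left neighbour, clamped at the edge
        t = x - i                          # fractional distance to the right neighbour
        out.append((1 - t) * values[i] + t * values[i + 1])
    return out


def resample_nearest(values, n_out):
    """Resample a 1-D sequence onto n_out points by picking the nearest input sample."""
    n_in = len(values)
    return [values[int(j * (n_in - 1) / (n_out - 1) + 0.5)] for j in range(n_out)]


spectrum = [0, 0, 10, 0, 0]               # one narrow spike, e.g. a single RFI channel
print(resample_linear(spectrum, 9))       # contains 5.0 on either side of the spike
print(resample_nearest(spectrum, 9))      # contains only 0 and 10
```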

I suppose you are doing this for speed more than anything else?

bennahugo commented 1 year ago

I can confirm I can at least reproduce the --iter-baseline issue
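The traceback suggests that with --iter-baseline the ANTENNA1/ANTENNA2 values reach get_plot_data as plain ints rather than array-backed columns, so `.data` fails. That reading is an assumption, and the sketch below is not the actual fix in this PR; `FakeColumn`, `column_values` and `ordered_antenna_pairs` are hypothetical names used only to show a guard that accepts both forms:

```python
class FakeColumn:
    """Hypothetical stand-in for an array-backed column that exposes `.data`."""

    def __init__(self, values):
        self.data = list(values)


def column_values(col, n_rows):
    """Return per-row values whether `col` is array-backed or a bare scalar."""
    if hasattr(col, "data"):
        return list(col.data)
    return [col] * n_rows          # scalar grouping key: repeat it for every row


def ordered_antenna_pairs(ant1, ant2, n_rows):
    """Return (min, max) antenna pairs per row, mirroring the minimum/maximum step."""
    a1 = column_values(ant1, n_rows)
    a2 = column_values(ant2, n_rows)
    return [(min(p, q), max(p, q)) for p, q in zip(a1, a2)]


# Array-backed columns (no grouping by antenna) and bare ints (grouped by baseline).
print(ordered_antenna_pairs(FakeColumn([3, 0]), FakeColumn([1, 5]), 2))
print(ordered_antenna_pairs(7, 2, 1))
```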

SharmilaGoedhart commented 1 year ago

Yes, we (sci-ops) need to assess potentially problematic data rapidly. My first instinct is to look at the raw visibilities when I don't know what is going on.

bennahugo commented 1 year ago

@SharmilaGoedhart OK, you can try again with --iter-baseline

bennahugo commented 1 year ago

Ok I think this addresses the comments @sjperkins. I think going forward we can use titles plus issues in parentheses - I find quick cross references to original tickets easier to follow than digging through commit history when squashed?

sjperkins commented 1 year ago

> Ok I think this addresses the comments @sjperkins. I think going forward we can use titles plus issues in parentheses - I find quick cross references to original tickets easier to follow than digging through commit history when squashed?

Normally one would add Closes #xxxx to the first comment (I've added this to your comment to demonstrate). Then someone can navigate to the PR and then to the issues if they so desire. This adds another level of indirection, but I suspect it's slightly faster than manually taking the issue numbers and completing the issues URL. What do you think?

bennahugo commented 1 year ago

Ok I can accept this flow -- as long as it is clear what has been addressed in each PR

bennahugo commented 1 year ago

I think let's wait for @SharmilaGoedhart to look at her plots this morning before merging this one. If she is happy then I'm happy to proceed with merging.

I've had some thoughts about her comments re imshow interpolation tending to smear things into neighbouring bins, but this is a discussion for a different PR and one I need to have with @IanHeywood and @o-smirnov

sjperkins commented 1 year ago

> Ok I can accept this flow -- as long as it is clear what has been addressed in each PR

Also, adding Closes #xxxx will add navigable links to the sidebar:

[Screenshot 2022-10-06 091757]

> I think let's wait for @SharmilaGoedhart to look at her plots this morning before merging this one. If she is happy then I'm happy to proceed with merging.

OK, I'll approve in the meantime