ratt-ru / shadeMS

Rapid Measurement Set plotting with dask-ms and datashader

Error in plotting --xaxis FREQ vs --yaxis amp for MWA data #97

Closed devojyoti96 closed 1 year ago

devojyoti96 commented 2 years ago

While plotting MWA data with xaxis=frequency it shows the following error. The same dataset can be plotted with CASA plotms without any error.

    shadems --xaxis FREQ --yaxis CORRECTED_DATA:amp time_2014_05_04_01_26_04.50_freq_80.62.ms

OUTPUT:

    2021-11-15 10:38:57 - shadems - INFO - using colourmap colorcet.bkr
    2021-11-15 10:38:57 - shadems - INFO - using colourmap cmasher.pride
    2021-11-15 10:38:57 - shadems - INFO - using colourmap colorcet.glasbey_dark
    2021-11-15 10:38:57 - shadems - INFO - /usr/local/bin/shadems --xaxis FREQ --yaxis CORRECTED_DATA:amp /data1/devojyoti/PhD/data/paircars_gui1/basedir_for_2014_05_04/time_2014_05_04_01_24_02.25_freq_79.98/time_2014_05_04_01_26_04.50_freq_80.62_ref_B/freq_80.62_datetime_2014_05_04_01_26_04.50_bp/backup_ms/time_2014_05_04_01_26_04.50_freq_80.62_ref_1.ms
    2021-11-15 10:38:57 - shadems - INFO - ------------------------------------------------------
    2021-11-15 10:38:57 - shadems - INFO - : MS time_2014_05_04_01_26_04.50_freq_80.62.ms contains 8256 rows
    2021-11-15 10:38:57 - shadems - INFO - : (1, 8) spectral windows and channels
    2021-11-15 10:38:57 - shadems - INFO - : 1 fields: Sun
    2021-11-15 10:38:57 - shadems - INFO - : 1 scans: 1
    2021-11-15 10:38:57 - shadems - INFO - : 128/128 antennas: 0:Tile011 1:Tile012 2:Tile013 3:Tile014 4:Tile015 5:Tile016 6:Tile017 7:Tile018 8:Tile021 9:Tile022 10:Tile023 11:Tile024 12:Tile025 13:Tile026 14:Tile027 15:Tile028 16:Tile031 17:Tile032 18:Tile033 19:Tile034 20:Tile035 21:Tile036 22:Tile037 23:Tile038 24:Tile041 25:Tile042 26:Tile043 27:Tile044 28:Tile045 29:Tile046 30:Tile047 31:Tile048 32:Tile051 33:Tile052 34:Tile053 35:Tile054 36:Tile055 37:Tile056 38:Tile057 39:Tile058 40:Tile061 41:Tile062 42:Tile063 43:Tile064 44:Tile065 45:Tile066 46:Tile067 47:Tile068 48:Tile071 49:Tile072 50:Tile073 51:Tile074 52:Tile075 53:Tile076 54:Tile077 55:Tile078 56:Tile081 57:Tile082 58:Tile083 59:Tile084 60:Tile085 61:Tile086 62:Tile087 63:Tile088 64:Tile091 65:Tile092 66:Tile093 67:Tile094 68:Tile095 69:Tile096 70:Tile097 71:Tile098 72:Tile101 73:Tile102 74:Tile103 75:Tile104 76:Tile105 77:Tile106 78:Tile107 79:Tile108 80:Tile111 81:Tile112 82:Tile113 83:Tile114 84:Tile115 85:Tile116 86:Tile117 87:Tile118 88:Tile121 89:Tile122 90:Tile123 91:Tile124 92:Tile125 93:Tile126 94:Tile127 95:Tile128 96:Tile131 97:Tile132 98:Tile133 99:Tile134 100:Tile135 101:Tile136 102:Tile137 103:Tile138 104:Tile141 105:Tile142 106:Tile143 107:Tile144 108:Tile145 109:Tile146 110:Tile147 111:Tile148 112:Tile151 113:Tile152 114:Tile153 115:Tile154 116:Tile155 117:Tile156 118:Tile157 119:Tile158 120:Tile161 121:Tile162 122:Tile163 123:Tile164 124:Tile165 125:Tile166 126:Tile167 127:Tile168
    2021-11-15 10:38:58 - shadems - INFO - : 8256/8256 baselines present
    2021-11-15 10:38:58 - shadems - INFO - : corrs/Stokes XX YY I Q
    2021-11-15 10:38:58 - shadems - INFO - ------------------------------------------------------
    2021-11-15 10:38:58 - shadems - INFO - : Data selected for plotting:
    2021-11-15 10:38:58 - shadems - INFO - Antenna(s) : all
    2021-11-15 10:38:58 - shadems - INFO - Baseline(s) : all except autocorrelations
    2021-11-15 10:38:58 - shadems - INFO - Field(s) : all
    2021-11-15 10:38:58 - shadems - INFO - SPW(s) : all
    2021-11-15 10:38:58 - shadems - INFO - Scan(s) : all
    2021-11-15 10:38:58 - shadems - INFO - Channels : all
    2021-11-15 10:38:58 - shadems - INFO - Corr/Stokes : XX YY
    2021-11-15 10:38:58 - shadems - INFO - ------------------------------------------------------
    2021-11-15 10:38:58 - shadems - INFO - axis: FREQ, range (None, None), discretization None
    2021-11-15 10:38:58 - shadems - INFO - axis: amp(CORRECTED_DATA), corr None, range (None, None), discretization None
    2021-11-15 10:38:58 - shadems - INFO - : you have asked for 1 plots employing 2 unique datums
    2021-11-15 10:38:58 - shadems - INFO - : Indexing MS and building dataframes (8128 rows, chunk size is 5000)
    2021-11-15 10:38:58 - shadems - INFO - : complete
    2021-11-15 10:38:58 - shadems - INFO - : rendering 1 dataframes with 1.3e+05 points into 1 plot types
    2021-11-15 10:38:58 - shadems - INFO - : rendering plot-time_2014_05_04_01_26_04.50_freq_80.62-CORRECTED_DATA-XX-YY-amp-FREQ.png
    2021-11-15 10:38:58 - shadems - INFO - : scanning axis min/max for CORRECTED_DATA_amp_None
    Traceback (most recent call last):
      File "/usr/local/bin/shadems", line 4, in <module>
        __import__('pkg_resources').run_script('shadems==0.5.0', 'shadems')
      File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 667, in run_script
        self.require(requires)[0].run_script(script_name, ns)
      File "/usr/local/lib/python3.7/site-packages/pkg_resources/__init__.py", line 1464, in run_script
        exec(code, namespace, namespace)
      File "/usr/local/lib/python3.7/site-packages/shadems-0.5.0-py3.7.egg/EGG-INFO/scripts/shadems", line 8, in <module>
        main.main([a for a in sys.argv[1:]])
      File "/usr/local/lib/python3.7/site-packages/shadems-0.5.0-py3.7.egg/shade_ms/main.py", line 547, in main
        render_single_plot(df, subset, xdatum, ydatum, adatum, ared, cdatum, pngname, title, xlabel, ylabel)
      File "/usr/local/lib/python3.7/site-packages/shadems-0.5.0-py3.7.egg/shade_ms/main.py", line 496, in render_single_plot
        options=options)
      File "/usr/local/lib/python3.7/site-packages/shadems-0.5.0-py3.7.egg/shade_ms/data_plots.py", line 484, in create_plot
        aspect='auto', origin='lower', interpolation='nearest')
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/__init__.py", line 1352, in inner
        return func(ax, *map(sanitize_sequence, args), **kwargs)
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/axes/_axes.py", line 5599, in imshow
        im.set_extent(im.get_extent())
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/image.py", line 956, in set_extent
        self.axes.set_xlim((xmin, xmax), auto=None)
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 3526, in set_xlim
        left = self._validate_converted_limits(left, self.convert_xunits)
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 3440, in _validate_converted_limits
        converted_limit = convert(limit)
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/artist.py", line 203, in convert_xunits
        return ax.xaxis.convert_units(x)
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/axis.py", line 1480, in convert_units
        if munits._is_natively_supported(x):
      File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/units.py", line 66, in _is_natively_supported
        for thisx in x:
      File "/usr/local/lib/python3.7/site-packages/dask/array/core.py", line 1482, in __iter__
        for i in range(len(self)):
      File "/usr/local/lib/python3.7/site-packages/dask/array/core.py", line 1356, in __len__
        raise TypeError("len() of unsized object")
    TypeError: len() of unsized object

bennahugo commented 1 year ago

Can confirm, this is hitting me with MeerKAT data as well.

bennahugo commented 1 year ago

Fix proposed -- it looks like dask is unhappy about chunksize None on arrays when evaluating. @sjperkins maybe daskms should by default go to chunk size 1 for things like channel frequencies?

sjperkins commented 1 year ago

dask arrays cannot have chunks=None:

In [1]: import dask.array as da

In [2]: da.ones((1000,1000),chunks=None)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [2], line 1
----> 1 da.ones((1000,1000),chunks=None)

File ~/.cache/pypoetry/virtualenvs/dask-ms-jCyuTJVk-py3.8/lib/python3.8/site-packages/dask/array/wrap.py:60, in wrap_func_shape_as_first_arg(func, *args, **kwargs)
     54 if isinstance(shape, Array):
     55     raise TypeError(
     56         "Dask array input not supported. "
     57         "Please use tuple, list, or a 1D numpy array instead."
     58     )
---> 60 parsed = _parse_wrap_args(func, args, kwargs, shape)
     61 shape = parsed["shape"]
     62 dtype = parsed["dtype"]

File ~/.cache/pypoetry/virtualenvs/dask-ms-jCyuTJVk-py3.8/lib/python3.8/site-packages/dask/array/wrap.py:30, in _parse_wrap_args(func, args, kwargs, shape)
     27     dtype = func(shape, *args, **kwargs).dtype
     28 dtype = np.dtype(dtype)
---> 30 chunks = normalize_chunks(chunks, shape, dtype=dtype)
     32 name = name or funcname(func) + "-" + tokenize(
     33     func, shape, chunks, dtype, args, kwargs
     34 )
     36 return {
     37     "shape": shape,
     38     "dtype": dtype,
   (...)
     41     "name": name,
     42 }

File ~/.cache/pypoetry/virtualenvs/dask-ms-jCyuTJVk-py3.8/lib/python3.8/site-packages/dask/array/core.py:3030, in normalize_chunks(chunks, shape, limit, dtype, previous_chunks)
   3028     dtype = np.dtype(dtype)
   3029 if chunks is None:
-> 3030     raise ValueError(CHUNKS_NONE_ERROR_MESSAGE)
   3031 if isinstance(chunks, list):
   3032     chunks = tuple(chunks)

ValueError: You must specify a chunks= keyword argument.
This specifies the chunksize of your array blocks.

See the following documentation page for details:
  https://docs.dask.org/en/latest/array-creation.html#chunks

bennahugo commented 1 year ago

@sjperkins Specifically, this check inside dask.array.core is being triggered:

    def __len__(self):
        if not self.chunks:
            raise TypeError("len() of unsized object")

by an array with the following shape and chunking:

'xmin': dask.array<amin-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>
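
To see why a 0-d value like `xmin` trips this check, here is a minimal sketch using a hypothetical stand-in class. `LazyScalar` and `try_iterate` are made-up names for illustration, not dask API. The class copies only the `__len__` check quoted above, plus an `__iter__` that loops over `range(len(self))` as the dask frame in the original traceback does.

```python
class LazyScalar:
    """Hypothetical stand-in for a 0-d dask array (not dask's real class)."""

    def __init__(self, value):
        self._value = value
        self.chunks = ()  # a 0-d array has an empty chunks tuple

    def __len__(self):
        # Same check as the dask snippet quoted above.
        if not self.chunks:
            raise TypeError("len() of unsized object")
        return sum(self.chunks[0])

    def __iter__(self):
        # Mirrors the loop in the traceback; for a 0-d value len() raises
        # before anything is yielded.
        for _ in range(len(self)):
            yield self._value

    def compute(self):
        # Evaluating the lazy value gives back a plain number.
        return self._value


def try_iterate(obj):
    """Return the TypeError message raised when iterating obj, or None."""
    try:
        for _ in obj:
            pass
    except TypeError as exc:
        return str(exc)
    return None


xmin = LazyScalar(80.62)
print(try_iterate(xmin))  # len() of unsized object
print(xmin.compute())     # 80.62
```

Iterating the stand-in fails in the same way as the report, while evaluating it yields an ordinary scalar.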

sjperkins commented 1 year ago

It looks like the matplotlib code is treating it as an iterable (for thisx in x):

File "/home/devojyoti/.local/lib/python3.7/site-packages/matplotlib/units.py", line 66, in _is_natively_supported
    for thisx in x:
File "/usr/local/lib/python3.7/site-packages/dask/array/core.py", line 1482, in __iter__
    for i in range(len(self)):
File "/usr/local/lib/python3.7/site-packages/dask/array/core.py", line 1356, in __len__
    raise TypeError("len() of unsized object")
TypeError: len() of unsized object

Note that this fails for a reduction on a numpy array too:

----> 1 for x in np.ones((100, 100)).sum():
      2     print(x)

TypeError: 'numpy.float64' object is not iterable

Perhaps the mistake is that matplotlib is being passed a single value rather than an iterable?

bennahugo commented 1 year ago

Hmm something in the API must have changed. The limits either way should not be an iterable so I think this is the correct fix to apply.

Let me know if you want something changed - there should be no breaking changes with this - I've tried to keep it completely flexible on the axes
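
For anyone hitting the same traceback, one way to guarantee the limits are plain numbers is sketched below. This is an illustration under assumptions and not necessarily what the proposed fix does: `to_scalar_limit`, `to_extent` and `FakeLazy` are hypothetical names, and the sketch assumes a lazy value exposes `.compute()`, as dask arrays do.

```python
def to_scalar_limit(value):
    """Return value as a plain Python float, evaluating it first if it is lazy."""
    # Assumption: lazy values expose .compute(), as dask arrays do.
    if hasattr(value, "compute"):
        value = value.compute()
    return float(value)


def to_extent(xmin, xmax, ymin, ymax):
    """Build an (xmin, xmax, ymin, ymax) tuple of plain floats."""
    return tuple(to_scalar_limit(v) for v in (xmin, xmax, ymin, ymax))


class FakeLazy:
    """Hypothetical stand-in for a 0-d lazy array, for demonstration only."""

    def __init__(self, value):
        self._value = value

    def compute(self):
        return self._value


# Mixed lazy and plain inputs all come out as ordinary floats.
extent = to_extent(FakeLazy(80.0), FakeLazy(81.0), 0.0, 2.5)
print(extent)  # (80.0, 81.0, 0.0, 2.5)
```

A tuple like this can be passed as the `extent` argument of matplotlib's `imshow`, which then never sees a lazy object when it sets the axis limits.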
