Open o-smirnov opened 3 years ago
Could you try increasing the row chunks up an order of magnitude or two?
If the number of channels is low, the amount of data per chunk, and hence per aggregation step, will be small.
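To put rough numbers on that, here is a small sketch of how many bytes one row chunk holds. The helper name `chunk_nbytes` and the 16-channel, 4-correlation shape are made up for illustration; the only fixed fact used is that a column of three float64 values per row (such as UVW) costs 24 bytes per row.

```python
from math import prod


def chunk_nbytes(row_chunk, per_row_shape, itemsize):
    """Bytes held by one row chunk of a column.

    row_chunk     -- number of rows in the chunk
    per_row_shape -- shape of a single row's cell, e.g. (3,) or (nchan, ncorr)
    itemsize      -- bytes per element (8 for float64 or complex64)
    """
    return row_chunk * prod(per_row_shape) * itemsize


if __name__ == "__main__":
    for rows in (5_000, 50_000, 500_000):
        # Column with 3 float64 values per row, no channel axis.
        no_chan = chunk_nbytes(rows, (3,), 8)
        # Hypothetical visibility column: 16 channels x 4 correlations, complex64.
        with_chan = chunk_nbytes(rows, (16, 4), 8)
        print(f"{rows:>7} rows: {no_chan / 1e6:8.2f} MB (no channels), "
              f"{with_chan / 1e6:8.2f} MB (16 chan x 4 corr)")
```

With no channel axis, even a 50000-row chunk is only about 1.2 MB, so per-chunk overhead can easily dominate unless the row chunk is made much larger.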
On Wed, 11 Nov 2020, 17:13 Oleg Smirnov, notifications@github.com wrote:
Assigned #81 https://github.com/ratt-ru/shadeMS/issues/81 to @sjperkins https://github.com/sjperkins.
Already tried that (5000, 50000, 500000), but it didn't have any appreciable impact. Note that only the UVW column is read, and it doesn't have channels (and it can be read in its entirety in <1s using casacore.tables...)
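For anyone wanting to reproduce the comparison, a minimal timing sketch follows. It assumes python-casacore and dask-ms are installed; the helper names (`timed`, `read_uvw_casacore`, `read_uvw_daskms`) and the `ms_path` argument are placeholders for illustration, not anything taken from shadeMS itself.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def read_uvw_casacore(ms_path):
    """Read the whole UVW column in one call via python-casacore."""
    from casacore.tables import table  # imported lazily; needs python-casacore

    t = table(ms_path, ack=False)
    try:
        return t.getcol("UVW")
    finally:
        t.close()


def read_uvw_daskms(ms_path, row_chunks):
    """Read UVW through dask-ms with the given row chunk size."""
    from daskms import xds_from_ms  # imported lazily; needs dask-ms

    datasets = xds_from_ms(ms_path, columns=["UVW"], chunks={"row": row_chunks})
    # One dataset per (FIELD_ID, DATA_DESC_ID) group by default.
    return [ds.UVW.data.compute() for ds in datasets]


def compare(ms_path, row_chunk_sizes=(5_000, 50_000, 500_000)):
    """Print read times for casacore and for dask-ms at several chunk sizes."""
    _, dt = timed(read_uvw_casacore, ms_path)
    print(f"casacore getcol: {dt:.2f}s")
    for rows in row_chunk_sizes:
        _, dt = timed(read_uvw_daskms, ms_path, rows)
        print(f"dask-ms, row chunks={rows}: {dt:.2f}s")
```

Calling `compare(...)` with the path to the MS would show whether the dask-ms read alone accounts for the slowdown, or whether the time goes elsewhere in the plotting pipeline.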
I'll put it on the queue
For reference, I'll link to a casacore issue I created a long time ago. The results of your experiments seem to indicate that it may be related: https://github.com/casacore/casacore/issues/800
@sjperkins, I suspect something about the row ordering is not playing right with dask-ms here. A paltry 5.71e+08 data points (5GB MS) takes ~180 seconds to plot. MeerKAT MSs take the same time at 100x the size, so something is off...
Simple UV plot, no data column involved even:
MS is under /net/simon/home/oms/projects/sms-testing