Closed: bennahugo closed this issue 1 year ago
As a temporary kludge, it would be necessary to run wsclean before this and store the computed weights in an IMAGING_WEIGHT_SPECTRUM column to force Briggs weighting on the visibilities:
-store-imaging-weights
Will store the imaging weights in a column named 'IMAGING_WEIGHT_SPECTRUM'.
I would be happy to have this capability, but it shouldn't be necessary if the data are clean enough (by which I mean mostly free of RFI and calibration errors). In fact, you should do substantially better on fields with lots of fluffy stuff with just natural weighting. The real problem is point sources, but I am adding a feature to deal with that soon.
The problem is that you still want to control the resolution of the extended emission. Specifically, for this field we are looking at emission that is below the resolution of the naturally weighted beam. Also, as I said, we need to be extremely careful with the bright fluffy stuff and its contamination of the surrounding extended emission.
I added my old uniform weighting code here. @bennahugo @o-smirnov I would appreciate it if you could have a look when you have a chance as I am not 100% sure it is doing the right thing (dirty images not exactly the same as wsclean's uniform weighting). Also, it is currently very slow. Suggestions on how to speed it up are welcome (also tagging @sjperkins @JSKenyon)
I am fiddling with ways to speed it up - fairly confident I can come up with something.
I had a quick look at this. I think your indexing is swapped (which could explain your tragic cache performance as well) and your cells are not in radians (I think?). I also think you should round rather than snap to the leftmost grid point, if I understand the code correctly.
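For concreteness on the radians point (example numbers, not values from the code): image cell sizes are usually quoted in arcseconds, but the uv grid extent `umax = 1 / cell` only comes out in wavelengths if the cell is in radians.

```python
import numpy as np

# Example only: a 2 arcsec image cell, converted before taking 1/cell.
cell_asec = 2.0
cell_rad = cell_asec * np.pi / (180.0 * 3600.0)  # arcsec -> radians
umax = 1.0 / cell_rad  # uv half-extent in wavelengths
print(umax)
```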
Instead of reinventing the wheel, why not use the Perley gridder in nearest-neighbour mode and grid unity * weight onto the grid to get the uniform counts?
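The nearest-neighbour idea can also be prototyped with plain numpy for checking purposes, since gridding unity * weight with nearest-neighbour assignment is just a weighted 2-D histogram. A toy sketch (`uniform_counts` is a hypothetical helper for illustration, not the Perley gridder):

```python
import numpy as np

def uniform_counts(u, v, w, nx, ny, umax, vmax):
    """Nearest-neighbour grid of weights w onto an (nx, ny) grid spanning
    [-umax, umax] x [-vmax, vmax]; the per-cell sums are the uniform
    weighting counts. Hypothetical helper for illustration only."""
    counts, _, _ = np.histogram2d(
        u, v, bins=(nx, ny),
        range=((-umax, umax), (-vmax, vmax)), weights=w)
    return counts

# toy example: three unit-weight samples, two of which share a cell
u = np.array([-0.5, -0.4, 0.5])
v = np.array([0.5, 0.5, -0.5])
w = np.ones(3)
c = uniform_counts(u, v, w, nx=2, ny=2, umax=1.0, vmax=1.0)
print(c)  # cell (0, 1) accumulates 2.0; cell (1, 0) gets 1.0
```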
Cheers,
Benjamin Hugo
Checking in with my current best effort:
```python
import numpy as np
from numba import njit


@njit(nogil=True, fastmath=True, cache=True)
def compute_wsums_band(uvw, weights, freqs, nx, ny, cell_size_x, cell_size_y,
                       dtype):
    # u coordinates of the grid (right cell edges)
    umax = 1.0 / cell_size_x
    ug = np.linspace(-umax, umax, nx + 1)[1:]
    u_diff = ug[1] - ug[0]
    # v coordinates of the grid (right cell edges)
    vmax = 1.0 / cell_size_y
    vg = np.linspace(-vmax, vmax, ny + 1)[1:]
    v_diff = vg[1] - vg[0]
    # initialise array to store counts
    counts = np.zeros((nx, ny), dtype=dtype)
    # accumulate counts
    nchan = freqs.size
    nrow = uvw.shape[0]
    normfreqs = freqs / 299792458  # frequency / speed of light = 1 / lambda
    for r in range(nrow):
        uvw_row = uvw[r]
        for c in range(nchan):
            # current uv coordinates in wavelengths
            u_tmp = uvw_row[0] * normfreqs[c]
            v_tmp = uvw_row[1] * normfreqs[c]
            # corresponding grid indices
            u_idx = int((u_tmp + umax) // u_diff - 1)
            v_idx = int((v_tmp + vmax) // v_diff - 1)
            counts[u_idx, v_idx] += weights[r, c]
    return counts
```
which is about 40x faster on my laptop for about a GB of weights on a 1024x1024 grid. I have confirmed that this produces the same counts array as the existing code. Known issues include the lack of clamping: things will break if the coordinates lie outside the grid. The most expensive operation is now the accumulation into counts, which makes a degree of sense as the sums land in effectively random locations. This can be very slightly improved by sorting uvw, but it probably isn't worthwhile.
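A sketch of the clamping fix alluded to above (my assumption about how it could be handled, not the snippet's current behaviour): clip the computed indices into `[0, nx - 1]` so that off-grid coordinates cannot index out of bounds.

```python
import numpy as np

# Example grid: nx cells of width u_diff spanning [-umax, umax].
nx, umax, u_diff = 4, 2.0, 1.0
u = np.array([-2.5, -1.2, 0.3, 2.7])  # first and last fall off the grid

# direct index formula from the snippet above
u_idx = ((u + umax) // u_diff - 1).astype(np.int64)
# assumed fix: clamp into the valid index range before accumulating
u_idx_clamped = np.clip(u_idx, 0, nx - 1)
print(u_idx, u_idx_clamped)  # [-2 -1 1 3] -> [0 0 1 3]
```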
While I am sure everyone can figure out what I am doing, for the sake of my own memory: this directly computes the index at which the coordinate would appear given the original partitioning. The minus one is a bit of a fudge and can likely be removed by fiddling elsewhere, but I will leave that to your discretion @landmanbester.
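One way to sanity-check the direct index computation (purely illustrative, with example grid values): on the right-edge grid built above, it should agree with finding the last grid point strictly below `u` via `np.searchsorted`.

```python
import numpy as np

# Same grid construction as in the kernel (example sizes).
nx, umax = 8, 4.0
ug = np.linspace(-umax, umax, nx + 1)[1:]  # right edges of the uv cells
u_diff = ug[1] - ug[0]

# sample points strictly inside the grid, away from cell edges
u = np.linspace(-2.95, 3.95, 70)

# direct formula from the snippet
idx_direct = ((u + umax) // u_diff - 1).astype(np.int64)
# reference: index of the last grid point strictly below u
idx_search = np.searchsorted(ug, u) - 1
assert np.array_equal(idx_direct, idx_search)
```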
I was mistaken. You would think I would have learned my lesson by now, but never trust that you know where the bottleneck is when using compiled code. Here is another revision which is now ~500x faster than the original:
```python
import numpy as np
from numba import njit


@njit(nogil=True, fastmath=True, cache=True)
def compute_wsums_band(uvw, weights, freqs, nx, ny, cell_size_x, cell_size_y,
                       dtype):
    # u coordinates of the grid (right cell edges)
    umax = 1.0 / cell_size_x
    ug = np.linspace(-umax, umax, nx + 1)[1:]
    u_diff = ug[1] - ug[0]
    # v coordinates of the grid (right cell edges)
    vmax = 1.0 / cell_size_y
    vg = np.linspace(-vmax, vmax, ny + 1)[1:]
    v_diff = vg[1] - vg[0]
    # initialise array to store counts
    counts = np.zeros((nx, ny), dtype=dtype)
    # accumulate counts
    nchan = freqs.size
    nrow = uvw.shape[0]
    normfreqs = freqs / 3e8  # can be replaced with a more accurate value
    for r in range(nrow):
        uvw_row = uvw[r]
        for c in range(nchan):
            # hoist the channel scaling out of the index computations
            chan_normfreq = normfreqs[c]
            u_tmp = uvw_row[0] * chan_normfreq
            v_tmp = uvw_row[1] * chan_normfreq
            # corresponding grid indices
            u_idx = int(np.floor_divide(u_tmp + umax, u_diff)) - 1
            v_idx = int(np.floor_divide(v_tmp + vmax, v_diff)) - 1
            counts[u_idx, v_idx] += weights[r, c]
    return counts
```
Note that your existing code is actually faster than mine for very small problem sizes (<<1000 rows), but scales horribly with nrow due to the array comparison. This approach doesn't do any more work per iteration as nrow grows, so it should scale out nicely. Anyhow, I think ~500x is the point at which I am willing to say it is good enough.
Woah, ok. Thanks @JSKenyon, let me have a go with that. It does not have to be super optimal since I only have to compute the weights at the outset; I was more worried about the correctness of it all. @bennahugo, from what I understand about uniform weighting, you are supposed to accumulate the weights in each Fourier cell and then use the inverse to weight each data point. Here I just find the grid point corresponding to uv by checking which is the last grid point less than uv. Is that not correct? What is the difference between uniform weighting and superuniform weighting again?
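The accumulate-then-invert scheme described in that question can be sketched in a few lines of plain numpy (a toy illustration with made-up cell indices and weights, not pfb-clean's actual implementation):

```python
import numpy as np

w = np.array([1.0, 1.0, 1.0, 1.0])  # natural weights (made up)
u_idx = np.array([0, 0, 0, 1])      # uv cell of each sample (made up)
v_idx = np.array([0, 0, 1, 1])

# accumulate the weights falling in each Fourier cell
counts = np.zeros((2, 2))
np.add.at(counts, (u_idx, v_idx), w)

# uniform weight = natural weight / total weight in that cell
w_uniform = w / counts[u_idx, v_idx]
print(w_uniform)  # values: 0.5, 0.5, 1.0, 1.0 (each cell now sums to unity)
```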
The order of magnitude comes from:
1. moving `chan_normfreq` into a register
2. using `np.floor_divide` instead of `//`
3. anything else?
Yeah, `chan_normfreq` gets you roughly a factor of two (you avoid looking it up twice). Using `np.floor_divide` seems to be ~5x faster than normal integer division. My best guess is that the numpy routine is better optimised, but I would need to dig into the lower-level details to really know. You can also shave time off both implementations by doing `export OPENBLAS_NUM_THREADS=1`, but that is overkill. Any additional speed-up comes from my changing the way the inner loop works.
The approach should be relatively straightforward to apply to the other kernels, but shout if you need help @landmanbester. I should reiterate that I just matched what was there - I don't know enough about gridding to comment on correctness.
Using `np.floor_divide` seems to be ~5x faster than normal integer division.
This is useful to know!
Here I just find the grid point corresponding to uv by checking what the last grid point is that is less than uv. Is that not correct?
Shouldn't you be rounding properly?
What is the difference between uniform weighting and super uniform weighting again?
In superuniform weighting you make the grid cells smaller (or larger) by a factor of N. It's like computing uniform weighting for an image that is larger (or smaller) by a factor of N.
Shouldn't you be rounding properly?
Well, I think that is what @bennahugo also suggested, but I am not sure what you mean by rounding. Am I not just supposed to accumulate the weights of all uv coordinates falling in a cell?
Yeah true, you need to divide by cellsize and simply round down. I was confused for a moment.
Should be working now. I would appreciate it if someone could give it a test drive.
This went in a while ago. Please reopen as/if you encounter issues @bennahugo
I think this is the most essential option that is currently missing. There is currently no way to force the beam into a more Gaussian form so that flank contamination can be eliminated. This is a bit of a problem when working in the plane, where there is a ton of background diffuse emission that can be hidden by foreground HII regions, etc.
As discussed at the end here: https://skaafrica.atlassian.net/wiki/spaces/ESDKB/pages/41091107/Sensitivity+calculators