ratt-ru / pfb-imaging

Preconditioned forward/backward clean algorithm
MIT License

Add briggs weighting options to control beamshape for mk data #30

Closed by bennahugo 1 year ago

bennahugo commented 3 years ago

Think this is the most essential option that is currently missing. There is currently no way to force the beam into a more Gaussian shape so that flank contamination can be eliminated. This is a bit of a problem when working in the Galactic plane, where there is a ton of background diffuse emission that can be hidden by foreground HII regions, etc.

As discussed at the end here: https://skaafrica.atlassian.net/wiki/spaces/ESDKB/pages/41091107/Sensitivity+calculators

bennahugo commented 3 years ago

As a temporary kludge it would be necessary to run wsclean beforehand and store the computed weights in an IMAGING_WEIGHT_SPECTRUM column to force Briggs weighting on the visibilities:

```
-store-imaging-weights
   Will store the imaging weights in a column named 'IMAGING_WEIGHT_SPECTRUM'
```
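
For reference, a minimal sketch (not from the thread) of how the stored column could be read back after running wsclean with `-store-imaging-weights`, assuming python-casacore is available and with "my.ms" as a placeholder path:

```python
# Sketch: inspect the wsclean-computed imaging weights stored in the MS.
# Assumes python-casacore; "my.ms" is a placeholder measurement set.
from casacore.tables import table

ms = table("my.ms", readonly=True)
if "IMAGING_WEIGHT_SPECTRUM" in ms.colnames():
    # per-row, per-channel imaging weights written by -store-imaging-weights
    imwgt = ms.getcol("IMAGING_WEIGHT_SPECTRUM")
    print(imwgt.shape, imwgt.dtype)
else:
    print("Run wsclean with -store-imaging-weights first.")
ms.close()
```
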
landmanbester commented 3 years ago

I would be happy to have this capability, but it shouldn't be necessary if the data is clean enough (by which I mean mostly free of RFI and calibration errors). In fact, you should do substantially better on fields with lots of fluffy stuff with just natural weighting. The real problem is point sources, but I am adding a feature to deal with that soon.

bennahugo commented 3 years ago

The problem is that you still want to control the resolution of the extended emission. Specifically, for this field we are looking at emission that is below the resolution of the naturally weighted beam. Also, as I said, we need to be extremely careful with the bright fluffy stuff and its contamination of the surrounding extended emission.

landmanbester commented 3 years ago

I added my old uniform weighting code here. @bennahugo @o-smirnov I would appreciate it if you could have a look when you have a chance, as I am not 100% sure it is doing the right thing (the dirty images are not exactly the same as with wsclean's uniform weighting). Also, it is currently very slow. Suggestions on how to speed it up are welcome (also tagging @sjperkins @JSKenyon)

JSKenyon commented 3 years ago

I am fiddling with ways to speed it up - fairly confident I can come up with something.

bennahugo commented 3 years ago

I had a quick look at this. I think your indexing is swapped (which could explain your tragic cache performance as well) and your cells are not in radians (I think??). You should also round, I think, and not snap to the leftmost grid point, if I understand the code correctly.

Instead of reinventing the wheel, why not use the Perley gridder in nearest-neighbour mode and grid unity * weight onto the grid to get the uniform counts?
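
As a rough illustration of the nearest-neighbour idea (a sketch, not code from the thread), unity-times-weight can be dropped onto a uv grid with `np.histogram2d`; the array shapes and the umax/vmax convention are assumed to match the kernels further down, and bin-edge handling may differ slightly at the grid boundary:

```python
# Sketch: accumulate weights onto a uv grid in nearest-neighbour fashion
# to obtain the per-cell sums used for uniform weighting.
# Assumes uvw (nrow, 3) in metres, weights (nrow, nchan), freqs (nchan,) in Hz,
# with umax = 1/cell_size_x and vmax = 1/cell_size_y as in the kernels below.
import numpy as np

def gridded_counts(uvw, weights, freqs, nx, ny, umax, vmax):
    scale = freqs / 299792458.0                  # metres -> wavelengths
    u = (uvw[:, 0:1] * scale[None, :]).ravel()   # (nrow*nchan,)
    v = (uvw[:, 1:2] * scale[None, :]).ravel()
    counts, _, _ = np.histogram2d(
        u, v, bins=[nx, ny],
        range=[[-umax, umax], [-vmax, vmax]],
        weights=weights.ravel())
    return counts
```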


JSKenyon commented 3 years ago

Checking in with my current best effort:

```python
import numpy as np
from numba import njit


@njit(nogil=True, fastmath=True, cache=True)
def compute_wsums_band(uvw, weights, freqs, nx, ny, cell_size_x, cell_size_y,
                       dtype):
    # maximum u coordinate of the grid
    umax = 1.0/cell_size_x
    # u grid end points
    ug = np.linspace(-umax, umax, nx + 1)[1:]
    u_diff = ug[1] - ug[0]
    # maximum v coordinate of the grid
    vmax = 1.0/cell_size_y
    # v grid end points
    vg = np.linspace(-vmax, vmax, ny + 1)[1:]
    v_diff = vg[1] - vg[0]
    # initialise array to store counts
    counts = np.zeros((nx, ny), dtype=dtype)
    # accumulate counts
    nchan = freqs.size
    nrow = uvw.shape[0]

    # scale frequencies by the speed of light (uvw in metres -> wavelengths)
    normfreqs = freqs / 299792458

    for r in range(nrow):
        uvw_row = uvw[r]
        for c in range(nchan):
            # current uv coordinates in wavelengths
            u_tmp = uvw_row[0] * normfreqs[c]
            v_tmp = uvw_row[1] * normfreqs[c]
            # u index
            u_idx = int((u_tmp + umax)//u_diff - 1)
            # v index
            v_idx = int((v_tmp + vmax)//v_diff - 1)
            counts[u_idx, v_idx] += weights[r, c]
    return counts
```

which is about 40x faster on my laptop for about a GB of weights on a 1024x1024 grid. I have confirmed that this produces the same counts array as the existing code. Known issues include clamping - things will break if the coordinates lie outside the grid. The most expensive operation is now the sum into counts - this makes a degree of sense, as the sums are into effectively random locations. This can be very slightly improved by sorting uvw, but it probably isn't worthwhile.

While I am sure everyone can figure out what I am doing, for the sake of my own memory, this directly computes the index at which the coordinate would appear given the original partitioning. The minus one is a bit of a fudge and can likely be removed by fiddling elsewhere, but I will leave that to your discretion @landmanbester.
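
On the clamping caveat above, a hedged sketch (not from the thread) of a small guard that could be applied to the computed indices so that coordinates falling just outside the grid do not write out of bounds; `clamp_index` is a hypothetical helper:

```python
from numba import njit

@njit(nogil=True, cache=True)
def clamp_index(idx, n):
    # Clamp a grid index into [0, n-1]; callers could instead choose to
    # drop out-of-range samples entirely.
    if idx < 0:
        return 0
    if idx > n - 1:
        return n - 1
    return idx
```

Inside the accumulation loop this would be used as `counts[clamp_index(u_idx, nx), clamp_index(v_idx, ny)] += weights[r, c]`; whether to clamp or to skip such samples is a design choice left open here.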

JSKenyon commented 3 years ago

I was mistaken - you would think I would have learned my lesson by now, but never trust that you know where the bottleneck is when using compiled code. Here is another revision, which is now ~500x faster than the original:

```python
import numpy as np
from numba import njit


@njit(nogil=True, fastmath=True, cache=True)
def compute_wsums_band(uvw, weights, freqs, nx, ny, cell_size_x, cell_size_y,
                       dtype):
    # maximum u coordinate of the grid
    umax = 1.0/cell_size_x
    # u grid end points
    ug = np.linspace(-umax, umax, nx + 1)[1:]
    u_diff = ug[1] - ug[0]
    # maximum v coordinate of the grid
    vmax = 1.0/cell_size_y
    # v grid end points
    vg = np.linspace(-vmax, vmax, ny + 1)[1:]
    v_diff = vg[1] - vg[0]
    # initialise array to store counts
    counts = np.zeros((nx, ny), dtype=dtype)
    # accumulate counts
    nchan = freqs.size
    nrow = uvw.shape[0]

    normfreqs = freqs / 3e8  # Can be replaced with a more accurate value.

    for r in range(nrow):
        uvw_row = uvw[r]
        for c in range(nchan):
            # current uv coordinates in wavelengths
            chan_normfreq = normfreqs[c]
            u_tmp = uvw_row[0] * chan_normfreq
            v_tmp = uvw_row[1] * chan_normfreq
            # u index
            u_idx = int(np.floor_divide(u_tmp + umax, u_diff)) - 1
            # v index
            v_idx = int(np.floor_divide(v_tmp + vmax, v_diff)) - 1
            counts[u_idx, v_idx] += weights[r, c]
    return counts
```

Note that your existing code is actually faster than mine for very small problem sizes (<<1000 rows), but scales horribly with nrow due to the array comparison. This approach doesn't do any more work per iteration as nrow grows, so it should scale out nicely. Anyhow, I think ~500x is the point at which I am willing to say good enough.
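
For completeness, a small usage sketch (not from the thread) exercising `compute_wsums_band` on synthetic inputs; the array shapes follow the kernel's conventions and the cell size is an arbitrary illustrative value:

```python
# Sketch: call compute_wsums_band on synthetic data to obtain the
# per-cell weight sums ("counts") used for uniform weighting.
import numpy as np

nrow, nchan, nx, ny = 10_000, 64, 1024, 1024
cell_rad = 1.0 / 36000  # illustrative cell size in radians (placeholder)

rng = np.random.default_rng(42)
uvw = rng.uniform(-1e3, 1e3, size=(nrow, 3))        # metres
weights = rng.uniform(0.5, 1.5, size=(nrow, nchan))
freqs = np.linspace(0.9e9, 1.6e9, nchan)            # Hz

counts = compute_wsums_band(uvw, weights, freqs, nx, ny,
                            cell_rad, cell_rad, np.float64)
print(counts.shape)                              # (1024, 1024)
print(np.allclose(counts.sum(), weights.sum()))  # total weight is conserved
```
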

sjperkins commented 3 years ago

```python
for r in range(nrow):
    uvw_row = uvw[r]
    for c in range(nchan):
        # get current uv coords
        chan_normfreq = normfreqs[c]
        u_tmp = uvw_row[0] * chan_normfreq
        v_tmp = uvw_row[1] * chan_normfreq
        # get u index
        u_idx = int(np.floor_divide(u_tmp + umax, u_diff)) - 1
        # get v index
        v_idx = int(np.floor_divide(v_tmp + vmax, v_diff)) - 1
        counts[u_idx, v_idx] += weights[r, c]
return counts
```

The order of magnitude comes from:

  1. moving chan_normfreq into a register
  2. using np.floor_divide instead of //
  3. anything else?

landmanbester commented 3 years ago

Woah, ok. Thanks @JSKenyon, let me have a go with that. It does not have to be super optimal since I only have to compute the weights at the outset. I was more worried about the correctness of it all. @bennahugo, from what I understand about uniform weighting, you are supposed to accumulate the weights in each Fourier cell and then use the inverse to weight each data point. Here I just find the grid point corresponding to uv by checking what the last grid point is that is less than uv. Is that not correct? What is the difference between uniform weighting and super uniform weighting again?
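
For reference, a hedged sketch (not from the thread, and not the code that eventually went into pfb-clean) of how gridded weight sums are typically turned into per-visibility imaging weights: the uniform case divides by the cell sum, and the robust case follows the standard Briggs formula with robustness parameter R. The cell lookup mirrors compute_wsums_band above.

```python
# Sketch: convert gridded weight sums ("counts") into uniform or
# robust (Briggs) imaging weights, following the standard Briggs formula.
import numpy as np

def imaging_weights(uvw, weights, freqs, counts, cell_size_x, cell_size_y,
                    robust=None):
    nx, ny = counts.shape
    umax, vmax = 1.0/cell_size_x, 1.0/cell_size_y
    u_diff, v_diff = 2*umax/nx, 2*vmax/ny

    # find the cell each visibility falls into (same convention as above)
    normfreqs = freqs / 299792458.0
    u = uvw[:, 0:1] * normfreqs[None, :]
    v = uvw[:, 1:2] * normfreqs[None, :]
    u_idx = np.floor_divide(u + umax, u_diff).astype(np.int64) - 1
    v_idx = np.floor_divide(v + vmax, v_diff).astype(np.int64) - 1
    cell_sum = counts[u_idx, v_idx]   # W_k for each visibility

    if robust is None:
        # uniform: down-weight each visibility by the total weight in its cell
        return weights / cell_sum
    # Briggs: w_i / (1 + W_k * f^2), with
    # f^2 = (5 * 10^-R)^2 / (sum_k W_k^2 / sum_i w_i)
    fsq = (5.0 * 10.0**(-robust))**2 / (np.sum(counts**2) / np.sum(weights))
    return weights / (1.0 + cell_sum * fsq)
```

As robust goes towards roughly -2 this tends to uniform weighting and towards natural weighting at roughly +2, which is the kind of control over the beam shape asked for at the top of this issue.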

JSKenyon commented 3 years ago

> The order of magnitude comes from:
>
> 1. moving `chan_normfreq` into a register
> 2. using `np.floor_divide` instead of `//`
> 3. anything else?

Yeah, chan_normfreq gets you roughly a factor of two (you avoid looking it up twice). Using np.floor_divide seems to be ~5x faster than normal integer division. My best guess is that the numpy routine is better optimised, but I would need to dig into the lower-level stuff to really know. You can also shave time off both implementations by doing export OPENBLAS_NUM_THREADS=1, but that is overkill. Any additional speed-up comes from changing the way the inner loop works.

The approach should be relatively straightforward to apply to the other kernels, but shout if you need help @landmanbester. I should reiterate that I just matched what was there - I don't know enough about gridding to comment on correctness.
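
One way to sanity-check the `//` versus np.floor_divide observation is a micro-benchmark along these lines (a sketch, not from the thread; timings will vary by machine and numba version):

```python
# Sketch: compare floor division via // with np.floor_divide inside
# otherwise identical @njit kernels.
import numpy as np
from numba import njit
from timeit import timeit

@njit(nogil=True, fastmath=True, cache=True)
def idx_op(x, offset, step):
    out = np.empty(x.size, dtype=np.int64)
    for i in range(x.size):
        out[i] = int((x[i] + offset) // step)
    return out

@njit(nogil=True, fastmath=True, cache=True)
def idx_floor_divide(x, offset, step):
    out = np.empty(x.size, dtype=np.int64)
    for i in range(x.size):
        out[i] = int(np.floor_divide(x[i] + offset, step))
    return out

x = np.random.uniform(-1e3, 1e3, 5_000_000)
idx_op(x, 1e3, 0.5); idx_floor_divide(x, 1e3, 0.5)  # trigger compilation
print("//             :", timeit(lambda: idx_op(x, 1e3, 0.5), number=20))
print("np.floor_divide:", timeit(lambda: idx_floor_divide(x, 1e3, 0.5), number=20))
```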

sjperkins commented 3 years ago

> Using np.floor_divide seems to be ~5x faster than normal integer division.

This is useful to know!

o-smirnov commented 3 years ago

> Here I just find the grid point corresponding to uv by checking what the last grid point is that is less than uv. Is that not correct?

Shouldn't you be rounding properly?

> What is the difference between uniform weighting and super uniform weighting again?

In superuniform weighting you make the uv grid cells smaller (or larger) by a factor of N. It's like computing uniform weighting for an image that is larger (or smaller) by a factor of N.
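
In terms of the kernel above, that just means refining (or coarsening) the uv grid while keeping its extent fixed; a hedged one-line sketch, reusing the synthetic variables from the usage example earlier and an assumed oversampling factor N:

```python
# Sketch: superuniform counts from an N-times finer uv grid; umax = 1/cell
# is unchanged, so only the number of cells changes.
N = 3  # illustrative superuniform factor
su_counts = compute_wsums_band(uvw, weights, freqs, N*nx, N*ny,
                               cell_rad, cell_rad, np.float64)
```

The per-visibility weight lookup then has to use the same finer grid when dividing by the cell sums.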

landmanbester commented 3 years ago

> Shouldn't you be rounding properly?

Well, I think that is what @bennahugo also suggested, but I am not sure what you mean by rounding. Am I not just supposed to accumulate the weights of all uv coordinates falling in a cell?

o-smirnov commented 3 years ago

Yeah, true, you need to divide by the cell size and simply round down. I was confused for a moment.

landmanbester commented 3 years ago

This should be working now; I would appreciate it if someone could give it a test drive.

landmanbester commented 1 year ago

This went in a while ago. Please reopen as/if you encounter issues @bennahugo