Issue JP-3653 was created on JIRA by David Law:
Filing this ticket based on an email thread with Timothy Brandt, and pasting below an excerpt of that conversation:
As noted during the meeting, NSClean takes around 1 minute to run on a rate file on a good laptop. Of this, about 35 seconds (on my laptop) are spent constructing the mask, and about 20 seconds are spent fitting the background. The mask construction only needs to happen once if we are applying this correction at the group level (we would just need to add a mask for jumps at the group level, but we would already know where those are). Then, for the background fitting, it is possible to make three tiny changes that are fully mathematically equivalent to the current implementation but that together yield a speedup of roughly a factor of 15. They are:
https://github.com/spacetelescope/jwst/blob/4fcb915efa12b42ad00c6ad89cbc4ed3928195e5/jwst/nsclean/lib.py#L137
Change
p = np.diag(self.P[y][self.mask[y]]) # Weights
to
p = self.P[y][self.mask[y]] # Weights
(i.e. don’t construct a huge matrix with almost all zeros)
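The point of this first change is allocation: np.diag turns n weights into an n-by-n array that is almost entirely zeros. A minimal sketch of the difference, assuming an illustrative vector length of n = 2048 (the real length depends on the detector mask):

import numpy as np

rng = np.random.default_rng(0)
p_vec = rng.random(2048)   # weights kept as a 1-D vector (the proposed code)
P_mat = np.diag(p_vec)     # dense diagonal matrix (the current code)

print(p_vec.nbytes)   # 16384 bytes (~16 kB)
print(P_mat.nbytes)   # 33554432 bytes (~32 MB), almost all zeros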
https://github.com/spacetelescope/jwst/blob/4fcb915efa12b42ad00c6ad89cbc4ed3928195e5/jwst/nsclean/lib.py#L167
Change
A = np.matmul(p, B)
to
A = B * p[:, np.newaxis]
(i.e. don’t do the full multiplication by a huge diagonal matrix)
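A minimal sketch checking that the broadcast form reproduces the old dense product exactly; the shapes here, and treating B as a complex-valued basis matrix, are illustrative assumptions rather than the pipeline's actual sizes:

import numpy as np

rng = np.random.default_rng(1)
n, m = 512, 64                                     # assumed shapes for illustration
p = rng.random(n)                                  # 1-D weight vector
B = rng.random((n, m)) + 1j * rng.random((n, m))   # assumed complex basis matrix

A_old = np.matmul(np.diag(p), B)   # old: O(n^2 m) product with a dense diagonal
A_new = B * p[:, np.newaxis]       # new: O(n m) row-wise scaling
print(np.allclose(A_old, A_new))   # True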
https://github.com/spacetelescope/jwst/blob/4fcb915efa12b42ad00c6ad89cbc4ed3928195e5/jwst/nsclean/lib.py#L175
Change
rfft[:k.shape[1]] = np.matmul(np.matmul(pinv_PB, p), d)
to
rfft[:k.shape[1]] = np.matmul(pinv_PB, p * d)
(use the associativity of matrix multiplication to multiply two vectors rather than two matrices)
This works because the weight matrix is diagonal, so there is no reason to perform a full multiplication by the giant matrix, or even to create such a giant matrix in the first place.
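A minimal sketch of the third change, verifying that weighting the data vector first, i.e. pinv_PB @ (p * d) instead of (pinv_PB @ diag(p)) @ d, gives the same answer; the shapes and the random stand-in for pinv_PB are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(2)
n, k = 512, 64                 # assumed shapes for illustration
p = rng.random(n)              # 1-D weight vector
d = rng.random(n)              # data vector
pinv_PB = rng.random((k, n))   # random stand-in for the pseudoinverse matrix

old = np.matmul(np.matmul(pinv_PB, np.diag(p)), d)   # (k,n)(n,n) then (k,n)(n,): O(k n^2)
new = np.matmul(pinv_PB, p * d)                      # elementwise then (k,n)(n,): O(k n)
print(np.allclose(old, new))   # True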
With these changes, the computed background is identical, but the computational cost of the background fit drops to just over 1 second on my laptop. At that point I think it is no longer any problem to apply NSClean at the group level. (DRL editorial note: this last point is a potential future change.)