Swift-BAT / UnmaskedResponder

Python response generator for the Swift Burst Alert Telescope (BAT). Creates full non-mask-weighted responses.
MIT License

Using SVDs to save full DRMs #6

Open jjd330 opened 1 year ago

jjd330 commented 1 year ago

What are SVDs and why use them

A singular value decomposition (SVD) approximates a matrix with something like an expansion: the matrix is written as a weighted sum of terms, each built from a pair of column vectors and a coefficient/weight. SVD is the algorithm that performs this decomposition. These column vectors are smaller than the starting matrix, making them easier to work with and cheaper to store.

For NITRATES to work there needs to be a full DRM for each detector, making the full response a very large array (Ndets, Nphoton_energies, Nenergy_bins). During a search, parts of the response are read from pre-calculated files while other parts are calculated on the fly, and they are combined into the full large response. A spectrum then needs to be forward folded through this response, which takes a large number of operations due to the size of the response array. This calculation accounts for ~30-40% of a search's CPU time.

By using SVD, the large response array can be decomposed into two sets of vectors and their coefficients. By using a small subset of the terms (but enough of them to stay accurate), the forward folding calculation takes far fewer operations and needs much less memory (both RAM and CPU cache). Using the first 10 terms, the forward folding calculation drops from taking ~41 ms to ~0.5 ms. This would also save CPU time by removing the ray tracing through the spacecraft, which takes ~1 s for each new position and accounts for ~10-20% of the search's CPU time.

The downsides of the SVD are that it is an approximation, so we will need to make sure each one is a good enough approximation with no trouble-causing outliers, and that having them all pre-computed will take a massive amount of storage, ~1-4 TB.
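As a toy illustration of the expansion described above (not tied to the BAT response itself; the matrix and sizes here are made up), here is a small numpy sketch showing a matrix rebuilt as a weighted sum of rank-1 terms, and the storage saved by keeping only the first k terms:

```python
import numpy as np

# Toy stand-in matrix; in the real case this would be the flattened response.
rng = np.random.default_rng(0)
A = rng.random((12, 8))

u, d, v = np.linalg.svd(A, full_matrices=False)

# SVD writes A as a sum of rank-1 terms: A = sum_i d[i] * outer(u[:, i], v[i, :])
A_rebuilt = sum(d[i] * np.outer(u[:, i], v[i, :]) for i in range(d.size))
print(np.allclose(A, A_rebuilt))  # → True (all terms give the exact matrix)

# Keeping only the first k (largest-weight) terms gives the approximation,
# stored as two skinny matrices plus k coefficients instead of the full array.
k = 3
A_k = u[:, :k] @ np.diag(d[:k]) @ v[:k, :]
storage_full = A.size                               # 96 floats
storage_k = u[:, :k].size + k + v[:k, :].size       # 63 floats
```

For a 12x8 toy matrix the savings are marginal, but for the real (187*9, 32768)-sized response the two truncated factors are tiny compared to the full array.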

How to make an SVD approximation of a response

First calculate the current exact full response, reshape it so that it's 2D by raveling the photon energy and energy bin dimensions, then run the SVD algorithm using numpy:

```python
u, d, v = np.linalg.svd(responses1.reshape(32768, 187*9).T, full_matrices=False)
```

which outputs three arrays:

- `u`: the energy column vectors, with 1683 terms
- `d`: the coefficients for each of the 1683 terms
- `v`: the detector column vectors, with 1683 terms

Using the SVD vectors

To recreate the full response array from the SVD outputs using the first k terms, you can do this:

```python
A_approx = u[:, :k] @ np.diag(d[:k]) @ v[:k, :]
```

Recreating the full response can be handy for checking how well it matches the original response, but the idea is to avoid making the full response again during the search. To do the forward folding with the first k terms, you can do this:

```python
u_k = np.reshape(u[:, :k], (187, 9, k))
u_ = np.sum(u_k * photon_fluxes[:, np.newaxis, np.newaxis], axis=0)
rate_dpis = u_ @ np.diag(d[:k]) @ v[:k, :]
```

This is much faster than using the full response.
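A quick way to convince yourself the truncated folding is doing the right thing is to compare it against folding a spectrum through the explicitly reconstructed rank-k response. This sketch uses a random stand-in response (only the 187 photon energies and 9 energy bins match the real setup; the detector count here is made up for speed):

```python
import numpy as np

rng = np.random.default_rng(1)
nph, nebins, ndets, k = 187, 9, 256, 10

# Random stand-in for the (Ndets, Nphoton_energies, Nenergy_bins) response.
resp = rng.random((ndets, nph, nebins))
A = resp.reshape(ndets, nph * nebins).T             # (nph*nebins, ndets)
u, d, v = np.linalg.svd(A, full_matrices=False)
photon_fluxes = rng.random(nph)

# Truncated forward folding, as in the snippet above.
u_k = np.reshape(u[:, :k], (nph, nebins, k))
u_ = np.sum(u_k * photon_fluxes[:, np.newaxis, np.newaxis], axis=0)
rate_dpis = u_ @ np.diag(d[:k]) @ v[:k, :]          # (nebins, ndets)

# Reference: fold through the explicitly reconstructed rank-k response.
A_k = u[:, :k] @ np.diag(d[:k]) @ v[:k, :]
ref = np.sum(A_k.reshape(nph, nebins, ndets) * photon_fluxes[:, None, None], axis=0)

print(np.allclose(rate_dpis, ref))  # → True
```

The two routes agree exactly (up to float error); the truncated route just never materializes the big array.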

Creating SVD vectors for the search

For in-FoV responses there will probably need to be an SVD fit for each position that has a forward ray tracing, every 0.002 in image units, which is a lot of positions. This will most likely require a computer cluster to perform.

For out-of-FoV responses, a larger spacing should suffice.

Also, the example notebook treats the whole response as one, but it would be better to do separate SVDs for the direct and indirect responses.

Checking the quality of SVD vectors

Telling how good a fit the SVD gives and how many terms are needed will take some experimenting. At first it will take a lot of manually looking at results across very different response types. You will need to check for outliers and bad values, such as negative values (an effective area can't be negative). There can also be a general quality check that runs on all of them. In the end what matters is conserving the rate expectations and the LLH you find, so you can create a test that checks both of these for several spectra. For the LLH, you can check the difference between using the full response and the SVD one for a weak-ish signal over a typical background, with a threshold of somewhere around 0.01-0.1 in LLH difference to count as good enough.
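The negativity and rate-conservation parts of such a check could look something like the sketch below. The function name, tolerances, and synthetic test response are all hypothetical stand-ins, not anything from the NITRATES codebase; the LLH comparison described above would be a separate, more involved test:

```python
import numpy as np

def check_svd_quality(u, d, v, shape_3d, photon_spectra, k, rate_tol=1e-2):
    """Rough quality checks for a truncated SVD response (hypothetical helper).

    shape_3d is (nph, nebins, ndets); photon_spectra is (nspec, nph).
    Flags strongly negative effective areas, and checks that rank-k folding
    conserves the total rates from the full response to within rate_tol
    (fractional) for every test spectrum.
    """
    nph, nebins, ndets = shape_3d
    A_full = u @ np.diag(d) @ v                      # exact, (nph*nebins, ndets)
    A_k = u[:, :k] @ np.diag(d[:k]) @ v[:k, :]       # rank-k approximation

    # Effective areas shouldn't go (meaningfully) negative.
    if A_k.min() < -1e-3 * np.abs(A_full).max():
        return False

    R_full = A_full.reshape(nph, nebins, ndets)
    R_k = A_k.reshape(nph, nebins, ndets)
    for flux in photon_spectra:
        rates_full = np.sum(R_full * flux[:, None, None], axis=0)
        rates_k = np.sum(R_k * flux[:, None, None], axis=0)
        frac_err = np.abs(rates_k - rates_full).sum() / rates_full.sum()
        if frac_err > rate_tol:
            return False
    return True

# Example with a synthetic low-rank, nonnegative response (k=5 is exact here).
rng = np.random.default_rng(2)
nph, nebins, ndets = 20, 4, 50
A = rng.random((5, nph * nebins)).T @ rng.random((5, ndets))   # rank 5
u, d, v = np.linalg.svd(A, full_matrices=False)
spectra = rng.random((3, nph))
print(check_svd_quality(u, d, v, (nph, nebins, ndets), spectra, k=5))  # → True
```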

Using the SVD vectors in the search

The response framework will need to be completely reworked. Instead of doing everything it currently does, the SVD vectors will need to be read in and probably interpolated between (how well the interpolation works also needs to be tested). Then the new forward folding calculation will need to be added. The best order of operations should also be tested (interpolate between SVD vectors first, or forward fold them all first and then interpolate between those, etc.).
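One useful observation for the order-of-operations question: forward folding is linear in the response, so with linear interpolation the two orders give identical answers, and the choice is purely about cost. A sketch with made-up stand-in responses for two neighboring grid positions:

```python
import numpy as np

rng = np.random.default_rng(3)
nph, nebins, ndets = 30, 4, 64

# Hypothetical responses at two neighboring grid positions (random stand-ins).
A0 = rng.random((nph * nebins, ndets))
A1 = rng.random((nph * nebins, ndets))
flux = rng.random(nph)

def fold(A):
    # Forward fold a photon spectrum through a (nph*nebins, ndets) response.
    return np.sum(A.reshape(nph, nebins, ndets) * flux[:, None, None], axis=0)

w = 0.3  # linear interpolation weight toward position 1

# Route 1: interpolate the responses, then forward fold.
r1 = fold((1 - w) * A0 + w * A1)

# Route 2: forward fold at each grid point, then interpolate the rates.
r2 = (1 - w) * fold(A0) + w * fold(A1)

print(np.allclose(r1, r2))  # → True
```

Route 2 can be cheaper when many positions share the same grid points, since the folded rates are much smaller than the SVD vectors; whether that holds for the real search sizes still needs testing.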

jjd330 commented 1 year ago

The attached notebook (compressed because GitHub wouldn't let me upload it otherwise) is a messy exploration of the responses and of using the SVDs. It goes through running the SVD algorithm, doing the forward folding, comparing the SVD result to the actual response, and looking at LLH differences.
The responses used in the notebook can be found here https://alabama.box.com/s/kk71xugeuyx0yfsyz360dmgadnuic329

UsingSVDresponses.zip

Tohuvavohu commented 1 year ago

Thanks for posting this Jimmy! A few quick questions:

1) How are you estimating the amount of storage required? Is this assuming we keep the first 10 SVD coefficients? Shouldn't this be smaller than precomputing and storing all of the responses?

2) Is it still accurate that response generation and forward folding takes up 50% of compute time, even after the new model+likelihood speed ups?

3) Why do you think it's better to still separate the response into direct+indirect and do an SVD fit for each?

jjd330 commented 1 year ago
  1. Yes, it's smaller than precomputing all responses, but that's not what we currently do. I put a large range because I don't know yet how many terms we're going to need. I think with 10 it was somewhere around 2 TB when I estimated it, but that wasn't an exact calculation.

  2. This is after the LLH speed ups. That's an approximation for just the forward folding. There's a little more to do when getting the count expectations for a new set of spectral parameters, but it's dominated by the forward folding function. The % of CPU time also depends on how many time windows are being done, since the forward folding isn't repeated for multiple time windows, so it takes up a smaller % of time when there are more time windows.

  3. I should probably say should be separated, since they have different errors so for the current setup would need to be separate to have those separate errors. It would probably be easier to fit the SVD vectors with them desperate though.

Tohuvavohu commented 1 year ago

1) Right. Is there a way to ensure we won't be dominated by IO if we do this?

2) Right. We should try to have a standard of talking about speed per time bin, which I think is what you did.

3) desperate?


jjd330 commented 1 year ago
  1. We'll have to test it.

  2. Well, it's not fully linear with the number of time bins. The LLH minimization is, but the response part isn't, since it isn't done for each time bin.

  3. Separate*. Stupid phone haha