jennykim1016 / OM10

Tools for working with the Oguri & Marshall (2010) mock catalog of strong gravitational lenses
MIT License

Improve the speed #2

Closed: jennykim1016 closed this issue 7 years ago

jennykim1016 commented 7 years ago

The paint method in db.py is too slow; my Mac does not seem to handle it well.

Algorithm-wise, maybe I could reduce the big-O complexity by using a more efficient data structure, though I am not sure how much more can be gained, since the code must run at least N times.

The other thing I could try is multi-threading, but creating N threads would not be optimal. Maybe partition the data into chunks of fewer than 100 elements each and run one thread per chunk?
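
A rough sketch of that chunking idea; paint_one() below is a hypothetical stand-in for the per-lens work inside db.paint(), and the chunk size and worker count are placeholder choices:

    from concurrent.futures import ThreadPoolExecutor

    def paint_chunk(chunk):
        # paint_one() is a hypothetical stand-in for the per-lens work
        return [paint_one(lens) for lens in chunk]

    def paint_threaded(lenses, chunk_size=100, max_workers=4):
        # Partition the lenses into chunks of <100 elements and hand
        # each chunk to a worker thread.
        chunks = [lenses[i:i + chunk_size]
                  for i in range(0, len(lenses), chunk_size)]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            painted = pool.map(paint_chunk, chunks)
        return [lens for chunk in painted for lens in chunk]

(For CPU-bound work, CPython's GIL limits what plain threads can gain; ProcessPoolExecutor is the usual drop-in alternative.)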

drphilmarshall commented 7 years ago

The first thing I would check is how much time the code is spending in each function, and especially how the SED is passed around. I suspect it is reading the SED in fresh each time. It would be better to pass around an SED object of some kind that is already in memory. Can you check the lenspop code and figure out a workaround?
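
A minimal sketch of that idea, assuming a hypothetical loader load_sed(); the real loader lives in the OM10/lenspop code:

    # Read each SED from disk once and keep the object in memory.
    _sed_cache = {}

    def get_sed(name):
        if name not in _sed_cache:
            # load_sed() is a hypothetical loader; the expensive
            # disk read now happens only once per SED.
            _sed_cache[name] = load_sed(name)
        return _sed_cache[name]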

jennykim1016 commented 7 years ago

I ran the paint method after pulling the filterFromFile method out of the for loop, but this did not improve the speed significantly. The problem seems to lie with the line RF_Gmag_app = tools.ABFilterMagnitude(Gfilter, sed, redshift) + offset + d.distance_modulus(redshift): when tested with the time package, it takes about 3 seconds to calculate the G-filter magnitude from the SED. We cannot pull this out of the for loop because each lens has a different redshift value. I will think more about it!
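
For reference, a timing check like the one described might look like this, reusing the names from the quoted line (all defined in db.py):

    import time

    t0 = time.time()
    RF_Gmag_app = (tools.ABFilterMagnitude(Gfilter, sed, redshift)
                   + offset + d.distance_modulus(redshift))
    print('elapsed: %.2f s' % (time.time() - t0))  # ~3 s per lens reported above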

mbaumer commented 7 years ago

Hi guys, I ran a profile (timing each of the subcalls invoked by a piece of code to find the rate-limiting step) of the db.paint() code (see the relevant parts below). The first column shows the number of times a subroutine was called, and the second shows the total amount of time spent on that subroutine by db.paint().
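
A profile like the listing below can be produced with the standard-library cProfile module; the exact paint invocation here is a guess:

    import cProfile
    import pstats

    # 'db.paint()' stands in for however the paint call is actually invoked.
    cProfile.run('db.paint()', 'paint.prof')
    pstats.Stats('paint.prof').sort_stats('stdname').print_stats()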

I believe it shows that @jennykim1016 was right to point out the line she did above, but it's not the call to tools.ABFilterMagnitude() that takes up most of the time (0.189 s). It looks like most of the time is spent calculating distance-related quantities (~2 s).

I think we could speed this up with a lookup table: pre-compute the distance modulus for a grid of finely spaced redshift values, and then in each iteration of the lens-painting, look up the appropriate pre-computed value rather than redoing the calculation!
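
A minimal sketch of such a lookup table, using numpy interpolation; the grid range and resolution are placeholder choices, and d is the distances.py object already used in db.py:

    import numpy as np

    # Pre-compute the distance modulus once on a fine redshift grid...
    z_grid = np.linspace(0.01, 5.0, 5000)
    mu_grid = np.array([d.distance_modulus(z) for z in z_grid])

    # ...then each painting iteration becomes a cheap interpolation.
    def distance_modulus_fast(z):
        return np.interp(z, z_grid, mu_grid)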

It's also possible that astropy.cosmology already implements a scheme like this, since its distance computations seem faster than the ones Tom implemented by hand in distances.py.
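
For instance, astropy.cosmology's distmod() accepts arrays of redshifts directly (the cosmological parameters below are placeholders):

    import numpy as np
    from astropy.cosmology import FlatLambdaCDM

    cosmo = FlatLambdaCDM(H0=70, Om0=0.3)   # placeholder parameters
    redshifts = np.array([0.5, 1.0, 2.0])   # example lens redshifts
    mu = cosmo.distmod(redshifts)           # distance moduli, in mag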


   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.393    5.393 db.py:280(calculate_rest_frame_r_magnitude)
        1    0.000    0.000    5.510    5.510 db.py:322(paint)
      405    0.003    0.000    0.033    0.000 distances.py:109(distance_modulus)
        2    0.000    0.000    0.000    0.000 distances.py:16(__init__)
    73620    0.550    0.000    3.817    0.000 distances.py:50(comoving_distance)
  2190216    1.904    0.000    1.904    0.000 distances.py:66(<lambda>)
    20905    0.049    0.000    0.993    0.000 distances.py:73(comoving_transverse_distance)
    20501    0.030    0.000    0.995    0.000 distances.py:87(angular_diameter_distance)
      404    0.001    0.000    0.029    0.000 distances.py:92(luminosity_distance)
      401    0.004    0.000    3.333    0.008 distances.py:95(comoving_volume)
    52715    0.201    0.000    3.073    0.000 distances.py:99(<lambda>)
     5257    0.147    0.000    0.257    0.000 fitpack.py:318(splrep)
     3017    0.028    0.000    0.084    0.000 fitpack.py:533(splev)
     5034    0.024    0.000    0.037    0.000 fitpack.py:616(splint)
        2    0.026    0.013    0.026    0.013 fitpack2.py:1137(__init__)

    73620    0.106    0.000    3.103    0.000 quadpack.py:358(_quad)
    73620    0.151    0.000    3.267    0.000 quadpack.py:42(quad)
     3504    0.100    0.000    3.195    0.001 quadrature.py:111(vfunc)
     3103    0.031    0.000    3.270    0.001 quadrature.py:537(_difftrap)
    10674    0.011    0.000    0.011    0.000 quadrature.py:563(_romberg_diff)
      401    0.040    0.000    3.329    0.008 quadrature.py:589(romberg)
      401    0.000    0.000    0.000    0.000 quadrature.py:82(vectorize1)

     2416    0.189    0.000    0.501    0.000 tools.py:90(ABFilterMagnitude)

    73620    1.093    0.000    2.997    0.000 {scipy.integrate._quadpack._qagse}
     3017    0.030    0.000    0.030    0.000 {scipy.interpolate._fitpack._spl_}
     5034    0.013    0.000    0.013    0.000 {scipy.interpolate._fitpack._splint}

drphilmarshall commented 7 years ago

Thanks Mike! Tom almost certainly implemented some sort of lookup-table solution in his lenspop script. Jenny, can you see if you can find it, please?

jennykim1016 commented 7 years ago

Thank you so much @mbaumer and @drphilmarshall! I was just about to close this issue when I realized I had not replied to your messages.

I was not able to find lenspop's lookup table, so I made some mock dictionaries to emulate one. The speed did not improve significantly.

Then I realized that the function decomposition itself might be the issue: each additional function call costs extra time and memory. Indeed, the slowness turned out to be due to the function decomposition.

The only problem is that the non-decomposed function looks really ugly. The screenshot below contains only about a quarter of the function, and it already looks intimidating.

[Screenshot, 2017-02-08: roughly a quarter of the non-decomposed paint method]

It is up to us to choose between speed and style. The usual rule of thumb is that a function longer than about 30 lines should be decomposed, but here decomposing significantly lowers the speed. Currently, the non-decomposed paint method takes less than 5 seconds to synthetically color 200 lenses, whereas it took more than 10 minutes before.
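
A toy illustration of the per-call overhead in question, unrelated to the OM10 code itself: the same arithmetic run inline and behind a helper function.

    import timeit

    def helper(x):
        return x * x + 1.0

    def inlined(n):
        total = 0.0
        for i in range(n):
            total += i * i + 1.0   # work done inline
        return total

    def decomposed(n):
        total = 0.0
        for i in range(n):
            total += helper(i)     # same work, behind a function call
        return total

    print(timeit.timeit(lambda: inlined(100000), number=10))
    print(timeit.timeit(lambda: decomposed(100000), number=10))

(Python function calls carry a fixed overhead per call, which adds up inside hot loops.)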

In addition, as @mbaumer suggested, I changed db.py to use functions from astropy.cosmology when calculating the distance modulus. I think this saves ~0.5 seconds when dealing with 200 lenses.

Let me know whether we are okay with the long, non-decomposed code. I am about to go to sleep, so it would be great if we could decide tomorrow.

Thanks!

drphilmarshall commented 7 years ago

Excellent analysis, Jenny. I share your concerns about the loss of modularity, but that speed-up is too good to forego! Let's stick with the fast, long-form method, file an issue "Improve modularity of paint method" with milestone "Someday", and keep going. We can chat about why the function takes so much longer when decomposed when we meet. Nice work!