Closed: dkirkby closed this pull request 7 years ago.
I have implemented a simple benchmark in a new command-line script, quickfiberloss, that calculates fiberloss fractions (vs. wavelength) for 5K targets randomly placed on the focal plane. The basic flow is:
simulator = specsim.simulator.Simulator(args.config)
for i in range(num_targets):
    fiberloss = specsim.fiberloss.calculate_fiber_acceptance_fraction(
        focal_x[i], focal_y[i], simulator.simulated['wavelength'],
        simulator.source, simulator.atmosphere, simulator.instrument)
I get the following timing results for the loop (i.e., not including the simulator start up time):
In other words, this is surprisingly fast, with a total time for 5K targets of ~70ms!
Those timing results were completely wrong because the code was still using the fiberloss tables, instead of galsim. The updated numbers look a lot more plausible:
The total time for 5K targets is now ~230 s (about 3 min 50 s).
This is without any caching of things that do not change between targets (the atmospheric PSF and the anti-aliased fiber aperture, for example), so the next step is to add such caching.
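As a rough sketch of the caching idea (the class and the `expensive_psf` stand-in below are illustrative, not the actual specsim API), quantities that are constant across targets can be computed once and reused:

```python
# Hypothetical sketch: cache per-wavelength quantities that do not change
# between targets, so the expensive setup runs once instead of 5K times.

class FiberlossCache:
    """Memoize values that are constant across targets."""
    def __init__(self):
        self._cache = {}

    def get(self, key, factory):
        # Compute once on the first request, then reuse the stored value.
        if key not in self._cache:
            self._cache[key] = factory()
        return self._cache[key]

cache = FiberlossCache()
calls = []

def expensive_psf(wlen):
    # Stand-in for building an atmospheric PSF model at this wavelength.
    calls.append(wlen)
    return wlen * 2

# The first target computes; later targets hit the cache.
for target in range(3):
    psf = cache.get(500.0, lambda: expensive_psf(500.0))

print(len(calls))  # the expensive factory ran only once
```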
The profile is instructive:
ncalls tottime percall cumtime percall filename:lineno(function)
11003 11.441 0.001 20.700 0.002 astropy/units/core.py:1212(filter_units)
11000 11.206 0.001 13.754 0.001 galsim/base.py:757(drawImage)
42977468/42977467 5.847 0.000 6.900 0.000 {isinstance}
1166300 1.847 0.000 2.598 0.000 astropy/units/core.py:1186(has_bases_in_common)
168248 1.508 0.000 10.626 0.000 astropy/units/quantity.py:291(__array_prepare__)
5581525 1.505 0.000 1.505 0.000 {method 'match' of '_sre.SRE_Pattern' objects}
250 1.346 0.005 3.970 0.016 astropy/io/ascii/core.py:683(process_lines)
1000 1.326 0.001 64.245 0.064 specsim/fiberloss.py:18(calculate_fiber_acceptance_fraction)
366521 1.279 0.000 3.765 0.000 astropy/units/quantity.py:567(_new_view)
2412895/2409763 1.194 0.000 1.336 0.000 {getattr}
It looks like we are spending more time in astropy.units than in galsim.base.
...
Hardcode the following values:
# Lookup the RMS blur in focal-plane microns.
##blur_rms = instrument.get_blur_rms(wlen, angle_r).to(u.um).value
blur_rms = 10.0
...
# Lookup the seeing FWHM in arcsecs.
##seeing_fwhm = atmosphere.get_seeing_fwhm(wlen).to(u.arcsec).value
seeing_fwhm = 1.1
and repeat the profile:
ncalls tottime percall cumtime percall filename:lineno(function)
11000 10.614 0.001 12.991 0.001 galsim/base.py:757(drawImage)
5581525 1.691 0.000 1.691 0.000 {method 'match' of '_sre.SRE_Pattern' objects}
250 1.493 0.006 4.409 0.018 astropy/io/ascii/core.py:683(process_lines)
3275381 1.285 0.000 1.883 0.000 astropy/io/ascii/core.py:698(<genexpr>)
1000 1.171 0.001 33.100 0.033 specsim/fiberloss.py:18(calculate_fiber_acceptance_fraction)
135248 1.143 0.000 8.634 0.000 astropy/units/quantity.py:291(__array_prepare__)
3 1.023 0.341 1.538 0.513 specsim/camera.py:68(__init__)
2397468/2397467 0.883 0.000 1.773 0.000 {isinstance}
267521 0.876 0.000 2.518 0.000 astropy/units/quantity.py:567(_new_view)
1840895/1837763 0.863 0.000 0.983 0.000 {getattr}
This confirms that the astropy slowdown is mostly coming from those two calls.
Instructions for profiling a console-script entry point within ipython:
import cProfile
import specsim.quickfiberloss
cProfile.run('specsim.quickfiberloss.main(["-n", "1000"])', 'profile.out')
To examine the profile results saved in profile.out:
import pstats
p = pstats.Stats('profile.out')
p.sort_stats('time').print_stats(10)
After the latest commits, the profile is:
ncalls tottime percall cumtime percall filename:lineno(function)
11000 10.589 0.001 12.892 0.001 galsim/base.py:757(drawImage)
11000 10.544 0.001 18.898 0.002 astropy/units/core.py:1212(filter_units)
41524631 4.744 0.000 4.990 0.000 {isinstance}
1166000 1.662 0.000 2.329 0.000 astropy/units/core.py:1186(has_bases_in_common)
84841 0.682 0.000 0.682 0.000 {method 'reduce' of 'numpy.ufunc' objects}
1166824/1160767 0.659 0.000 1.064 0.000 astropy/units/core.py:2097(decompose)
1166000 0.576 0.000 2.905 0.000 astropy/units/core.py:1195(has_bases_in_common_with_equiv)
1000 0.572 0.001 1.205 0.001 specsim/fiberloss.py:35(__init__)
1075686/1020527 0.536 0.000 1.141 0.000 astropy/units/core.py:1934(decompose)
57000 0.527 0.000 0.925 0.000 galsim/transform.py:107(__init__)
so we are still dominated by astropy units.
Checklist for closing this PR:
- fix the build_docs -w test failure
- update Simulator.simulate() to accept optional per-fiber arrays of (x,y), fiberloss and SED and return 2D output arrays (see #6 for details)

I think I have fixed the last travis problems and I am ready to merge now. I decided to push the new "batch simulate" method into a separate PR since there is already a lot to review here. I am assigning @sbailey, but anyone else is welcome to jump in.
Updated readthedocs for this PR are at http://specsim.readthedocs.io/en/multiplesources/, including:
- the quickfiberloss benchmarking script
- the specsim.fiberloss module

Did you want to finish the unchecked punchlist item of "Update Simulator.simulate() to accept optional per-fiber arrays of (x,y), fiberloss and SED and return 2D output arrays. See #6 for details." before closing this, or do you want to merge this and then continue with simulating multiple sources at different focal plane locations in a separate branch/PR?
I don't see how to combine the fiberloss calculations with a spectral simulation here, even for a single target. Could you add a section to the Users Guide with the actual python code to simulate a spectrum of flux vs. wavelength for a r=2 arcsec disk galaxy at x=100,y=50 on the DESI focal plane, and an example of how to override/replace the default fiberloss calculation with a different fiberloss vs. wavelength?
i.e. this code looks fine to merge, though I don't yet know how to use it to accomplish the original goal of this branch / PR. I'm not sure if that is because this PR lays the groundwork but the final integration is still to be done, or whether I'm just not reading the right part of the documentation/code, or whether that connection isn't documented yet.
You didn't miss anything. I decided to do the final connection in a new PR since it will probably break the public API and there are a lot of changes here already.
To simulate a single fiber with this code, you just need to change instrument.fiberloss.method from table to galsim (either by editing the yaml file or programmatically). But the new API will bypass the single-fiber mode to calculate output arrays for all fibers with one call.
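For the yaml route, the change might look like the fragment below (the nesting shown is an assumption based on the attribute path instrument.fiberloss.method mentioned above; check the actual specsim configuration file for the exact layout):

```yaml
# Hypothetical excerpt of the instrument configuration: switch the
# fiberloss calculation from precomputed tables to on-the-fly galsim.
instrument:
  fiberloss:
    method: galsim   # was: table
```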
If using method=galsim, how does the user specify where on the focal plane the object is?
For consideration: is it necessary to change the public specsim API to support multiple fibers simultaneously? Are there big performance benefits from reusing calculations? We could just wrap the existing single spectrum API in a multi-spectrum API on the desisim side. I think updating the specsim API could be useful, but I don't think it has to happen that way.
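The wrapping idea suggested here can be sketched as follows; `simulate_one` is a hypothetical stand-in for the existing single-fiber call, not the real specsim entry point:

```python
import numpy as np

def simulate_one(x, y):
    # Stand-in for the existing single-spectrum API: return a 1D
    # "spectrum" that depends on the fiber position.
    return np.full(5, x + y)

def simulate_many(xs, ys):
    # Wrap the single-spectrum call in a batch API, stacking per-fiber
    # 1D outputs into a 2D (num_fibers, num_wavelengths) array.
    return np.stack([simulate_one(x, y) for x, y in zip(xs, ys)])

out = simulate_many([1.0, 2.0], [0.5, 0.5])
print(out.shape)  # (2, 5)
```

This gets 2D outputs without changing the specsim public API, at the cost of forgoing any reuse of per-wavelength calculations across fibers.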
Documentation comment doesn't have to be a blocking factor for this PR, but it still applies: specsim provides great details on what it does, but lacks the big picture example of how to put the pieces together to do a simulation (unless using the command line quickspecsim). You may be so familiar with it that it is obvious to you, but I have to poke under the hood in quickspecsim or quickgen to figure that out.
I am merging now and will update the docs as part of the next PR after consultation with Stephen.
The scipy docs web server has been down for ~2 days now, so this may cause travis failures (since the docs build uses intersphinx).
This PR is to address #6 and, in particular, to look at the feasibility of calculating fiberloss fractions for a whole focal plane on the fly.