PAHFIT / pahfit

Model Decomposition for Near- to Mid-Infrared Spectroscopy of Astronomical Sources
https://pahfit.readthedocs.io/
18 stars 26 forks source link

IDL vs. Python Speed and Convergence #178

Closed jdtsmith closed 1 year ago

jdtsmith commented 2 years ago

We need to revisit the convergence criteria and/or default step size. Working on my IDL classic demo for tomorrow, I’m reminded that typical number of iterations for the MPFIT IDL-based PAHFIT is 150. Sometimes as few as 90. The master version will happily keep iterating up to 5000 or more if you allow it. And it rarely converges before 500. That will be a significant speedup.

jdtsmith commented 2 years ago

OK my assessment is mistaken. The "iterations" that MPFIT reports include many function evaluations per iteration, and are not equivalent to astropy.modeling "iterations" (which is really just translated to max_fev in scipy.leastsq.

An example fit was 71 iterations, 6754 function evaluations, i.e. close to 100 evals per iteration (!). The same in modern PAHFIT used 814 function evaluations (though was slower :(). So a better re-statement of the concern is: IDL PAHFIT evaluates its model function 14x faster than Python PAHFIT at present.

jdtsmith commented 2 years ago

When fitting Thomas' example spectrum using the Akari+Spitzer pack, performance and FEV's both get quite a bit worse:

karllark commented 2 years ago

I thought we went through all this previously and found that the python version was faster. What changed?

jdtsmith commented 2 years ago

Yeah I know Thomas mentioned that, but it has never been my experience. I do find they're pretty "comparable" with the IDL classic sci pack (~5s per fit for either). That of course is way too slow for JWST spectra, which will have 5-20x wavelength points, and 3-5x model parameters (??). But at least not a regression.

What I noticed during my Fri demo was that the new Akari pack is much slower to converge than the classic, at least with the test spectrum under data/. This isn't exactly apples and oranges, since the Akari test spectrum is a noiseless model with "fake" uncertainties. @ThomasSYLai maybe we could prepare a fully symmetrical case, where the only difference is the addition of Akari data?

I'm compiling some thoughts about speed for discussion tomorrow. No sense prematurely optimizing but should keep on our radar, especially runtime growth vs. model/data complexity.

jdtsmith commented 2 years ago

In case people want to glance through, I just did a profiler run on the 60s Akari fit (which obviously takes much longer when profiling all those function calls...):

The relative error between two consecutive iterates is at most 0.000000

         97157192 function calls (92883870 primitive calls) in 300.610 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   319754   15.716    0.000   41.637    0.000 core.py:2661(_param_sets)
2452133/54799   15.553    0.000   33.474    0.001 copy.py:128(deepcopy)
     9139   13.758    0.002   27.050    0.003 core.py:2583(_parameters_to_array)
  9318514   12.632    0.000   14.253    0.000 {built-in method builtins.len}
  1557652   12.295    0.000   20.579    0.000 shapes.py:308(check_broadcast)
   319754    8.473    0.000   30.984    0.000 core.py:1863(_prepare_inputs_single_model)
616674/4568    8.167    0.000  224.087    0.049 core.py:1063(__call__)
  6662834    7.856    0.000    7.856    0.000 {method 'get' of 'dict' objects}
  3513494    7.360    0.000    7.360    0.000 {built-in method numpy.array}
  5669481    7.044    0.000    7.768    0.000 {built-in method builtins.isinstance}
     4569    6.425    0.001   74.489    0.016 fitting.py:1601(_fitter_to_model_params)
  4756073    6.350    0.000    6.351    0.000 {built-in method builtins.getattr}
  5282538    5.999    0.000    5.999    0.000 {built-in method builtins.hasattr}
   150738    5.677    0.000    5.677    0.000 physical_models.py:288(evaluate)
  1329229    5.409    0.000   13.986    0.000 <__array_function__ internals>:177(shape)
  2718455    5.257    0.000    5.257    0.000 parameters.py:313(value)
  2367041    5.233    0.000    7.855    0.000 core.py:1221(sync_constraints)
1443411/1438843    5.171    0.000    8.677    0.000 {built-in method numpy.core._multiarray_umath.implement_array_function}
    50248    4.837    0.000    4.837    0.000 component_models.py:20(evaluate)
   616674    4.797    0.000   10.468    0.000 core.py:1084(_get_renamed_inputs_as_positional)
   319754    4.781    0.000   54.265    0.000 core.py:1995(prepare_inputs)
   100496    4.621    0.000    4.621    0.000 functional_models.py:163(evaluate)
  3513913    4.354    0.000    4.354    0.000 {method 'append' of 'list' objects}
  3217061    3.955    0.000    3.955    0.000 __init__.py:1053(__getitem__)
  1005181    3.826    0.000    8.305    0.000 core.py:1248(bounds)
     4570    3.709    0.001   29.978    0.007 fitting.py:1664(_model_to_fit_params)
   845265    3.643    0.000    6.040    0.000 parameters.py:333(value)
  2772698    3.625    0.000    3.625    0.000 shapes.py:332(<genexpr>)
616674/4568    3.569    0.000  223.858    0.049 core.py:1030(_generic_evaluate)
  2863846    3.190    0.000    3.190    0.000 {built-in method builtins.id}
  1233348    3.172    0.000    4.554    0.000 core.py:942(get_bounding_box)
   854590    3.047    0.000    6.836    0.000 core.py:1239(fixed)
58946/54799    3.035    0.000   28.434    0.001 copy.py:226(_deepcopy_dict)
     4569    2.809    0.001    9.872    0.002 core.py:2598(_array_to_parameters)
   319754    2.549    0.000   19.255    0.000 core.py:1045(_post_evaluate)
296920/4568    2.521    0.000  223.787    0.049 core.py:3220(_evaluate)
  2205573    2.405    0.000    2.405    0.000 copy.py:182(_deepcopy_atomic)
319754/301488    2.232    0.000   74.629    0.000 core.py:937(evaluate)
  1704417    2.208    0.000    2.208    0.000 {method 'ravel' of 'numpy.ndarray' objects}
  1329229    2.168    0.000    2.168    0.000 fromnumeric.py:1965(shape)
   616674    2.153    0.000    3.012    0.000 core.py:1085(_keyword2positional)
   319754    2.083    0.000   97.984    0.000 core.py:926(_pre_evaluate)
   319754    2.072    0.000    8.950    0.000 core.py:1002(_validate_input_shapes)
   886580    2.062    0.000    3.165    0.000 _collections_abc.py:868(__iter__)
  1484529    2.023    0.000    2.023    0.000 {method 'pop' of 'dict' objects}
   319754    1.973    0.000    5.842    0.000 core.py:2152(_prepare_outputs_single_model)
   319754    1.908    0.000    5.928    0.000 core.py:2123(_process_output_units)
   319754    1.905    0.000    9.550    0.000 core.py:2178(prepare_outputs)
  1356833    1.816    0.000    1.816    0.000 parameters.py:346(unit)
   507270    1.776    0.000    4.036    0.000 core.py:1258(tied)
319754/301488    1.765    0.000  199.688    0.001 core.py:376(__call__)
  1329229    1.704    0.000    1.704    0.000 fromnumeric.py:1961(_shape_dispatcher)
  1219633    1.622    0.000    1.622    0.000 core.py:876(__len__)
   319754    1.579    0.000    2.639    0.000 core.py:2014(<listcomp>)
   899878    1.507    0.000    1.507    0.000 parameters.py:405(shape)
   319754    1.490    0.000    3.909    0.000 core.py:2032(_validate_input_units)
319754/301488    1.482    0.000  201.076    0.001 core.py:398(__call__)
   338034    1.464    0.000    4.636    0.000 {built-in method builtins.any}
   278648    1.433    0.000    1.433    0.000 {built-in method _operator.add}
   296920    1.285    0.000    3.167    0.000 core.py:4017(binary_operation)
   899878    1.285    0.000    1.285    0.000 parameters.py:373(internal_unit)
296920/4568    1.057    0.000  223.807    0.049 core.py:3197(evaluate)
   319754    1.049    0.000    4.543    0.000 core.py:972(_validate_input_shape)
   319754    1.047    0.000    1.940    0.000 core.py:2136(_prepare_output_single_model)
   890760    1.038    0.000    1.038    0.000 core.py:3369(n_inputs)
   319754    1.015    0.000    1.655    0.000 core.py:2015(<listcomp>)
   296920    1.015    0.000    4.183    0.000 core.py:3078(_apply_operators_to_value_lists)
   338026    1.004    0.000    1.004    0.000 {built-in method numpy.empty}
   593825    1.000    0.000    1.000    0.000 {built-in method builtins.max}
   328888    0.917    0.000    0.917    0.000 {method 'reshape' of 'numpy.ndarray' objects}
   319754    0.912    0.000    1.334    0.000 core.py:2124(<listcomp>)
   296920    0.901    0.000    2.019    0.000 core.py:3207(_post_evaluate)
   169016    0.899    0.000    1.566    0.000 core.py:1843(return_units)
   502590    0.871    0.000    0.871    0.000 {built-in method _functools.reduce}
   616674    0.858    0.000    0.858    0.000 {method 'keys' of 'dict' objects}
    18266    0.822    0.000    6.523    0.000 core.py:2380(_initialize_parameters)
58356/54799    0.792    0.000   31.551    0.001 copy.py:258(_reconstruct)
   379124    0.777    0.000    0.777    0.000 {built-in method numpy.asanyarray}
    18266    0.773    0.000   45.040    0.002 core.py:699(__init__)
     4567    0.734    0.000   16.806    0.004 component_models.py:63(kvt)
   173725    0.723    0.000    1.135    0.000 copy.py:242(_keep_alive)
182667/182665    0.684    0.000    1.884    0.000 core.py:883(__setattr__)
    13704    0.656    0.000   41.134    0.003 component_models.py:123(evaluate)
    54798    0.547    0.000    1.014    0.000 core.py:2523(_initialize_parameter_value)
   356286    0.516    0.000    0.516    0.000 core.py:755(inputs)
    54798    0.516    0.000    1.108    0.000 utils.py:290(quantity_asanyarray)
    18266    0.500    0.000    2.084    0.000 core.py:2566(_initialize_slices)
   319754    0.484    0.000    0.484    0.000 core.py:1152(model_set_axis)
   296920    0.480    0.000    0.480    0.000 core.py:3176(_pre_evaluate)
   319754    0.476    0.000    0.476    0.000 core.py:1140(name)
    68520    0.465    0.000    0.808    0.000 core.py:1819(input_units)
   182950    0.463    0.000    0.724    0.000 abc.py:117(__instancecheck__)
    18266    0.457    0.000    1.709    0.000 core.py:2606(_check_param_broadcast)
   255240    0.454    0.000    0.825    0.000 {built-in method builtins.issubclass}
   150738    0.441    0.000    0.676    0.000 physical_models.py:322(input_units)
   150738    0.437    0.000    0.666    0.000 physical_models.py:335(return_units)
        1    0.403    0.403  299.059  299.059 {built-in method scipy.optimize._minpack._lmdif}
    65625    0.362    0.000    2.028    0.000 copy.py:209(_deepcopy_tuple)
   100496    0.300    0.000    0.460    0.000 functional_models.py:181(input_units)
   109596    0.292    0.000    0.422    0.000 parameters.py:634(_create_value_wrapper)
    18266    0.268    0.000    0.659    0.000 core.py:2363(_initialize_constraints)
    65625    0.265    0.000    1.589    0.000 copy.py:210(<listcomp>)
   182950    0.261    0.000    0.261    0.000 {built-in method _abc._abc_instancecheck}
    18266    0.254    0.000    0.750    0.000 core.py:815(_initialize_unit_support)
   205077    0.252    0.000    0.252    0.000 {method 'items' of 'dict' objects}
   123058    0.250    0.000    0.956    0.000 copy.py:263(<genexpr>)
    54798    0.249    0.000    0.671    0.000 parameters.py:607(model)
    91330    0.234    0.000    0.371    0.000 abc.py:121(__subclasscheck__)
    27401    0.227    0.000    0.227    0.000 {method 'reduce' of 'numpy.ufunc' objects}
    54798    0.220    0.000    0.580    0.000 <__array_function__ internals>:177(size)
     4568    0.207    0.000   17.012    0.004 component_models.py:107(evaluate)
    54798    0.176    0.000    0.267    0.000 <frozen importlib._bootstrap>:398(parent)
    55117    0.160    0.000    0.247    0.000 copyreg.py:94(__newobj__)
    18266    0.159    0.000    0.579    0.000 fromnumeric.py:69(_wrapreduction)
    55114    0.145    0.000    0.145    0.000 {method '__deepcopy__' of 'numpy.ndarray' objects}
     4567    0.141    0.000    1.217    0.000 _interpolate.py:442(__init__)
    91330    0.137    0.000    0.137    0.000 {built-in method _abc._abc_subclasscheck}
    58356    0.133    0.000    0.133    0.000 {method '__reduce_ex__' of 'object' objects}
    55118    0.120    0.000    0.120    0.000 {method 'update' of 'dict' objects}
    18266    0.104    0.000    0.329    0.000 {method 'any' of 'numpy.generic' objects}
    18266    0.101    0.000    0.500    0.000 core.py:724(_default_inputs_outputs)
    18272    0.101    0.000    0.101    0.000 {built-in method _operator.mul}
    18266    0.100    0.000   45.140    0.002 core.py:431(__init__)
    68520    0.095    0.000    0.095    0.000 {method 'copy' of 'dict' objects}
    18266    0.092    0.000    0.946    0.000 physical_models.py:341(x_0)
     9139    0.091    0.000   27.183    0.003 core.py:1177(parameters)
    54798    0.091    0.000    0.091    0.000 {method 'rpartition' of 'str' objects}
    54798    0.091    0.000    0.091    0.000 fromnumeric.py:3206(size)
    27400    0.089    0.000    0.317    0.000 _methods.py:54(_any)
    55117    0.087    0.000    0.087    0.000 {built-in method __new__ of type object at 0x7fafe77cb740}
    27404    0.085    0.000    0.126    0.000 numerictypes.py:282(issubclass_)
        1    0.084    0.084    0.084    0.084 _basic.py:900(inv)
    18266    0.084    0.000   45.224    0.002 core.py:433(__init__)
    13702    0.084    0.000    0.226    0.000 numerictypes.py:356(issubdtype)
    54852    0.083    0.000    0.083    0.000 core.py:3284(param_names)
    18266    0.082    0.000    0.854    0.000 <__array_function__ internals>:177(any)
    27400    0.082    0.000    0.112    0.000 numeric.py:1859(isscalar)
    54798    0.080    0.000    0.080    0.000 {method 'add' of 'set' objects}
     4567    0.079    0.000    0.202    0.000 _interpolate.py:703(_check_bounds)
     4568    0.077    0.000  298.721    0.065 fitting.py:1068(objective_function)
     4567    0.075    0.000    0.277    0.000 numeric.py:1404(moveaxis)
     9134    0.074    0.000    0.152    0.000 numeric.py:1341(normalize_axis_tuple)
    54798    0.074    0.000    0.074    0.000 parameters.py:380(internal_unit)
    18266    0.073    0.000    0.652    0.000 fromnumeric.py:2305(any)
    13701    0.073    0.000    0.073    0.000 {built-in method builtins.min}
    54798    0.070    0.000    0.070    0.000 fromnumeric.py:3202(_size_dispatcher)
    13701    0.069    0.000    0.172    0.000 <__array_function__ internals>:177(concatenate)
     4568    0.059    0.000    0.178    0.000 _util.py:241(_asarray_validated)
    18280    0.059    0.000    0.087    0.000 _collections_abc.py:783(values)
     4567    0.053    0.000    0.149    0.000 _interpolate.py:581(fill_value)
    18266    0.049    0.000    0.072    0.000 core.py:742(_initialize_setters)
    18280    0.048    0.000    0.072    0.000 __init__.py:1066(__iter__)
     4567    0.046    0.000    0.459    0.000 _interpolate.py:688(_evaluate)
    27410    0.044    0.000    0.044    0.000 {built-in method numpy.asarray}
     9137    0.043    0.000    0.109    0.000 fromnumeric.py:51(_wrapfunc)
     4567    0.042    0.000    0.233    0.000 _polyint.py:112(_set_yi)
     4567    0.039    0.000    0.039    0.000 {built-in method numpy.core._multiarray_umath.interp}
    18266    0.036    0.000    0.036    0.000 fromnumeric.py:70(<dictcomp>)
     4567    0.035    0.000    0.374    0.000 _polyint.py:104(_reshape_yi)
     4567    0.034    0.000    0.735    0.000 _polyint.py:56(__call__)
     4567    0.034    0.000    0.053    0.000 _interpolate.py:318(_check_broadcast_up_to)
     9134    0.032    0.000    0.124    0.000 {method 'any' of 'numpy.ndarray' objects}
     4567    0.031    0.000    0.136    0.000 function_base.py:1432(interp)
    18266    0.030    0.000    0.030    0.000 core.py:822(<dictcomp>)
     4568    0.030    0.000    0.052    0.000 fromnumeric.py:1755(ravel)
    18280    0.028    0.000    0.028    0.000 _collections_abc.py:802(__init__)
    18266    0.028    0.000    0.028    0.000 fromnumeric.py:2300(_any_dispatcher)
     4567    0.028    0.000    0.028    0.000 {method 'argsort' of 'numpy.ndarray' objects}
     9134    0.027    0.000    0.041    0.000 numeric.py:1391(<listcomp>)
     4567    0.026    0.000    0.213    0.000 _polyint.py:87(_prepare_x)
     4567    0.026    0.000    0.177    0.000 _polyint.py:132(_set_dtype)
    18266    0.026    0.000    0.026    0.000 core.py:826(<dictcomp>)
     4568    0.026    0.000    0.026    0.000 {method 'take' of 'numpy.ndarray' objects}
     9134    0.025    0.000    0.036    0.000 _interpolate.py:335(_do_extrapolate)
    18280    0.024    0.000    0.024    0.000 {built-in method builtins.iter}
     4567    0.024    0.000    0.123    0.000 <__array_function__ internals>:177(argsort)
     4568    0.022    0.000    0.113    0.000 <__array_function__ internals>:177(take)
     4567    0.022    0.000    0.320    0.000 <__array_function__ internals>:177(moveaxis)
     4568    0.021    0.000    0.097    0.000 <__array_function__ internals>:177(ravel)
     4567    0.021    0.000    0.180    0.000 <__array_function__ internals>:177(interp)
    13701    0.021    0.000    0.021    0.000 multiarray.py:148(concatenate)
     4567    0.020    0.000    0.253    0.000 _polyint.py:49(__init__)
     9138    0.020    0.000    0.059    0.000 fitting.py:1611(<genexpr>)
     4567    0.020    0.000    0.060    0.000 <__array_function__ internals>:177(iscomplexobj)
     4567    0.018    0.000    0.074    0.000 fromnumeric.py:1012(argsort)
     4567    0.017    0.000    0.198    0.000 _interpolate.py:615(_call_linear_np)
     4567    0.017    0.000    0.029    0.000 _polyint.py:93(_finish_y)
     4568    0.016    0.000    0.068    0.000 fromnumeric.py:93(take)
     4567    0.015    0.000    0.021    0.000 type_check.py:303(iscomplexobj)
     9134    0.013    0.000    0.013    0.000 {built-in method _operator.index}
     9134    0.013    0.000    0.013    0.000 {built-in method numpy.core._multiarray_umath.normalize_axis_index}
     4568    0.013    0.000    0.019    0.000 _base.py:1291(isspmatrix)
     4567    0.013    0.000    0.013    0.000 {built-in method builtins.sorted}
     4568    0.012    0.000    0.018    0.000 core.py:6377(isMaskedArray)
     4569    0.012    0.000    0.012    0.000 {method 'transpose' of 'numpy.ndarray' objects}
     4567    0.008    0.000    0.008    0.000 numeric.py:1466(<listcomp>)
     4567    0.007    0.000    0.007    0.000 {method 'insert' of 'list' objects}
     4567    0.007    0.000    0.007    0.000 numeric.py:1400(_moveaxis_dispatcher)
     4568    0.007    0.000    0.007    0.000 fromnumeric.py:1751(_ravel_dispatcher)
     4568    0.007    0.000    0.007    0.000 fromnumeric.py:89(_take_dispatcher)
     4567    0.007    0.000    0.007    0.000 fromnumeric.py:1008(_argsort_dispatcher)
     4567    0.007    0.000    0.007    0.000 function_base.py:1428(_interp_dispatcher)
     4567    0.007    0.000    0.007    0.000 type_check.py:206(_is_type_dispatcher)
      555    0.003    0.000    0.011    0.000 utils.py:396(__setitem__)
      262    0.003    0.000    0.011    0.000 copy.py:200(_deepcopy_list)
      555    0.001    0.000    0.006    0.000 {built-in method builtins.setattr}
        3    0.001    0.000    0.016    0.005 utils.py:387(__init__)
      185    0.001    0.000    0.003    0.000 parameters.py:491(bounds)
       66    0.001    0.000    0.001    0.000 {method '__deepcopy__' of 'numpy.generic' objects}
        3    0.001    0.000    0.012    0.004 _collections_abc.py:933(update)
      555    0.001    0.000    0.001    0.000 __init__.py:1060(__setitem__)
      195    0.001    0.000    0.001    0.000 core.py:3296(__getattr__)
        1    0.001    0.001    0.001    0.001 core.py:1668(_has_units)
      185    0.000    0.000    0.001    0.000 parameters.py:460(fixed)
      185    0.000    0.000    0.001    0.000 parameters.py:477(tied)
        1    0.000    0.000    1.347    1.347 core.py:2186(copy)
      185    0.000    0.000    0.000    0.000 parameters.py:467(tied)
      185    0.000    0.000    0.000    0.000 parameters.py:485(bounds)
      185    0.000    0.000    0.000    0.000 parameters.py:453(fixed)
      185    0.000    0.000    0.000    0.000 {built-in method builtins.callable}
        1    0.000    0.000  299.209  299.209 _minpack_py.py:279(leastsq)
        1    0.000    0.000  300.610  300.610 helpers.py:105(fit_spectrum)
        1    0.000    0.000  300.609  300.609 fitting.py:1103(__call__)
        2    0.000    0.000    0.000    0.000 iostream.py:508(write)
        1    0.000    0.000  300.610  300.610 {built-in method builtins.exec}
        6    0.000    0.000    0.000    0.000 numeric.py:289(full)
        1    0.000    0.000    0.000    0.000 socket.py:480(send)
        1    0.000    0.000    0.017    0.017 fitting.py:1687(_validate_constraints)
        6    0.000    0.000    0.000    0.000 <__array_function__ internals>:177(copyto)
        1    0.000    0.000  300.610  300.610 fitting.py:171(wrapper)
        1    0.000    0.000    0.000    0.000 iostream.py:208(schedule)
        1    0.000    0.000    0.000    0.000 function_base.py:537(asarray_chkfinite)
        2    0.000    0.000    0.000    0.000 {built-in method numpy.zeros}
        1    0.000    0.000    0.000    0.000 twodim_base.py:370(tri)
        1    0.000    0.000    0.000    0.000 {method 'outer' of 'numpy.ufunc' objects}
        3    0.000    0.000    0.012    0.004 __init__.py:1043(__init__)
        3    0.000    0.000    0.000    0.000 quantity.py:854(to_value)
        2    0.000    0.000    0.000    0.000 iostream.py:421(_is_master_process)
        1    0.000    0.000  300.610  300.610 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 twodim_base.py:161(eye)
        1    0.000    0.000    1.364    1.364 fitting.py:1713(_validate_model)
        2    0.000    0.000    0.000    0.000 core.py:1233(sync_constraints)
        1    0.000    0.000    0.000    0.000 twodim_base.py:491(triu)
        2    0.000    0.000    0.000    0.000 twodim_base.py:32(_min_int)
        1    0.000    0.000    0.065    0.065 _minpack_py.py:22(_check_func)
        2    0.000    0.000    0.001    0.000 <__array_function__ internals>:177(dot)
        1    0.000    0.000    0.000    0.000 threading.py:1126(is_alive)
        1    0.000    0.000    0.000    0.000 fitting.py:1544(_convert_input)
        2    0.000    0.000    0.000    0.000 iostream.py:440(_schedule_flush)
        2    0.000    0.000    0.000    0.000 <__array_function__ internals>:177(transpose)
        2    0.000    0.000    0.000    0.000 {built-in method numpy.arange}
        6    0.000    0.000    0.000    0.000 multiarray.py:1071(copyto)
        3    0.000    0.000    0.000    0.000 {method 'view' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 threading.py:1059(_wait_for_tstate_lock)
        1    0.000    0.000    0.000    0.000 blas.py:383(getter)
        1    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:1033(_handle_fromlist)
        1    0.000    0.000    0.000    0.000 shape_base.py:23(atleast_1d)
        2    0.000    0.000    0.000    0.000 fitting.py:1701(<genexpr>)
        2    0.000    0.000    0.000    0.000 {built-in method posix.getpid}
        2    0.000    0.000    0.000    0.000 fromnumeric.py:601(transpose)
        1    0.000    0.000    0.000    0.000 {method 'all' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:177(where)
        1    0.000    0.000    0.000    0.000 iostream.py:97(_event_pipe)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:177(atleast_1d)
        2    0.000    0.000    0.000    0.000 getlimits.py:658(min)
        1    0.000    0.000    0.000    0.000 fitting.py:1055(__init__)
        2    0.000    0.000    0.000    0.000 getlimits.py:671(max)
        1    0.000    0.000    0.000    0.000 <__array_function__ internals>:177(triu)
        2    0.000    0.000    0.000    0.000 {method '__exit__' of '_thread.RLock' objects}
        2    0.000    0.000    0.000    0.000 {method 'write' of '_io.StringIO' objects}
        1    0.000    0.000    0.000    0.000 {method 'flatten' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'acquire' of '_thread.lock' objects}
        1    0.000    0.000    0.000    0.000 _methods.py:60(_all)
        2    0.000    0.000    0.000    0.000 multiarray.py:736(dot)
        1    0.000    0.000    0.000    0.000 threading.py:529(is_set)
        2    0.000    0.000    0.000    0.000 fromnumeric.py:597(_transpose_dispatcher)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'astype' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'append' of 'collections.deque' objects}
        1    0.000    0.000    0.000    0.000 core.py:3486(fittable)
        1    0.000    0.000    0.000    0.000 core.py:3387(eqcons)
        1    0.000    0.000    0.000    0.000 multiarray.py:341(where)
        1    0.000    0.000    0.000    0.000 _misc.py:180(_datacopied)
        1    0.000    0.000    0.000    0.000 shape_base.py:19(_atleast_1d_dispatcher)
        1    0.000    0.000    0.000    0.000 twodim_base.py:432(_trilu_dispatcher)
        1    0.000    0.000    0.000    0.000 core.py:3395(ineqcons)
karllark commented 2 years ago

Just checking. Are you using the latest released version of astropy? Just checking as there was an astropy version around 4.0 that was quite slow for astropy.models.

jdtsmith commented 2 years ago

Using v5.0.3. I do think much of this time is under evaluate (model evaluation):

616674/4568    3.569    0.000  223.858    0.049 core.py:1030(_generic_evaluate)

This is good news, because there are several ways to speed up function eval.

jdtsmith commented 2 years ago

OK I decided to test the PAHFIT classic model function evaluation all by itself.

Here's Python vs. IDL PAHFIT timings (per model function call), simply evaluating the model with wavelength points uniformly spaced between 5 and 25µm, for different numbers of points.

You can see the issue right away. We have a huge amount of overhead in any model evaluation, even for a few points. Even at many wavelength samples (10k) the current function time only starts rising slightly.

pahfit_functime

jdtsmith commented 2 years ago

Updated with more points. Python eventually gets into the linear regime, at about 50% of IDL speed.

image
karllark commented 2 years ago

I would guess that the python may speed up significantly if we wrote a custom astropy model instead of using a Compound Model based on many individual models. This is how we started writing the python version, but then switched to Compound as it was conceptually easier to get working and to extend.

jdtsmith commented 2 years ago

Agreed that seems to be most of the overhead, also in the profile above.