Performance v0.14 vs. pyston

mkuemmel commented 2 years ago

I ran the deep fields from the morphology challenge in different configurations with v0.14 and with pyston. Let's discuss during tomorrow's telecon.

single sersic (levmar, 56c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00  fails with 163849 after 1:28    170887    00:49:58
field_01  171719    01:37:46              171725    00:50:40
field_02  171091    01:38:03              171086    00:50:49
field_03  169864    01:38:04              169878    00:42:03
field_04  fails with 146954 after 1:16    170891    00:50:58

disk+bulge (levmar, 56c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00  137361    01:57:12              137367    01:25:01
field_01  137868    01:57:22              segfaults with 100554 after 01:01:43
field_02  137886    01:57:40              137881    01:41:10
field_03  136500    02:05:24              136485    01:28:30
field_04  137150    03:48:50              137150    02:59:09

disk+bulge (levmar, 28c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00           01:46:09                       01:46:54
field_01           01:48:50             segfaults after 01:15:27
field_02           01:47:35                       01:50:43
field_03           01:52:53                       01:49:57
field_04           03:08:08                       03:36:29

disk+bulge (levmar, 14c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00           03:11:56                       03:27:53
field_01           03:15:37             segfaults after 02:30:08
field_02           03:15:56                       03:30:04
field_03           03:15:53                       03:27:34
field_04           05:09:34                       05:16:32

disk+sersic (levmar, 56c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00  180898    03:03:38              180898    02:42:13
field_01  fails with 181037 after 2:15    181034    04:48:04
field_02  180925    03:40:35              180922    03:13:29
field_03  179515    03:29:19              179518    02:43:56
field_04  fails with 180659 after 2:29    segfaults with 180656 after 2:19

real morphology (levmar, 56c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00  106712    01:01:43            106708    00:31:34
field_01  107819    01:03:40            107818    00:32:46
field_02  107023    01:01:10            107024    00:32:04
field_03  105682    01:00:34            105684    00:31:44
field_04  106811    01:02:03            106812    00:31:55

disk+bulge (gsl, 56c)

0.14      nobjects  time         pyston   nobjects  time
--------------------------------------------------------------------------
field_00  137361    03:36:42              137367    02:50:33
field_01  137868    24:05:07              137862    21:23:32
field_02  137886    04:08:21              137881    03:02:58
field_03  136500    04:12:04              136485    03:14:22
field_04  137150    12:24:19              137150    11:21:13
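
To compare the two builds more directly, the wall-clock times can be converted to seconds and expressed as a ratio. The following is only a minimal sketch using the real morphology (levmar, 56c) times from the table above; the helper names `to_seconds` and `speedups` are made up for this example and are not part of SourceXtractor++.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS wall-clock string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (v0.14 time, pyston time), copied from the real morphology (levmar, 56c) table.
REAL_MORPHOLOGY = {
    "field_00": ("01:01:43", "00:31:34"),
    "field_01": ("01:03:40", "00:32:46"),
    "field_02": ("01:01:10", "00:32:04"),
    "field_03": ("01:00:34", "00:31:44"),
    "field_04": ("01:02:03", "00:31:55"),
}


def speedups(table):
    """Return the v0.14 time divided by the pyston time for each field."""
    return {
        field: to_seconds(old) / to_seconds(new)
        for field, (old, new) in table.items()
    }


for field, ratio in speedups(REAL_MORPHOLOGY).items():
    print(f"{field}: {ratio:.2f}x")
```

For these five fields the ratio lies between 1.9 and 2.0, so the pyston runs take roughly half the time of v0.14.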

mkuemmel commented 2 years ago

Here is the scalability of SourceXtractor++ with pyston, computed for the disk+bulge model from above (figure: SEPP_scaling). There is a broad shoulder from ~15-28 cores with very good scaling. The processing node has 28 physical cores; using more than those (i.e. hyperthreading) reduces the efficiency.
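
The same trend can be read off the pyston disk+bulge (levmar) timings for field_00 in the tables above. The sketch below computes a parallel efficiency relative to the 14c run. It assumes that the `14c`/`28c`/`56c` labels are thread counts and that the 14c run is a fair baseline; the plot may use a different reference, and the function names are illustrative only.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS wall-clock string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# pyston, disk+bulge (levmar), field_00: thread count -> wall-clock time,
# copied from the 14c, 28c and 56c tables.
PYSTON_FIELD_00 = {14: "03:27:53", 28: "01:46:54", 56: "01:25:01"}


def parallel_efficiency(timings, baseline):
    """Speedup relative to the baseline run divided by the increase in threads."""
    base_time = to_seconds(timings[baseline])
    return {
        threads: (base_time / to_seconds(hms)) / (threads / baseline)
        for threads, hms in timings.items()
    }


for threads, eff in parallel_efficiency(PYSTON_FIELD_00, baseline=14).items():
    print(f"{threads} threads: efficiency {eff:.2f}")
```

With these numbers the efficiency stays at about 0.97 when going from 14 to 28 and drops to about 0.61 at 56, in line with the hyperthreading remark.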

mkuemmel commented 2 years ago

Here are the results of a more extensive investigation using more data, two different models (sersic and disk+bulge) and two different SE++ versions (figure: performance).

marcschefer commented 2 years ago

I guess tests are still underway?

mkuemmel commented 2 years ago

Yes, I ran over the weekend with the newest version, which also includes the iterative fitting (and of course pyston).

mkuemmel commented 2 years ago

Here is a comparison of the SE++ performance of version 0.14 vs. 0.16 (December '21) with pyston and meta iterations (figure: pyston_iter_perform).

Looks like the iterative fitting does cost some performance. On the other hand, the example data is not very clumpy and not many priors are being used. 0.14 still had some bugs, and the fitting of large groups (>2GB RAM) is not possible or shaky.

The performance values reported here are no showstoppers for the next release.

mkuemmel commented 2 years ago

Well, this can be closed now.

astrorama / SourceXtractorPlusPlus