astrofrog opened 9 years ago
Just to be clear, I believe we should be able to do better than `resample`, but I was just surprised that a naive Numpy implementation was faster.
I looked into this a bit more and for small arrays the current implementation is the most efficient. For large arrays (>10000x10000) it looks like the following implementation is more efficient:
```python
def downsample(image, factor):
    # New output dimensions after downsampling by an integer factor.
    nx_new = image.shape[1] // factor
    ny_new = image.shape[0] // factor
    # Trim the image so each axis is an exact multiple of the factor.
    nx_init = nx_new * factor
    ny_init = ny_new * factor
    # Split each axis into (blocks, factor), then average within each block.
    return (image[:ny_init, :nx_init]
            .reshape((ny_new, factor, nx_new, factor))
            .mean(axis=3).mean(axis=1))
```
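As a quick sanity check (the array contents below are just illustrative), the reshape trick reproduces the block means of the trimmed image:

```python
import numpy as np

# Illustrative check: a 5x7 array downsampled by 2 is trimmed to 4x6,
# and each output pixel is the mean of one 2x2 block.
image = np.arange(35, dtype=float).reshape(5, 7)
out = downsample(image, 2)
assert out.shape == (2, 3)
assert np.isclose(out[0, 0], image[0:2, 0:2].mean())
assert np.isclose(out[1, 2], image[2:4, 4:6].mean())
```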
The timings (current Cython implementation vs. pure Numpy; speedup = current / pure-numpy) are:
```
image_size  factor       current    pure-numpy   speedup
        10       2      0.004103      0.046804  0.087662
        10       3      0.004006      0.047328  0.084641
        10       5      0.004108      0.045531  0.090229
        10      10      0.003982      0.044102  0.090292
        10      20      0.004007      0.042776  0.093671
       100       2      0.023680      0.180519  0.131176
       100       3      0.020192      0.177550  0.113724
       100       5      0.018659      0.110810  0.168384
       100      10      0.016670      0.082870  0.201162
       100      20      0.016530      0.069902  0.236468
      1000       2      7.999492     11.448002  0.698768
      1000       3      3.766799      8.132291  0.463190
      1000       5      3.664613      5.139089  0.713086
      1000      10      2.858496      2.900219  0.985614
      1000      20      2.293301      1.844788  1.243124
     10000       2   2005.700111   1588.447094  1.262680
     10000       3    900.852919   1007.672071  0.893994
     10000       5    669.445992    625.443220  1.070355
     10000      10    476.181030    328.037024  1.451608
     10000      20    360.677004    220.120907  1.638540
     15000       2   4883.916855   3471.009970  1.407059
     15000       3   2229.454994   2352.994919  0.947497
     15000       5   1655.629158   1420.667887  1.165388
     15000      10   1157.596111    797.841072  1.450911
     15000      20    840.254068    500.336170  1.679379
     20000       2  11201.428175  12717.656136  0.880778
     20000       3   4786.141872   4208.330870  1.137302
     20000       5   3960.713148   2495.106936  1.587392
     20000      10   2380.532026   1388.871193  1.714005
     20000      20   1717.440844    853.933811  2.011211
```
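For reference, numbers like these can be generated with a small harness along the following lines (a sketch; the loop bounds mirror the table, and the commented-out `current_downsample` is a placeholder name for the existing Cython function, not a real import):

```python
import timeit

import numpy as np

# Sketch of a timing harness for the table above.
for size in (10, 100, 1000, 10000, 15000, 20000):
    image = np.random.random((size, size))
    for factor in (2, 3, 5, 10, 20):
        t_numpy = timeit.timeit(lambda: downsample(image, factor), number=3) / 3
        # t_current = timeit.timeit(lambda: current_downsample(image, factor), number=3) / 3
        print(size, factor, t_numpy)
```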
Basically the speedup is up to a factor of two, but only for very large images. We could add the pure-Numpy implementation as an alternative, or simply leave it out. I don't think an automated switchover between the two is necessarily sensible, because the crossover point might depend on the machine used; a sketch of an explicit opt-in follows below.
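If we did expose it as an alternative, a hypothetical keyword argument could let the caller choose explicitly instead of relying on an automatic heuristic (the `method` parameter and the wrapper name here are invented for illustration; the Cython branch is stubbed out):

```python
def downsample_any(image, factor, method="numpy"):
    # Hypothetical opt-in switch; "cython" would dispatch to the existing
    # compiled implementation, which is stubbed out in this sketch.
    if method == "numpy":
        return downsample(image, factor)
    if method == "cython":
        raise NotImplementedError("stand-in for the existing Cython code")
    raise ValueError(f"unknown method: {method!r}")
```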
The number of users that care about `downsample` speed is probably very small or zero (because basically any other image processing task is more complex and slower), so I wouldn't put effort into this. But of course, if anyone needs or wants to implement something that is fast for large arrays, pull requests are very welcome.
In APLpy I have a pure-Python (well, with Numpy) implementation of downsampling called `resample` (which I wrote a while ago), and it turns out it can actually be faster than the `downsample` here, which is surprising. I wonder if the order of the looping has something to do with it, or maybe the fact that we're still accessing a Numpy array from Cython.
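One cheap way to probe the loop-order idea (just a sketch; the array shape is arbitrary): on a C-contiguous array, reducing the innermost, contiguous axis first should be friendlier to the memory layout than reducing an outer axis first.

```python
import timeit

import numpy as np

# The same block mean computed in two reduction orders; note that the
# axis numbers shift after the first mean removes an axis.
a = np.random.random((500, 10, 500, 10))
inner_first = timeit.timeit(lambda: a.mean(axis=3).mean(axis=1), number=10)
outer_first = timeit.timeit(lambda: a.mean(axis=1).mean(axis=2), number=10)
print(inner_first, outer_first)
```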