Closed getsanjeev closed 5 years ago
@ebuchlin's code result
```
Total time running mgn: 8.29600000381 seconds
1178 function calls in 8.298 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.005 0.005 8.296 8.296 <ipython-input-7-ae2d858324e4>:6(function_timer)
1 0.919 0.919 8.290 8.290 <ipython-input-8-ae9334ae9d95>:4(mgn)
1 0.001 0.001 8.298 8.298 <string>:1(<module>)
20 0.000 0.000 0.001 0.000 _methods.py:31(_sum)
20 0.000 0.000 0.000 0.000 _ni_support.py:38(_extend_mode_to_code)
30 0.000 0.000 0.000 0.000 _ni_support.py:55(_normalize_sequence)
30 0.000 0.000 0.001 0.000 _ni_support.py:71(_get_output)
20 0.000 0.000 0.000 0.000 _ni_support.py:91(_check_axis)
20 0.000 0.000 0.003 0.000 _polybase.py:246(__init__)
20 0.001 0.000 0.004 0.000 _polybase.py:290(__call__)
1 0.000 0.000 0.000 0.000 common.py:75(wake)
20 0.001 0.000 7.061 0.353 filters.py:120(correlate1d)
20 0.001 0.000 0.008 0.000 filters.py:201(_gaussian_kernel1d)
20 0.001 0.000 7.070 0.353 filters.py:222(gaussian_filter1d)
10 0.001 0.000 7.072 0.707 filters.py:275(gaussian_filter)
1 0.000 0.000 0.000 0.000 ioloop.py:928(add_callback)
2 0.000 0.000 0.000 0.000 iostream.py:227(_is_master_process)
2 0.000 0.000 0.000 0.000 iostream.py:240(_schedule_flush)
2 0.000 0.000 0.001 0.000 iostream.py:308(write)
1 0.000 0.000 0.000 0.000 nddata.py:240(data)
90 0.000 0.000 0.000 0.000 numeric.py:414(asarray)
1 0.000 0.000 0.006 0.006 numeric.py:86(zeros_like)
20 0.002 0.000 0.002 0.000 polynomial.py:687(polyval)
20 0.001 0.000 0.002 0.000 polyutils.py:124(as_series)
20 0.000 0.000 0.000 0.000 polyutils.py:291(mapparms)
1 0.000 0.000 0.000 0.000 stack_context.py:253(wrap)
40 0.000 0.000 0.000 0.000 type_check.py:237(iscomplexobj)
20 0.000 0.000 0.001 0.000 type_check.py:544(common_type)
2 0.000 0.000 0.000 0.000 utf_8.py:15(decode)
2 0.000 0.000 0.000 0.000 {_codecs.utf_8_decode}
40 0.000 0.000 0.000 0.000 {any}
31 0.000 0.000 0.000 0.000 {hasattr}
72 0.000 0.000 0.000 0.000 {isinstance}
60 0.000 0.000 0.000 0.000 {issubclass}
101 0.000 0.000 0.000 0.000 {len}
20 0.000 0.000 0.000 0.000 {max}
1 0.000 0.000 0.000 0.000 {method 'append' of 'list' objects}
1 0.293 0.293 0.293 0.293 {method 'clip' of 'numpy.ndarray' objects}
20 0.000 0.000 0.000 0.000 {method 'copy' of 'numpy.ndarray' objects}
2 0.000 0.000 0.000 0.000 {method 'decode' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
20 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
20 0.000 0.000 0.000 0.000 {method 'reduce' of 'numpy.ufunc' objects}
20 0.000 0.000 0.000 0.000 {method 'reshape' of 'numpy.ndarray' objects}
1 0.000 0.000 0.000 0.000 {method 'send' of '_socket.socket' objects}
20 0.000 0.000 0.001 0.000 {method 'sum' of 'numpy.ndarray' objects}
2 0.000 0.000 0.000 0.000 {method 'write' of '_io.StringIO' objects}
20 0.000 0.000 0.000 0.000 {min}
2 0.000 0.000 0.000 0.000 {nt.getpid}
20 0.000 0.000 0.000 0.000 {numpy.core.multiarray.arange}
150 0.001 0.000 0.001 0.000 {numpy.core.multiarray.array}
1 0.006 0.006 0.006 0.006 {numpy.core.multiarray.copyto}
1 0.000 0.000 0.000 0.000 {numpy.core.multiarray.empty_like}
31 0.001 0.000 0.001 0.000 {numpy.core.multiarray.zeros}
40 0.000 0.000 0.000 0.000 {range}
20 7.059 0.353 7.059 0.353 {scipy.ndimage._nd_image.correlate1d}
1 0.000 0.000 0.000 0.000 {thread.get_ident}
2 0.000 0.000 0.000 0.000 {time.time}
```
@Cadair 's implementation
```
1387 function calls in 8.735 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.611 1.611 8.731 8.731 <ipython-input-6-c8ea54dbd3ae>:6(multiscale_gaussian)
1 0.004 0.004 8.735 8.735 <string>:1(<module>)
1 0.000 0.000 0.011 0.011 _methods.py:25(_amax)
1 0.000 0.000 0.010 0.010 _methods.py:28(_amin)
24 0.000 0.000 0.001 0.000 _methods.py:31(_sum)
24 0.000 0.000 0.000 0.000 _ni_support.py:38(_extend_mode_to_code)
36 0.000 0.000 0.000 0.000 _ni_support.py:55(_normalize_sequence)
36 0.001 0.000 0.001 0.000 _ni_support.py:71(_get_output)
24 0.000 0.000 0.000 0.000 _ni_support.py:91(_check_axis)
24 0.000 0.000 0.003 0.000 _polybase.py:246(__init__)
24 0.001 0.000 0.005 0.000 _polybase.py:290(__call__)
24 0.001 0.000 7.085 0.295 filters.py:120(correlate1d)
24 0.002 0.000 0.010 0.000 filters.py:201(_gaussian_kernel1d)
24 0.001 0.000 7.096 0.296 filters.py:222(gaussian_filter1d)
12 0.001 0.000 7.098 0.592 filters.py:275(gaussian_filter)
1 0.000 0.000 0.000 0.000 nddata.py:240(data)
1 0.000 0.000 0.000 0.000 numeric.py:148(ones)
108 0.000 0.000 0.001 0.000 numeric.py:414(asarray)
24 0.002 0.000 0.003 0.000 polynomial.py:687(polyval)
24 0.001 0.000 0.003 0.000 polyutils.py:124(as_series)
24 0.000 0.000 0.000 0.000 polyutils.py:291(mapparms)
48 0.000 0.000 0.001 0.000 type_check.py:237(iscomplexobj)
24 0.000 0.000 0.001 0.000 type_check.py:544(common_type)
48 0.000 0.000 0.000 0.000 {any}
36 0.000 0.000 0.000 0.000 {hasattr}
84 0.000 0.000 0.000 0.000 {isinstance}
72 0.000 0.000 0.000 0.000 {issubclass}
122 0.000 0.000 0.000 0.000 {len}
24 0.000 0.000 0.000 0.000 {max}
24 0.000 0.000 0.000 0.000 {method 'copy' of 'numpy.ndarray' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
24 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
1 0.000 0.000 0.011 0.011 {method 'max' of 'numpy.ndarray' objects}
1 0.000 0.000 0.010 0.010 {method 'min' of 'numpy.ndarray' objects}
26 0.022 0.001 0.022 0.001 {method 'reduce' of 'numpy.ufunc' objects}
24 0.000 0.000 0.000 0.000 {method 'reshape' of 'numpy.ndarray' objects}
24 0.000 0.000 0.001 0.000 {method 'sum' of 'numpy.ndarray' objects}
24 0.000 0.000 0.000 0.000 {min}
24 0.000 0.000 0.000 0.000 {numpy.core.multiarray.arange}
180 0.001 0.000 0.001 0.000 {numpy.core.multiarray.array}
1 0.000 0.000 0.000 0.000 {numpy.core.multiarray.copyto}
4 0.000 0.000 0.000 0.000 {numpy.core.multiarray.empty}
36 0.000 0.000 0.000 0.000 {numpy.core.multiarray.zeros}
48 0.000 0.000 0.000 0.000 {range}
24 7.083 0.295 7.083 0.295 {scipy.ndimage._nd_image.correlate1d}
1 0.000 0.000 0.000 0.000 {zip}
```
First we want to make sure the output is correct before we worry about CPU and RAM usage.
ok, I am checking on it
Testing on AIA_171_IMAGE
So there is a difference. We need to figure out why, which result is better, and how it all compares to the original paper and the IDL code.
There is at least the comment on the sunpy PR about one error in my implementation. Also, from that same comment thread, there is an apparent discrepancy between the IDL code and the paper.
This one, I believe: https://github.com/sunpy/sunpy/pull/1899#discussion_r104646604
So after some tweaking of @Cadair's code, I made something like this (setting the weight to a very small fractional value). There was also an error where the weights were applied inside the arctan argument rather than to its result via the `conv` variable; I corrected that as suggested by ebuchlin in the sunpy PR.
Now it looks something like this:
I am not sure we want to go down the road of tweaking the code here or there without a good reason to do so. Ideally we want to implement the algorithm as stated in the paper.
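For reference, the algorithm as stated in the paper (Morgan & Druckmüller 2014, as I read it) can be sketched roughly as below. This is a minimal illustration, not either of the implementations discussed in this thread; the function name, parameter defaults (`sigmas`, `k`, `gamma`, `h`), and the final combination step are my assumptions from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mgn_sketch(image, sigmas=(1.25, 2.5, 5.0, 10.0, 20.0, 40.0),
               k=0.7, gamma=3.2, h=0.7):
    """Rough sketch of multi-scale Gaussian normalization (MGN)."""
    # Normalize the input to [0, 1].
    image = (image - image.min()) / (image.max() - image.min())

    # For each scale: local mean subtraction, local sigma division,
    # then arctan(k * C_i) to suppress extreme values.
    components = []
    for sigma in sigmas:
        smoothed = gaussian_filter(image, sigma)
        diff = image - smoothed
        local_sigma = np.sqrt(gaussian_filter(diff ** 2, sigma))
        # Small epsilon guards against division by zero in flat regions.
        components.append(np.arctan(k * diff / (local_sigma + 1e-12)))

    # Equal weights g_i ("a straightforward mean of the locally
    # normalised Ci", which the paper allows for most purposes).
    mean_component = np.mean(components, axis=0)

    # Combine with the globally gamma-transformed image C'_g.
    global_gamma = image ** (1.0 / gamma)
    return h * global_gamma + (1.0 - h) * mean_component
```

Whether `h` weights the gamma-transformed image or the locally normalized mean is exactly the point debated in this thread (ebuchlin uses w = 1 - h), so treat the combination step above as one reading of the paper, not a settled answer.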
All I did was set the weight, instead of it being 1 as previously.
Also, as ebuchlin noted here (https://github.com/sunpy/sunpy/pull/1899#discussion_r104646604), "the weights are applied out of the arctan, not in the arctan argument."
I made the following changes. Earlier:

```python
conv *= k
conv *= weight
np.arctan(conv, out=conv)
```

Updated to:

```python
conv *= k
np.arctan(conv, out=conv)
conv *= weight
```

No other additional change.
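The effect of this reordering can be shown numerically: because arctan is nonlinear, weighting inside the arctan argument is not equivalent to weighting its result. The values below are arbitrary stand-ins, not taken from the actual code.

```python
import numpy as np

conv = np.array([0.5, 1.0, 2.0])  # arbitrary stand-in values
k, weight = 0.7, 0.1

# Old order: weight applied inside the arctan argument.
inside = np.arctan(k * conv * weight)
# New order: weight applied to the arctan result.
outside = weight * np.arctan(k * conv)

print(np.allclose(inside, outside))  # False: the two orders differ
```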
But this comes back to the original problem that the paper algorithm and the IDL algorithm do not match.
umm, you are right. Is it because of the noise not getting filtered out in the right way?
We don't know. We need someone to sit down with both and figure it out.
> But this comes back to the original problem that the paper algorithm and the IDL algorithm do not match.
What I would like to understand is how the IDL code differs from the paper. If we can summarise this concisely I will email Huw and ask him about it.
Given the change that @prateekiiest made in my version of the code, I would have thought that it matches the paper. Perhaps that would be a good place to start: @prateekiiest, sit down with the paper and both versions of the Python code and play a very detailed game of spot-the-algorithmic-difference.
I think comparing speed of implementation is not required at this time. Let's get the code right.
I think it would be useful to identify the AIA image used in the paper. Note that the paper gets the time of the image wrong - there were no AIA images in 2005. So I think contacting Huw to get the actual date and time of that image would be a good start. We can do visual comparisons at least.
The IDL code looks somewhat confusing in the end portion where it calculates the final image. The IDL code first takes the mean and then applies arctan to the overall mean at the end, whereas the paper says arctan should first be applied to each scale's values and then the mean taken over those arctan values. This part of @Cadair's code follows the paper, but not the IDL code.
Secondly, the IDL code uses wrong constant values for h and gamma, as opposed to what is mentioned in the paper. That is, however, a minor issue (and not a code implementation defect).
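A toy example makes the mean/arctan ordering difference concrete: because arctan is nonlinear, arctan of the mean (the IDL order) is not the mean of the arctans (the paper order). The numbers are arbitrary stand-ins for per-scale values at one pixel.

```python
import numpy as np

scales = np.array([0.2, 1.5, 4.0])  # arbitrary per-scale values

idl_order = np.arctan(scales.mean())    # mean first, then arctan
paper_order = np.arctan(scales).mean()  # arctan per scale, then mean

print(idl_order, paper_order)  # the two values differ
```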
I would be providing this comment link in my application.
Some of the issues associated with the current implementation in ebuchlin's code are as follows:
- The function does not take user input for the minimum and maximum values of the input image.
- Multi-processing is not supported in the current code.

I'll ask Huw what was intended.
I have gone through the above conversation as well as the one on the PR, and went through both Cadair's and ebuchlin's implementations. Both were compared against what is written in the paper. The differences are listed below:

- `conv -= data` should be `data -= conv`.
- `conv **= 2` is done to calculate `sigmaw`, but later `conv /= sigmaw` is done at line 81 to obtain the result. This is wrong: it effectively divides (Bkw - B)^2 by sigmaw instead of dividing (B - Bkw) by sigmaw.
- `arctan` should be applied before multiplying by the weights.
- One implementation uses `scipy.ndimage.gaussian_filter` with the filter `mode` left at its default value `reflect`; the other uses `mode` set to `nearest`, which I feel is more accurate.
- The `return` statement is not at all according to the paper: the weighted mean has not been taken in the way the paper describes. `arctan` is applied after taking the weighted mean of the images obtained from the various Gaussian kernels, whereas `arctan` should be applied first to the images from the individual kernels and then their weighted mean taken.
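The first two fixes above (subtraction order, and dividing the unsquared difference) can be sketched in isolation. The names mirror the discussion, not the actual PR code, and `sigmaw` is simplified to a scalar RMS here rather than a locally smoothed estimate.

```python
import numpy as np

data = np.array([3.0, 5.0, 7.0])  # stand-in for the image values
conv = np.array([4.0, 4.0, 4.0])  # stand-in for the smoothed image

diff = data - conv                    # B - B_kw, not conv - data
sigmaw = np.sqrt(np.mean(diff ** 2))  # squared copy only feeds sigma_w
normalized = diff / sigmaw            # divide the difference, not its square
print(normalized)
```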
@nabobalis These were the discrepancies which I was able to find out. How should I proceed further?
That is some thorough work. Good job.
So I guess the first step would be to correct the first (Cadair's) implementation to match the paper, then see whether the changes made by Eric are useful, and then incorporate them into the same code in some manner.
Regarding the IDL code and the paper: if that is the only change, we can try to see whether that change actually makes the result better. Otherwise we should email the original author and ask for their opinion on this change.
But I would hold off on doing anything. This is getting close to the GSoC project, and you've done enough regarding this project before GSoC.
Thanks for the comparison! Just a few comments about what I understand from what my code does:
> The image is normalized before being processed which has not been mentioned in the paper.

Yes, but normalization from [a0, a1] to [0, 1] here is equivalent to the normalization that is done later as described in the paper in Eq. (1) (for one Gaussian kernel) and Eq. (4): in Eq. (1), with B' = (B - a0)/(a1 - a0), we can substitute B by a0 + (a1 - a0)B', and we get exactly the same equation for C as a function of B' as we get as a function of B. This is because of the subtractions in Eqs. (1-2) (the offset a0 disappears) and the linearity of the convolution (the same factor (a1 - a0) appears in the numerator and denominator of Eq. (1)). Also, Eq. (4) can be written C'g = B'^(1/γ).
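This equivalence is easy to verify numerically. The sketch below (arbitrary random image and σ, my own helper name `local_norm`) applies the local normalization of Eq. (1) to B and to B' = (B - a0)/(a1 - a0) and compares the results.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
B = rng.uniform(100.0, 4000.0, size=(64, 64))  # arbitrary test image
a0, a1 = B.min(), B.max()
Bp = (B - a0) / (a1 - a0)  # normalized image B'

def local_norm(img, sigma=4.0):
    # Eq. (1) for one kernel: (B - <B>) / local sigma.
    smoothed = gaussian_filter(img, sigma)
    diff = img - smoothed
    local_sigma = np.sqrt(gaussian_filter(diff ** 2, sigma))
    return diff / local_sigma

# The offset a0 cancels in the subtraction; the scale (a1 - a0)
# cancels between numerator and denominator.
print(np.allclose(local_norm(B), local_norm(Bp)))  # True
```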
> The weights gi as mentioned in the paper have not been used.

Yes, I followed "For most purposes, the gi can be set equal for all scales so that a straightforward mean of the locally normalised Ci is made." Such weights can easily be added.
> No global gamma-transformed image C'g has been calculated, though this is clearly mentioned in the paper.

It is computed as `image ** (1. / gamma)` in the `return` statement, where `image` is the normalized image (or B' above).
> At the end, the original image is used at the time of taking the weighted sum instead of the global gamma-transformed image.

No, it is the normalized image raised to the power 1/γ, which is the same as the gamma-transformed original image (see above).
> The `return` statement is not at all according to the paper. The weighted mean has not been taken in the way it was written in the paper.

Yes, there are no weights (see above). Also, I use w = 1 - h, as I wanted this weight w to represent the strength of the MGN, not the weight of the gamma-transformed image (as h does). Apart from that and the output normalization to [b0, b1], I don't see a difference with the paper.
> The values of a[0], a[1], b[0], b[1] have been assumed.

There are default parameter values, but they can be changed by passing arguments to the function.
> The calculated values after applying the transform have been brought back into (0, 1) from (-π/2, π/2). This is nowhere mentioned in the paper. See lines 47, 48.

Yes, this is to have the C'i on the same range [0, 1] as C'g, so that the value of w (or h) is meaningful, and so that the output range is [b0, b1] as expected. But this is only equivalent to taking gi = 1/π for all i.
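The rescaling in question can be written in one line: mapping the arctan output linearly from (-π/2, π/2) onto (0, 1) absorbs a factor 1/π (plus an offset) into each gi. The input values below are arbitrary.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 5)         # arbitrary k * C_i values
ci = np.arctan(x)                       # in (-pi/2, pi/2)
ci_rescaled = (ci + np.pi / 2) / np.pi  # mapped onto (0, 1)
print(ci_rescaled)
```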
@ebuchlin Thanks for making things clear. I will have to go through your code again. It feels as if I somehow missed those small details.
> That is some thorough work. Good job.
> So I guess the first step would be to correct the first (Cadair's) implementation to match the paper, then see whether the changes made by Eric are useful, and then incorporate them into the same code in some manner.
> Regarding the IDL code and the paper: if that is the only change, we can try to see whether that change actually makes the result better. Otherwise we should email the original author and ask for their opinion on this change.
> But I would hold off on doing anything. This is getting close to the GSoC project, and you've done enough regarding this project before GSoC.
I don't mind working on this even if I am not selected for GSOC. If you insist I will look into something else.
I won't insist, nor can I stop you working on this. I do not want you to feel hard done by if you do carry on and we don't select this project.
Okay. Hoping that I would be selected for GSOC, I will pick up this issue once the results are announced. Meanwhile I will work on my proposal.
@vatch123 That's a really nice breakdown, thanks.
The original attempt is located here (https://github.com/sunpy/sunpy/pull/1899). Please note there is at least one issue with that code (https://github.com/sunpy/sunpy/pull/1899#discussion_r104646604).
This needs to be moved into this module and have the following done:
Base
Extra