JeschkeLab / DeerLab-Matlab

Data analysis and method development toolbox for dipolar EPR spectroscopy
MIT License
4 stars 2 forks source link

Support for distance vectors with non-uniform increment #83

Closed stestoll closed 4 years ago

stestoll commented 4 years ago

Is it possible to have DeerLab support r vectors with arbitrary (positive) increments?

For example, r = sqrt(linspace(1.5,6^2)).

Normalization of P in this case is easy, but it's unclear how to apply the delta-r scaling to the kernel matrix. K = K.*diff(r) doesn't work because diff(r) is one element too short.

luisfabib commented 4 years ago

Non-uniform distance vectors can be implemented in a very straightforward way:

Using trapz as integration function

The MATLAB trapz function allows the integration over non-uniform spaces, meaning that the Fredholm integral would need to be written as:

V = trapz(r,K.*P.',2)

This requires the dipolarkernel not to be normalized by dr and for the distance distribution to be normalized via P = P/trapz(r,P) instead of P = P/sum(P). This definition using the trapz function would also work for uniform spaces, without any difference.

Not content with this I managed to find a workaround:

Workaround

As it usually is, it is always good to go into the math of the problem. This is the equation for the trapezoidal integration over non-uniformly spaced points as given in the MATLAB documentation of the trapz function:

Annotation 2020-05-16 192836

Due to the apparently recursive nature of the method, i.e. the term r(k+1) - r(k), it might seem not possible to avoid the use of this formula to compute these integrals. However if one expands the product (r(k+1) - r(k))(f(k) + f(k+1)) in the equation above over the whole summation we obtain the following equation

CodeCogsEqn

meaning that if we have the function f(x) = K(t,r)P(r), then using the expanded equation above one can vectorize the equation such that by means of the following normalization vector

dr(2:end-1) = r(3:end) - r(1:end-2);
dr([1,end]) = r([2,end]) - r([1,end-1]);
dr = dr/2;

one can write the trapezoidal integration in matrix form via

K = K.*dr;
V = K*P;
luisfabib commented 4 years ago

Here is an example script which compares the old vs the new normalization mentioned above for a non-uniform and uniform distance distribution. This shows that the normalization above works as predicted.


clear,clc,clf

% Current normalization
%----------------------

t = linspace(0,5,300);
%Non-uniform
r = sqrt(linspace(1.5,6^2,800));
%Uniform
r2 = linspace(1,6,700);

% Normalization: P = P/sum(P)/dr
P = dd_gauss2(r,[4 0.95 0.5 2.5 0.5]);
P2 = dd_gauss2(r2,[4 0.95 0.5 2.5 0.5]);

% Normalization: K = K*dr
K = dipolarkernel(t,r);
K2 = dipolarkernel(t,r2);

% Integral transformation
V = K*P;
V2 = K2*P2;

subplot(221)
plot(r,P,'-',r2,P2,'--','LineWidth',1.5)
legend('P_{NU}','P_U')
axis tight, grid on
xlabel('r [nm]'),ylabel('P(r) [nm^{-1}]')
title('Before')
subplot(222)
plot(t,V,'-',t,V2,'--','LineWidth',1.5)
legend('K_{NU}P_{NU}','K_UP_U')
axis tight, grid on
xlabel('t [\mus]'),ylabel('V(t)')

% New normalization
%----------------------

t = linspace(0,5,300);
%Non-uniform
r = sqrt(linspace(1.5,8^2,550));
%Uniform
r2 = linspace(1,6,700);

dr = mean(diff(r));
dr2 = mean(diff(r2));

% Normalization: P = P/trapez(r,P)
P = dd_gauss2(r,[4 0.95 0.5 2.5 0.5])*dr;
P2 = dd_gauss2(r2,[4 0.95 0.5 2.5 0.5])*dr2;
P = P/trapz(r,P);
P2 = P2/trapz(r2,P2);

%Construct trapezoidal integration normalizaton vectors

%Non-uniform case:
dxtrapz = r;
dxtrapz(2:end-1) = r(3:end) - r(1:end-2);
dxtrapz([1,end]) = r([2,end]) - r([1,end-1]);
dxtrapz = dxtrapz/2;

%Non-uniform case:
dtrapz2 = r2;
dtrapz2(2:end-1) = r2(3:end) - r2(1:end-2);
dtrapz2([1,end]) = r2([2,end]) - r2([1,end-1]);
dtrapz2 = dtrapz2/2;

% New kernel normalization:
K = dipolarkernel(t,r)/dr.*dxtrapz;
K2 = dipolarkernel(t,r2)/dr2.*dtrapz2;

% Integral transformations
V  = K*P;
V2 = K2*P2;

subplot(223)
plot(r,P,'-',r2,P2,'--','LineWidth',1.5)
legend('P_{NU}','P_U')
axis tight, grid on
xlabel('r [nm]'),ylabel('P(r) [nm^{-1}]')
title('Now')
subplot(224)
plot(t,V,'-',t,V2,'--','LineWidth',1.5)
legend('K_{NU}P_{NU}','K_UP_U')
axis tight, grid on
xlabel('t [\mus]'),ylabel('V(t)')

untitled

stestoll commented 4 years ago

A few remarks:

  1. Kernel vs operator: We need to make a clearer distinction between the Fredholm kernel function K(t,r) and the linear Fredholm integral operator \hat{K} =\int_0^\infty d rK(t,r). In order to write V = K*P, K has to be a representation of the operator, not the kernel function. Hence, the function dipolarkernel actually returns the operator and not the kernel (unless r contains only one element), so the name is not entirely accurate. We need to be more careful in making this distinction.

  2. Riemann sum vs trapezoidal rule: So far, we have used an apparently simple Riemann sum for uniform increments, via K(i,j)==K(t_i,r_j)*dr. However, with this, K*P is not a Riemann sum over the specified range [min(r) max(r)], since it contains one more term than it should (either first or last, depending on whether you see it as left or right sum). The trapezoidal rule fixes this. One could also consider implementing Simpson's rule.

  3. K matrix columns: With moving from an (almost-)Riemann sum to more accurate integration, there are subtle changes in K. For uniformly spaced r, the amplitude of the first and the last column are now 1/2 of the other columns. That's as it should be to represent the integral operator, but then this

    t = linspace(0,2)
    r = 3:0.1:3.5;
    K = dipolarkernel(t,r);
    plot(t,K);

    looks unintuitive.

  4. K matrix condition number: The changed columns in K might affect the condition number and thereby the regularization problem. Using the trapezoidal rule increases the condition number in the above example. Compare:

    cond(K)
    cond(K.*[2 ones(1,numel(r)-2) 2])
  5. Regularization operators: With non-uniform increments in r, the regularization operator matrices (regoperator) also need to be adjusted. For example, for nonuniform spacing, the current second-order difference operator is not proportional to the second derivative. This is an issue that needs to be fixed.

luisfabib commented 4 years ago

Some comments:

  1. Kernel vs operator: Agreed. We need to be very clear about this distinction. The naming of the function might not be an issue since dipolarkernel could return the kernel matrix or the kernel integral operator.

  2. Riemann sum vs trapezoidal rule: Yes, the trapezoidal rule represents an important improvement w.r.t. to the Riemann sum. I am not sure the Simpson's rule would necessarily be an improvement. First, it is not compatible with non-uniformly incremented vectors and second its accuracy is already similar to the trapezoidal rule.

  3. K matrix condition number: I am aware that the condition number of the kernel integral operator can be larger than the one from the kernel matrix. The difference is, however, probaly negligible in comparison to the huge numbers the condition number typically adopts. Additionally, if one thinks about how non-uniform vectors would be used... One would typically define a non-uniform vector which has more point density where most distribution mass can be found. To get that same density with a uniform vector, much more points would be needed resulting in a significantly larger condition number compared to the non-uniform case.

  4. Regularization operators: the mathematical problem behind this has been solved in #91 and just the implementation is pending.

luisfabib commented 4 years ago

Closed with 3e9dfaff714b8dfc445c75a628e66fe13b344649