LaurentRDC / scikit-ued

Collection of algorithms and routines for (ultrafast) electron diffraction and scattering
http://scikit-ued.readthedocs.io
GNU General Public License v3.0

baseline_dt doesn't work well for data with high spot and noisy background #38

Closed zxdawn closed 2 years ago

zxdawn commented 2 years ago

Version / Platform Info

Expected Behavior

Remove the noisy background and subtract it from the three high spots.

Something like this, but with only the three high spots remaining:

import numpy as np
import matplotlib.pyplot as plt

data = np.load('test_baseline_dt.npy')
data_expected = np.where(data<1e-5, 0, data)
background_at_high_spot = 0.5e-5

fig, ax = plt.subplots(figsize=(12, 6))

m = ax.imshow(np.where(data_expected>0, data_expected-background_at_high_spot, data_expected))
plt.colorbar(m)

output

Actual Behavior

Only quite low values are detected as the baseline.

Minimal Example of Issue


import numpy as np
from skued import baseline_dt
import matplotlib.pyplot as plt

data = np.load('test_baseline_dt.npy')
# replace NaNs with a very large negative value
baseline = baseline_dt(np.nan_to_num(data, nan=-1e10), wavelet='qshift3', level=10, max_iter=150)

# plot
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 6))

m1 = ax1.imshow(data)
m2 = ax2.imshow(baseline, vmin=-5e-6)
m3 = ax3.imshow(data-baseline)

plt.colorbar(m1, ax=ax1, shrink=0.5)
plt.colorbar(m2, ax=ax2, shrink=0.5, extend='min')
plt.colorbar(m3, ax=ax3, shrink=0.5)

plt.tight_layout()

output

Data

test_baseline_dt.zip

LaurentRDC commented 2 years ago

Hey there,

There are many reasons why the performance may not be as good as you expected. Off the top of my head:

  1. The regions with NaNs might cause problems. I would imagine that if you replace the NaNs with a very large negative number, the extreme difference between the valid regions and the invalid regions will be hard to decompose into wavelets. Have you tried interpolating the values there before removing the background?
  2. Have you tried different wavelets, other than qshift3?
  3. Have you tried other decomposition levels? This will have a very large impact on your results. Bigger isn't always better either, so try to play around.
  4. Finally, have you tried removing the background using baseline_dwt instead? If it performs better with the discrete wavelet transform, that might be an indication that something is going wrong.

Let me know, Laurent
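For reference, the interpolation suggested in point 1 could look roughly like the sketch below. It is not part of scikit-ued: `fill_nans_rowwise` is a hypothetical helper that fills NaNs by linear interpolation along each row using only NumPy. Gaps at the edge of a row are filled with the nearest valid value, and rows that contain no valid pixels are left as NaN.

```python
import numpy as np

def fill_nans_rowwise(image):
    """Return a copy of a 2D array with NaNs filled by linear interpolation along each row.

    Hypothetical helper for illustration; not a scikit-ued function.
    """
    filled = np.array(image, dtype=float, copy=True)
    columns = np.arange(filled.shape[1])
    for row in filled:  # each `row` is a view, so assigning into it edits `filled`
        valid = np.isfinite(row)
        if valid.all() or not valid.any():
            continue  # nothing to fill, or nothing to interpolate from
        # np.interp holds the edge values constant outside the valid range
        row[~valid] = np.interp(columns[~valid], columns[valid], row[valid])
    return filled

demo = np.array([[1.0, np.nan, 3.0],
                 [np.nan, 2.0, np.nan]])
print(fill_nans_rowwise(demo))
```

The filled array can then be passed to baseline_dt in place of the `np.nan_to_num(data, nan=-1e10)` call from the minimal example. A true 2D interpolation may suit large cropped regions better than this row-wise version.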

zxdawn commented 2 years ago

Hi @LaurentRDC,

Thanks for your suggestions! Here's the new test_data.zip.

  1. There are NaN values because I cropped the data using some shapes. Anyway, I tried the original data with interpolation to fill the NaN values. The result is similar (column 1: data; column 2: baseline; column 3: difference).

image

  2. The results are the same for all available wavelets.
import numpy as np
import skued
from skued import baseline_dt, baseline_dwt
import matplotlib.pyplot as plt

def plot_data_wavelet(data, wavelet, level, title, method='baseline_dt'):
    # compute the baseline with the requested routine
    if method == 'baseline_dt':
        baseline = baseline_dt(data, wavelet=wavelet, level=level, max_iter=150)
    elif method == 'baseline_dwt':
        baseline = baseline_dwt(data, wavelet=wavelet, level=level, max_iter=150)
    else:
        raise ValueError(f'unknown method: {method}')

    # plot
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 6))

    m1 = ax1.imshow(data, vmin=-1e-5, vmax=4e-5)
    m2 = ax2.imshow(baseline, vmin=-5e-6)
    m3 = ax3.imshow(data-baseline, vmin=-1e-5, vmax=4e-5)

    plt.colorbar(m1, ax=ax1, shrink=0.5)
    plt.colorbar(m2, ax=ax2, shrink=0.5, extend='min')
    plt.colorbar(m3, ax=ax3, shrink=0.5)

    plt.suptitle(title)
    plt.tight_layout()

data = np.load('test_baseline_dt.npy')

# test wavelet
for wavelet in skued.available_dt_filters():
    level = 10
    plot_data_wavelet(data, wavelet, level=level, title=f'wavelet={wavelet} with level={level}')

image

  3. Yes, the level affects the results a lot. However, I can't get the result I want by changing it.

image image image image

  4. Here are the results of baseline_dwt with different wavelets. It seems baseline_dwt is not suitable.

image image image

zxdawn commented 2 years ago

BTW, how about a Fourier transform? Would that work better for this case? I can try it if so.

LaurentRDC commented 2 years ago

I'm sorry, I can't help you further. It may be that your data cannot be decomposed well with the dual-tree wavelets that are included in scikit-ued. You could try designing your own wavelet based on the references in the scikit-ued documentation.

As for the Fourier transform: the best thing you can do is try! In my use case, it didn't work very well, but it might work for you.
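A minimal sketch of what trying the Fourier approach could look like, assuming the background varies slowly across the image and the array contains no NaNs. This is a plain low-pass filter built on `numpy.fft`, not a scikit-ued routine; the function name and the `cutoff` parameter (in cycles per pixel) are hypothetical and would need tuning.

```python
import numpy as np

def fourier_lowpass_background(image, cutoff=0.05):
    """Estimate a smooth background by keeping only low spatial frequencies.

    Hypothetical helper for illustration; `cutoff` is in cycles per pixel.
    """
    image = np.asarray(image, dtype=float)
    # spatial frequency of every FFT bin along each axis
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    keep = np.hypot(fy, fx) <= cutoff  # circular low-pass mask
    spectrum = np.fft.fft2(image)
    return np.fft.ifft2(spectrum * keep).real

# usage: a flat background of 1.0 plus pixel-scale noise-like structure
yy, xx = np.indices((8, 8))
checker = np.where((yy + xx) % 2 == 0, 1.0, -1.0)
background = fourier_lowpass_background(1.0 + checker)
```

One caveat with this kind of filter: bright spots also contribute to the low frequencies, so the estimated background tends to be pulled upward near them.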

zxdawn commented 2 years ago

Thanks for your help! Because my goal is to subtract the background from the high spots, I'm also trying to fit a lognorm distribution and adjust alpha to get the background value ;)
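A rough sketch of that idea, under assumptions the thread does not spell out: the background pixels are treated as log-normally distributed, the fit uses the mean and standard deviation of the log of the positive pixel values, and `alpha` is read as the quantile of the fitted distribution taken as the background level. The function name is hypothetical, and only NumPy and the standard library are used.

```python
import numpy as np
from statistics import NormalDist

def lognormal_background_level(image, alpha=0.95):
    """Return the alpha-quantile of a log-normal distribution fitted to the positive pixels.

    Hypothetical helper; interpreting `alpha` as a quantile is an assumption.
    """
    values = np.asarray(image, dtype=float).ravel()
    values = values[np.isfinite(values) & (values > 0)]  # log needs positive, finite input
    if values.size < 2:
        raise ValueError('need at least two positive, finite pixels')
    logs = np.log(values)
    mu, sigma = logs.mean(), logs.std()  # parameters of the underlying normal
    z = NormalDist().inv_cdf(alpha)      # standard-normal quantile for alpha
    return float(np.exp(mu + sigma * z))

# usage on synthetic data: noisy background with one bright pixel
rng = np.random.default_rng(0)
image = rng.lognormal(mean=-12.0, sigma=0.5, size=(64, 64))
image[10, 10] = 4e-5
level = lognormal_background_level(image, alpha=0.99)
cleaned = np.where(image > level, image - level, 0.0)
```

Because the bright spots are included in the fit, they bias the estimate slightly; excluding the brightest pixels before fitting is one possible refinement.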