mikeblazanin / gcplyr

gcplyr is an R package that facilitates wrangling and analysis of microbial growth curve data
https://mikeblazanin.github.io/gcplyr/
Other
30 stars 2 forks source link

Fourier for smoothing? #97

Closed mikeblazanin closed 1 year ago

mikeblazanin commented 1 year ago

Pointers from Dave:

A low-pass filter is one technique you can use (often used in audio filtering to remove high-frequency noise). The basic idea is that you convert your data into the frequency spectrum using a fourier transform, then apply some sort of convolution filter (the simplest is that you multiply low frequency components by 1, and above some threshold frequency, by zero-- but you can also get into more complex versions) and then take the inverse fourier transform. What you get back is a smoothed dataset, now missing all the high frequency components.

Take a look at:

https://en.wikipedia.org/wiki/Low-pass_filter https://terpconnect.umd.edu/~toh/spectrum/FourierFilter.html https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/fft

The only tricky part is understanding what is produced in the first FFT. Different programs do this differently, but the 'usual' mathematical form is a symmetric function with frequencies that go from -n/2 to +n/2 where n is the number of data points you had in your time series (more precisely here it is the time span of your measured time series). Take a look at these and we can chat if you have questions. The pseudocode would be something like:

x is your time-series (just the values) z=FFT(x) where z will now be complex numbers (real and imaginary parts from which you can calculate a magnitude and phase - complex modulus and argument -- which tells you how important each frequency is in the time-series.
filter = some function that is composed of ones and zeros, transitioning at the correct frequency (this will need to be symmetric if the FFT gives a symmetric output) y=z*filter. apply the filter w=FFT(y, inverse->true) now back-transforms the data into a time-series, with the frequency filter applied.

mikeblazanin commented 1 year ago

Played around with Fourier to smooth. It doesn't work well at all for growth curve data, because even though the noise is high frequency, the signal isn't very oscillatory.

Example code:

dropvals <- function(x, n) {
  #n is n to drop from middle
  if(n > 0) {
    if(length(x)%%2 == 0 & n%%2 != 0) {stop("length(x) is even, n must be even")}
    if(length(x)%%2 == 1 & n%%2 != 1) {stop("length(x) is odd, n must be odd")}
    keepn <- (length(x)-n)/2
    x[(keepn+1):(length(x)-keepn)] <- 0
  }
  return(x)
}
invft <- function(x) {return(fft(x, inverse = TRUE)/length(x))}

y <- example_widedata$H8
ft <- fft(y)
plot(1:length(ft), ft)
inv0 <- invft(dropvals(ft, 0)) #drop no values, should re-create orig seq
inv1 <- invft(dropvals(ft, 1))
inv5 <- invft(dropvals(ft, 5))
inv11 <- invft(dropvals(ft, 11))
inv21 <- invft(dropvals(ft, 21))
inv81 <- invft(dropvals(ft, 81))

plot(1:length(y), y)
lines(1:length(inv0), inv0, lwd = 1.5)
lines(1:length(inv0), inv1, col = "gray10", lwd = 1.5)
lines(1:length(inv0), inv5, col = "gray30", lwd = 1.5)
lines(1:length(inv0), inv11, col = "gray50", lwd = 1.5)
lines(1:length(inv0), inv21, col = "gray70", lwd = 1.5)
lines(1:length(inv0), inv81, col = "gray80", lwd = 1.5)