jpjones76 / SeisIO.jl

Julia language support for geophysical time series data
http://seisio.readthedocs.org
Other
47 stars 21 forks source link

Extend demean! and detrend! functions to SeisChannel #10

Closed tclements closed 5 years ago

tclements commented 5 years ago

This PR allows the demean! and detrend! functions to operate on SeisChannels. This is relevant for ambient noise processing, in which we usually process one specific channel per thread for ease of parallelization.

tclements commented 5 years ago

Adding bandpass, bandstop, lowpass and highpass filters that will operate on AbstractArrays, SeisData and SeisChannel.

jpjones76 commented 5 years ago

Hi Tim,

I really like your proposed methods for detrend! and demean! on SeisChannel objects. Could you please submit them as a separate pull request? I can write tests for their code coverage very easily.

Unfortunately, I can't use the rest of your pull request at this time, so I'm going to close this for now. Some discussion is below.

detrend! and demean! for SeisData objects

The proposed changes would create a new SeisChannel for each channel in the SeisData object. This adds ~50% overhead to the computations with no benefit:

using BenchmarkTools
S = randSeisData(5, s=1.0)[2:5]
for i = 1:4; S.x[i] += ((0.01*rand()).*(1.0:1.0:Float64(length(S.x[i])))); end
T = deepcopy(S)

# old method 

@benchmark detrend_old!(S)
BenchmarkTools.Trial: 
  memory estimate:  103.11 MiB
  allocs estimate:  474
  --------------
  minimum time:     46.851 ms (8.52% GC)
  median time:      47.320 ms (8.52% GC)
  mean time:        48.707 ms (10.77% GC)
  maximum time:     109.818 ms (59.55% GC)
  --------------
  samples:          103
  evals/sample:     1

# proposed method

@benchmark detrend_new!(T)
BenchmarkTools.Trial: 
  memory estimate:  157.09 MiB
  allocs estimate:  476
  --------------
  minimum time:     67.762 ms (7.74% GC)
  median time:      71.447 ms (8.06% GC)
  mean time:        74.332 ms (10.34% GC)
  maximum time:     199.687 ms (64.52% GC)
  _--------------_
  samples:          68
  evals/sample:     1

filtering

I need data processing operations in SeisIO to be able to handle time gaps, so that they'll work on each data segment within each channel. These filters implicitly assume no time gaps. The latter forces users into a work flow of [download data] --> [fill time gaps] --> [filter], which is undesirable for some problems. I know that "per-segment" filtering adds a great deal of code complexity, and that's also why I'm slow to create new data processing routines, but adhering to that principle ultimately leads to much more robust code.

tclements commented 5 years ago

Great, I'll submit as separate PRs. For the filtering, I agree data gap checking would be nice. The demean, detrend and taper methods should be called before using these filters for proper signal processing.