yewsg / yews

Yews | Your Earthquake Waveform Solution
Apache License 2.0

improvements for detecting over large arrays with memory constraints #8

Closed lyndonboone closed 4 years ago

lyndonboone commented 4 years ago

The chunks function now accepts an optional offset argument, which staggers the chunks so that each one begins before the end of the previous chunk. This makes it possible to chunk the array passed to the detect function without altering the sliding-window behaviour. I also changed chunks to slice over the last dimension of the input array: 1D arrays are handled the same as before, and 2D arrays are now accepted as well.
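For readers following along, here is a minimal sketch of what such a staggered chunking helper could look like. It is not the code from this PR: the signature `chunks(array, size, offset=0)` and the interpretation of offset as the number of overlapping samples between consecutive chunks are assumptions made for illustration.

```python
import numpy as np


def chunks(array, size, offset=0):
    """Yield overlapping slices of `array` along its last axis.

    Hypothetical sketch: each chunk is at most `size` samples long and
    starts `offset` samples before the end of the previous chunk.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= offset < size:
        raise ValueError("offset must satisfy 0 <= offset < size")

    step = size - offset          # distance between consecutive chunk starts
    n = array.shape[-1]           # slice over the last dimension only
    for start in range(0, n, step):
        # Ellipsis keeps all leading dimensions, so 1D and 2D inputs both work.
        yield array[..., start:start + size]
        if start + size >= n:     # this chunk already reaches the end
            break
```

With a 10-sample 1D array, `size=4`, and `offset=1`, this sketch yields three chunks covering samples 0 to 3, 3 to 6, and 6 to 9. A 2D array of shape `(channels, samples)` is sliced the same way along the samples axis.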

The detect function in utils/detection.py now takes an optional size_limit argument. If it is specified, detect uses the chunks function to split the waveform array into chunks of size size_limit, computes the probabilities for each chunk, appends them to a list, and concatenates everything at the end. The output is exactly the same whether or not size_limit is specified, but this should avoid computing the transforms (including to_tensor) on very large numpy arrays.
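The claim that chunking leaves the output unchanged can be illustrated with a self-contained sketch. Everything here is an assumption for illustration rather than the actual yews code: `score_windows` is a stand-in for the model (it returns the peak absolute amplitude per window), and the parameter names `window` and `stride` are hypothetical. The idea shown is that consecutive chunks must overlap by `window - stride` samples and the chunk length must hold a whole number of strides, so that no sliding window is lost or duplicated at a chunk boundary.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def score_windows(waveform, window, stride):
    """Stand-in for the model: peak absolute amplitude of each sliding window."""
    waveform = np.atleast_2d(waveform)
    if waveform.shape[-1] < window:
        return np.empty(0, dtype=float)   # too short to hold a single window
    # Shape (channels, n_windows, window) after applying the stride.
    views = sliding_window_view(waveform, window, axis=-1)[..., ::stride, :]
    return np.abs(views).max(axis=(0, -1)).astype(float)


def detect(waveform, window, stride, size_limit=None):
    """Score every sliding window, optionally processing the array in chunks."""
    if stride < 1 or stride > window:
        raise ValueError("stride must satisfy 1 <= stride <= window")
    waveform = np.atleast_2d(waveform)
    n = waveform.shape[-1]
    if size_limit is None or size_limit >= n:
        return score_windows(waveform, window, stride)
    if size_limit < window:
        raise ValueError("size_limit must be at least one window long")

    # Round the chunk length down so it holds a whole number of strides.
    size = window + (size_limit - window) // stride * stride
    # Overlap consecutive chunks by window - stride samples.
    step = size - (window - stride)

    outputs = []
    for start in range(0, n, step):
        chunk = waveform[..., start:start + size]
        outputs.append(score_windows(chunk, window, stride))
        if start + size >= n:             # last chunk reaches the end
            break
    return np.concatenate(outputs)
```

Under these assumptions, calling `detect(x, 50, 10)` and `detect(x, 50, 10, size_limit=237)` on the same array returns identical results, while the chunked call never passes more than 237 samples per channel to the scoring step at once.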

lijunzh commented 4 years ago

@lyndonboone Sorry for getting back to you so late. My spam filter somehow put the notification email into junk. I will review your work and get back to you as soon as possible.

lijunzh commented 4 years ago

@lyndonboone The logic looks correct to me. Thanks a lot for the improvement. Would you mind adding some unit tests to protect this behaviour against future changes? You can find some examples under tests/.

lijunzh commented 4 years ago

I will merge this PR first and start working on tests from a different PR.