swiss-seismological-service / scdetect

A computationally efficient earthquake detection module for SeisComP
https://scdetect.readthedocs.io
GNU Affero General Public License v3.0
15 stars 6 forks source link

Multi-channel detections #62

Closed mmesim closed 2 years ago

mmesim commented 2 years ago

14.test.zip

The zip file includes the the files I used to run #14.test. The test uses one earthquake as template and the S-phase on the two horizontal components of the same station. It returns 6 detections. However, when the scdetect is applied on each channel separately I get 122, for the HHE, and 90 for the HHN. To my understanding the number of detections should be 90 or a little bit less than that.

Note: I did not use the attribute arrivalOffsetThreshold. It might be a good opportunity to explore that, i.e how should we tune the arrivalOffsetThreshold to get the right number of detections .

damb commented 2 years ago

Thanks for the report, @mmesim.

I'll have a look into this.

Note: I did not use the attribute arrivalOffsetThreshold. It might be a good opportunity to explore that, i.e how should we tune the arrivalOffsetThreshold to get the right number of detections .

Note that due to filtering small delays might have been introduced such that the original pick offsets might have slightly changed. However, this is just a wild guess which of course needs to be verified.

mmesim commented 2 years ago

Update: I re-ran the 14.test by adding "arrivalOffsetThreshold": 1. I assume that '1' is one second. I was able to get more detections (19), however the detection scheme failed to detect the template. This is very strange.

damb commented 2 years ago

Update: I re-ran the 14.test by adding "arrivalOffsetThreshold": 1. I assume that '1' is one second.

That's correct. From the README:

"arrivalOffsetThreshold": Maximum arrival offset in seconds (i.e. with regard to the template arrival) to tolerate when associating an arrival with an association. Note that the threshold is only relevant for a multi-stream detector setup.

damb commented 2 years ago

Ok, I had a look into it. Based on the data and configuration from above and the changes implemented by #72, I get the following results:

Configuration "arrivalOffsetThreshold" # detections
single stream: HHN, only - 113
single stream: HHE, only - 172
multi stream: HHN and HHE default 32
multi stream: HHN and HHE 0.0049 (slightly less than a sample) 32
multi stream: HHN and HHE 0.0051 (slightly more than a sample) 76
multi stream: HHN and HHE 0.0101 (slightly more than two samples) 91
multi stream: HHN and HHE 0.0151 (slightly more than three samples) 98
multi stream: HHN and HHE 0.0201 (slightly more than four samples) 102
multi stream: HHN and HHE >= 0.0251 (slightly more than five samples) 106

Overall, the results seem kind of reasonable. However, it is quite interesting that some detections are detected only when taking a small arrival offset (i.e. a few samples) into account (Note, that I checked the implementation for bugs, though, I couldn't find any issues.).

What do you think, @mmesim?

mmesim commented 2 years ago

It seems that works fine now! I'll test it after the holidays and take a closer look at the detections missed/ or possible false detections. Then, if everything works fine, we can close the issue. I'll also test the multistation detection (#66).

Thanks @damb!

damb commented 2 years ago

I'm going to merge #72. Let's keep this open until you've performed the final checks.

mmesim commented 2 years ago

@damb What dataset did you use for this https://github.com/damb/scdetect/issues/62#issuecomment-998854827 ?

damb commented 2 years ago

The one you provided when opening the issue.

mmesim commented 2 years ago

Well, I have an update.

Test-12 : HHE | Ndetections : 122 | cat12.txt [catalog with origin times & CC]

[{
  "detectorId": "detector-01",
  "createArrivals": true,
  "createTemplateArrivals": false,
  "gapInterpolation": true,
  "gapThreshold": 0.1,
  "gapTolerance": 1.5,
  "triggerDuration": 0.5,
  "triggerOnThreshold": 0.5,
  "triggerOffThreshold": 0.3,
  "originId": "smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990",
  "filter": "BW_BP(2,1.5,15)",
  "templateFilter": "BW_BP(2,1.5,15)",
  "streams": [
    {
      "templateId": "template-HHE",
      "initTime": 10,
      "templateWaveformStart": -0.5,
      "templateWaveformEnd": 2,
      "waveformId": "8D.RAW2..HHE",
      "templateWaveformId": "8D.RAW2..HHE",
      "templatePhase": "Sg"
   }
 ]
}]

Test-13 : HHN | Ndetections : 90 | cat13.txt [catalog with origin times & CC]

[{
  "detectorId": "detector-01",
  "createArrivals": true,
  "createTemplateArrivals": false,
  "gapInterpolation": true,
  "gapThreshold": 0.1,
  "gapTolerance": 1.5,
  "triggerDuration": 0.5,
  "triggerOnThreshold": 0.5,
  "triggerOffThreshold": 0.3,
  "originId": "smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990",
  "filter": "BW_BP(2,1.5,15)",
  "templateFilter": "BW_BP(2,1.5,15)",
  "streams": [
    {
      "templateId": "template-HHN",
      "initTime": 10,
      "templateWaveformStart": -0.5,
      "templateWaveformEnd": 2,
      "waveformId": "8D.RAW2..HHN",
      "templateWaveformId": "8D.RAW2..HHN",
      "templatePhase": "Sg"
   }
 ]
}]

Test-17b : HHN & HHE | Ndetections : 57 | results.txt [catalog with origin times & CC] cat12.txt cat13.txt results.txt

[{
  "detectorId": "detector-01",
  "createArrivals": true,
  "createTemplateArrivals": false,
  "gapInterpolation": true,
  "gapThreshold": 0.1,
  "gapTolerance": 1.5,
  "triggerDuration": 0.5,
  "triggerOnThreshold": 0.5,
  "triggerOffThreshold": 0.0,
  "originId": "smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990",
  "arrivalOffsetThreshold":1,
  "filter": "BW_BP(2,1.5,15)",
  "templateFilter": "BW_BP(2,1.5,15)",
  "streams": [
    {
      "templateId": "template-01",
      "initTime": 10,
      "templateWaveformStart": -0.5,
      "templateWaveformEnd": 2,
      "waveformId": "8D.RAW2..HHE",
      "templateWaveformId": "8D.RAW2..HHE",
      "templatePhase": "Sg"
   },
   {
      "templateId": "template-02",
      "initTime": 10,
      "templateWaveformStart": -0.5,
      "templateWaveformEnd": 2,
      "waveformId": "8D.RAW2..HHN",
      "templateWaveformId": "8D.RAW2..HHN",
      "templatePhase": "Sg"
   }
 ]
}]

The expected.results.txt is a catalog of the common origin times when the detections are performed on individual channel. expected.results.txt It seems that the "common events" are 58. However the origin times differ from the results.txt.

Here common.results.txt are the common events from the expected.results and the actual results.

Is this a problem of bookkeeping? Perhaps indexing?

Also, I'm not sure I can find so many detections as you did.

damb commented 2 years ago

Thanks for your feedback.

Is this a problem of bookkeeping? Perhaps indexing?

Currently, it is difficult to make any assumptions. I need to have a closer look into it. I'll let you know once I know more.

mmesim commented 2 years ago

I have another update. The numbers in https://github.com/damb/scdetect/issues/62#issuecomment-1030027759 are not correct but the problem is still the same. I'll write a new comment in a few minutes.

mmesim commented 2 years ago

Test-12 : HHE | Ndetections : 172 | cat12.txt [catalog with origin times & CC] [json the same as before] cat12.txt

Test-13 : HHN | Ndetections : 113 | cat13.txt [catalog with origin times & CC] [json the same as before] cat13.txt

Common events on both catalogs (somehow expected results): 81 detections expected.results.txt

Test-17b : HHN & HHE | Ndetections : 98 | results.txt [catalog with origin times & CC] [json the same as before] results.txt

Origin times do not much.

I also add the scml files. scml.files.zip

damb commented 2 years ago

@mmesim, so your single stream results agree with my results from https://github.com/damb/scdetect/issues/62#issuecomment-998854827, right? (at least regarding the number of detections).

The multi-stream detection is different, though. As I said already, this requires further investigation.

mmesim commented 2 years ago

Yes. They agree. I re-ran everything to be sure.

damb commented 2 years ago

@mmesim, how do you determine the expected results? Do you use a threshold (if yes, how large)?

When computing the strictly origin time base intersection of the single stream results, I obtain 34 events (recall, that you received 81 common events):

$ join HHN/detections.csv HHE/detections.csv
2019-11-05T04:01:11.880489 46.32480762 7.36023502 4.813476562 0.792408 46.32480762 7.36023502 4.813476562 0.905446
2019-11-05T04:02:00.405489 46.32480762 7.36023502 4.813476562 0.731168 46.32480762 7.36023502 4.813476562 0.852139
2019-11-05T04:06:35.875489 46.32480762 7.36023502 4.813476562 0.666347 46.32480762 7.36023502 4.813476562 0.791470
2019-11-05T04:07:39.020489 46.32480762 7.36023502 4.813476562 0.661300 46.32480762 7.36023502 4.813476562 0.881030
2019-11-05T04:07:52.960489 46.32480762 7.36023502 4.813476562 0.769363 46.32480762 7.36023502 4.813476562 0.867501
2019-11-05T04:09:18.295489 46.32480762 7.36023502 4.813476562 0.743638 46.32480762 7.36023502 4.813476562 0.833647
2019-11-05T04:10:00.660489 46.32480762 7.36023502 4.813476562 0.860478 46.32480762 7.36023502 4.813476562 0.947677
2019-11-05T04:10:42.285489 46.32480762 7.36023502 4.813476562 0.887233 46.32480762 7.36023502 4.813476562 0.927314
2019-11-05T04:11:31.965489 46.32480762 7.36023502 4.813476562 0.865259 46.32480762 7.36023502 4.813476562 0.916078
2019-11-05T04:12:08.450489 46.32480762 7.36023502 4.813476562 0.759779 46.32480762 7.36023502 4.813476562 0.922230
2019-11-05T04:12:34.560489 46.32480762 7.36023502 4.813476562 0.725790 46.32480762 7.36023502 4.813476562 0.834731
2019-11-05T04:22:10.750489 46.32480762 7.36023502 4.813476562 0.787805 46.32480762 7.36023502 4.813476562 0.845021
2019-11-05T04:22:31.425489 46.32480762 7.36023502 4.813476562 0.833067 46.32480762 7.36023502 4.813476562 0.948685
2019-11-05T04:23:29.035489 46.32480762 7.36023502 4.813476562 0.958145 46.32480762 7.36023502 4.813476562 0.966727
2019-11-05T04:23:47.640489 46.32480762 7.36023502 4.813476562 1.000000 46.32480762 7.36023502 4.813476562 1.000000
2019-11-05T04:24:10.415489 46.32480762 7.36023502 4.813476562 0.767975 46.32480762 7.36023502 4.813476562 0.843972
2019-11-05T04:25:11.530489 46.32480762 7.36023502 4.813476562 0.709522 46.32480762 7.36023502 4.813476562 0.832797
2019-11-05T04:25:41.780489 46.32480762 7.36023502 4.813476562 0.611660 46.32480762 7.36023502 4.813476562 0.776158
2019-11-05T04:27:30.130489 46.32480762 7.36023502 4.813476562 0.673746 46.32480762 7.36023502 4.813476562 0.715995
2019-11-05T04:28:09.030489 46.32480762 7.36023502 4.813476562 0.783611 46.32480762 7.36023502 4.813476562 0.933806
2019-11-05T04:29:32.880489 46.32480762 7.36023502 4.813476562 0.706178 46.32480762 7.36023502 4.813476562 0.807481
2019-11-05T04:30:43.940489 46.32480762 7.36023502 4.813476562 0.672202 46.32480762 7.36023502 4.813476562 0.894676
2019-11-05T04:31:30.415489 46.32480762 7.36023502 4.813476562 0.608395 46.32480762 7.36023502 4.813476562 0.831481
2019-11-05T04:33:17.385489 46.32480762 7.36023502 4.813476562 0.733866 46.32480762 7.36023502 4.813476562 0.819647
2019-11-05T04:36:24.710489 46.32480762 7.36023502 4.813476562 0.705958 46.32480762 7.36023502 4.813476562 0.872490
2019-11-05T04:37:27.165489 46.32480762 7.36023502 4.813476562 0.577804 46.32480762 7.36023502 4.813476562 0.792940
2019-11-05T04:40:13.870489 46.32480762 7.36023502 4.813476562 0.820884 46.32480762 7.36023502 4.813476562 0.913295
2019-11-05T04:41:29.070489 46.32480762 7.36023502 4.813476562 0.848254 46.32480762 7.36023502 4.813476562 0.885349
2019-11-05T04:43:55.615489 46.32480762 7.36023502 4.813476562 0.670620 46.32480762 7.36023502 4.813476562 0.741754
2019-11-05T04:45:32.535489 46.32480762 7.36023502 4.813476562 0.556103 46.32480762 7.36023502 4.813476562 0.702238
2019-11-05T04:45:57.395489 46.32480762 7.36023502 4.813476562 0.651538 46.32480762 7.36023502 4.813476562 0.690662
2019-11-05T04:50:58.145489 46.32480762 7.36023502 4.813476562 0.775016 46.32480762 7.36023502 4.813476562 0.911228
2019-11-05T04:53:09.545489 46.32480762 7.36023502 4.813476562 0.554357 46.32480762 7.36023502 4.813476562 0.825796
2019-11-05T04:59:13.770489 46.32480762 7.36023502 4.813476562 0.745623 46.32480762 7.36023502 4.813476562 0.883018
damb commented 2 years ago

When performing the test on both streams (i.e. HHN and HHE) the "arrivalOffsetThreshold" was set to 1 (second). This is comparably large, since the goal is to associate arrivals on a single sensor location. Instead, I would rather set the value to a few samples offset (given the data is sampled with 200Hz). Are you aware of this issue?

damb commented 2 years ago

Besides, I observed that the number of results mostly depends on the value set for the "triggerDuration" configuration parameter.

The implementation is rather clumsy: https://github.com/damb/scdetect/blob/52d2d780806a0b122fc08a3f6baf474a0f50804f/src/apps/cc/detector/detector.cpp#L315-L318

In order to make it independent from whether the data is processed in real-time-mode or whether in offline-mode the trigger endtime is compared to the endtime of the most recent record processed from the underlying so-called triggering template waveform processor. Due to the fact that data is fed record-wise it's resolution is basically bound to the

For sure, this is not ideal, though, I didn't come up with a better solution, yet. Do you have a better idea, @mmesim?

mmesim commented 2 years ago

@mmesim, how do you determine the expected results? Do you use a threshold (if yes, how large)?

To determine the "expected results" I compare the origin times of the two catalogs. I don't use a threshold, instead I compare origin times to one decimal place for msec, e.g. 2019-11-05T04:01:11.8. I use intersection command in python to compare the two lists.

mmesim commented 2 years ago

When performing the test on both streams (i.e. HHN and HHE) the "arrivalOffsetThreshold" was set to 1 (second). This is comparably large, since the goal is to associate arrivals on a single sensor location. Instead, I would rather set the value to a few samples offset (given the data is sampled with 200Hz). Are you aware of this issue?

Yes, I'm aware. It might be too large, however the problem is not anymore the number of detections. The number of detections has been improved after you fixed #71. The problem now is that the origin times, after joint detection, do not make any sense.

damb commented 2 years ago

I use intersection command in python to compare the two lists.

You mean https://docs.python.org/3/library/stdtypes.html#frozenset.intersection?

mmesim commented 2 years ago

Yes

damb commented 2 years ago

Ok, here an update (tests were performed based on #85):

Configuration "arrivalOffsetThreshold" # detections
single stream: HHN, only - 226
single stream: HHE, only - 350
multi stream: HHN and HHE default 62
multi stream: HHN and HHE 0.0049 (slightly less than a sample) 62
multi stream: HHN and HHE 0.0051 (slightly more than a sample) 151
multi stream: HHN and HHE 0.0101 (slightly more than two samples) 183
multi stream: HHN and HHE 0.0151 (slightly more than three samples) 197
multi stream: HHN and HHE 0.0201 (slightly more than four samples) 204
multi stream: HHN and HHE >= 0.0251 (slightly more than five samples) 211
damb commented 2 years ago

For sure, this is not ideal, though, I didn't come up with a better solution, yet. Do you have a better idea, @mmesim?

This should be solved with #85, as well.

mmesim commented 2 years ago

I confirm the number of detections. "Expected" results and results from multi-channel detections agree! It works!!!

Also, I checked the detections by plotting the waveforms. Everything is fine.

I'm in the process of checking #66.