Closed mmesim closed 2 years ago
Thanks for the report, @mmesim.
I'll have a look into this.
Note: I did not use the attribute arrivalOffsetThreshold. It might be a good opportunity to explore that, i.e how should we tune the arrivalOffsetThreshold to get the right number of detections .
Note that due to filtering small delays might have been introduced such that the original pick offsets might have slightly changed. However, this is just a wild guess which of course needs to be verified.
Update: I re-ran the 14.test by adding "arrivalOffsetThreshold": 1
. I assume that '1' is one second. I was able to get more detections (19), however the detection scheme failed to detect the template. This is very strange.
Update: I re-ran the 14.test by adding
"arrivalOffsetThreshold": 1
. I assume that '1' is one second.
That's correct. From the README
:
"arrivalOffsetThreshold"
: Maximum arrival offset in seconds (i.e. with regard to the template arrival) to tolerate when associating an arrival with an association. Note that the threshold is only relevant for a multi-stream detector setup.
Ok, I had a look into it. Based on the data and configuration from above and the changes implemented by #72, I get the following results:
Configuration | "arrivalOffsetThreshold" |
# detections |
---|---|---|
single stream: HHN , only |
- | 113 |
single stream: HHE , only |
- | 172 |
multi stream: HHN and HHE |
default |
32 |
multi stream: HHN and HHE |
0.0049 (slightly less than a sample) |
32 |
multi stream: HHN and HHE |
0.0051 (slightly more than a sample) |
76 |
multi stream: HHN and HHE |
0.0101 (slightly more than two samples) |
91 |
multi stream: HHN and HHE |
0.0151 (slightly more than three samples) |
98 |
multi stream: HHN and HHE |
0.0201 (slightly more than four samples) |
102 |
multi stream: HHN and HHE |
>= 0.0251 (slightly more than five samples) |
106 |
200Hz
i.e. a sampling interval of 0.005s
."arrivalOffsetThreshold"
> 6 samples, I didn't obtain more than 106 detections. Apparently, the difference from HHN
, only (i.e. 113) and 106 (i.e. for multi-stream) may be related to false positives on HHN
.Overall, the results seem kind of reasonable. However, it is quite interesting that some detections are detected only when taking a small arrival offset (i.e. a few samples) into account (Note, that I checked the implementation for bugs, though, I couldn't find any issues.).
What do you think, @mmesim?
It seems that works fine now! I'll test it after the holidays and take a closer look at the detections missed/ or possible false detections. Then, if everything works fine, we can close the issue. I'll also test the multistation detection (#66).
Thanks @damb!
I'm going to merge #72. Let's keep this open until you've performed the final checks.
@damb What dataset did you use for this https://github.com/damb/scdetect/issues/62#issuecomment-998854827 ?
The one you provided when opening the issue.
Well, I have an update.
Test-12 : HHE | Ndetections : 122 | cat12.txt [catalog with origin times & CC]
[{
"detectorId": "detector-01",
"createArrivals": true,
"createTemplateArrivals": false,
"gapInterpolation": true,
"gapThreshold": 0.1,
"gapTolerance": 1.5,
"triggerDuration": 0.5,
"triggerOnThreshold": 0.5,
"triggerOffThreshold": 0.3,
"originId": "smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990",
"filter": "BW_BP(2,1.5,15)",
"templateFilter": "BW_BP(2,1.5,15)",
"streams": [
{
"templateId": "template-HHE",
"initTime": 10,
"templateWaveformStart": -0.5,
"templateWaveformEnd": 2,
"waveformId": "8D.RAW2..HHE",
"templateWaveformId": "8D.RAW2..HHE",
"templatePhase": "Sg"
}
]
}]
Test-13 : HHN | Ndetections : 90 | cat13.txt [catalog with origin times & CC]
[{
"detectorId": "detector-01",
"createArrivals": true,
"createTemplateArrivals": false,
"gapInterpolation": true,
"gapThreshold": 0.1,
"gapTolerance": 1.5,
"triggerDuration": 0.5,
"triggerOnThreshold": 0.5,
"triggerOffThreshold": 0.3,
"originId": "smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990",
"filter": "BW_BP(2,1.5,15)",
"templateFilter": "BW_BP(2,1.5,15)",
"streams": [
{
"templateId": "template-HHN",
"initTime": 10,
"templateWaveformStart": -0.5,
"templateWaveformEnd": 2,
"waveformId": "8D.RAW2..HHN",
"templateWaveformId": "8D.RAW2..HHN",
"templatePhase": "Sg"
}
]
}]
Test-17b : HHN & HHE | Ndetections : 57 | results.txt [catalog with origin times & CC] cat12.txt cat13.txt results.txt
[{
"detectorId": "detector-01",
"createArrivals": true,
"createTemplateArrivals": false,
"gapInterpolation": true,
"gapThreshold": 0.1,
"gapTolerance": 1.5,
"triggerDuration": 0.5,
"triggerOnThreshold": 0.5,
"triggerOffThreshold": 0.0,
"originId": "smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990",
"arrivalOffsetThreshold":1,
"filter": "BW_BP(2,1.5,15)",
"templateFilter": "BW_BP(2,1.5,15)",
"streams": [
{
"templateId": "template-01",
"initTime": 10,
"templateWaveformStart": -0.5,
"templateWaveformEnd": 2,
"waveformId": "8D.RAW2..HHE",
"templateWaveformId": "8D.RAW2..HHE",
"templatePhase": "Sg"
},
{
"templateId": "template-02",
"initTime": 10,
"templateWaveformStart": -0.5,
"templateWaveformEnd": 2,
"waveformId": "8D.RAW2..HHN",
"templateWaveformId": "8D.RAW2..HHN",
"templatePhase": "Sg"
}
]
}]
The expected.results.txt is a catalog of the common origin times when the detections are performed on individual channel. expected.results.txt It seems that the "common events" are 58. However the origin times differ from the results.txt.
Here common.results.txt are the common events from the expected.results and the actual results.
Is this a problem of bookkeeping? Perhaps indexing?
Also, I'm not sure I can find so many detections as you did.
Thanks for your feedback.
Is this a problem of bookkeeping? Perhaps indexing?
Currently, it is difficult to make any assumptions. I need to have a closer look into it. I'll let you know once I know more.
I have another update. The numbers in https://github.com/damb/scdetect/issues/62#issuecomment-1030027759 are not correct but the problem is still the same. I'll write a new comment in a few minutes.
Test-12 : HHE | Ndetections : 172 | cat12.txt [catalog with origin times & CC] [json the same as before] cat12.txt
Test-13 : HHN | Ndetections : 113 | cat13.txt [catalog with origin times & CC] [json the same as before] cat13.txt
Common events on both catalogs (somehow expected results): 81 detections expected.results.txt
Test-17b : HHN & HHE | Ndetections : 98 | results.txt [catalog with origin times & CC] [json the same as before] results.txt
Origin times do not much.
I also add the scml files. scml.files.zip
@mmesim, so your single stream results agree with my results from https://github.com/damb/scdetect/issues/62#issuecomment-998854827, right? (at least regarding the number of detections).
The multi-stream detection is different, though. As I said already, this requires further investigation.
Yes. They agree. I re-ran everything to be sure.
@mmesim, how do you determine the expected results? Do you use a threshold (if yes, how large)?
When computing the strictly origin time base intersection of the single stream results, I obtain 34 events (recall, that you received 81 common events):
$ join HHN/detections.csv HHE/detections.csv
2019-11-05T04:01:11.880489 46.32480762 7.36023502 4.813476562 0.792408 46.32480762 7.36023502 4.813476562 0.905446
2019-11-05T04:02:00.405489 46.32480762 7.36023502 4.813476562 0.731168 46.32480762 7.36023502 4.813476562 0.852139
2019-11-05T04:06:35.875489 46.32480762 7.36023502 4.813476562 0.666347 46.32480762 7.36023502 4.813476562 0.791470
2019-11-05T04:07:39.020489 46.32480762 7.36023502 4.813476562 0.661300 46.32480762 7.36023502 4.813476562 0.881030
2019-11-05T04:07:52.960489 46.32480762 7.36023502 4.813476562 0.769363 46.32480762 7.36023502 4.813476562 0.867501
2019-11-05T04:09:18.295489 46.32480762 7.36023502 4.813476562 0.743638 46.32480762 7.36023502 4.813476562 0.833647
2019-11-05T04:10:00.660489 46.32480762 7.36023502 4.813476562 0.860478 46.32480762 7.36023502 4.813476562 0.947677
2019-11-05T04:10:42.285489 46.32480762 7.36023502 4.813476562 0.887233 46.32480762 7.36023502 4.813476562 0.927314
2019-11-05T04:11:31.965489 46.32480762 7.36023502 4.813476562 0.865259 46.32480762 7.36023502 4.813476562 0.916078
2019-11-05T04:12:08.450489 46.32480762 7.36023502 4.813476562 0.759779 46.32480762 7.36023502 4.813476562 0.922230
2019-11-05T04:12:34.560489 46.32480762 7.36023502 4.813476562 0.725790 46.32480762 7.36023502 4.813476562 0.834731
2019-11-05T04:22:10.750489 46.32480762 7.36023502 4.813476562 0.787805 46.32480762 7.36023502 4.813476562 0.845021
2019-11-05T04:22:31.425489 46.32480762 7.36023502 4.813476562 0.833067 46.32480762 7.36023502 4.813476562 0.948685
2019-11-05T04:23:29.035489 46.32480762 7.36023502 4.813476562 0.958145 46.32480762 7.36023502 4.813476562 0.966727
2019-11-05T04:23:47.640489 46.32480762 7.36023502 4.813476562 1.000000 46.32480762 7.36023502 4.813476562 1.000000
2019-11-05T04:24:10.415489 46.32480762 7.36023502 4.813476562 0.767975 46.32480762 7.36023502 4.813476562 0.843972
2019-11-05T04:25:11.530489 46.32480762 7.36023502 4.813476562 0.709522 46.32480762 7.36023502 4.813476562 0.832797
2019-11-05T04:25:41.780489 46.32480762 7.36023502 4.813476562 0.611660 46.32480762 7.36023502 4.813476562 0.776158
2019-11-05T04:27:30.130489 46.32480762 7.36023502 4.813476562 0.673746 46.32480762 7.36023502 4.813476562 0.715995
2019-11-05T04:28:09.030489 46.32480762 7.36023502 4.813476562 0.783611 46.32480762 7.36023502 4.813476562 0.933806
2019-11-05T04:29:32.880489 46.32480762 7.36023502 4.813476562 0.706178 46.32480762 7.36023502 4.813476562 0.807481
2019-11-05T04:30:43.940489 46.32480762 7.36023502 4.813476562 0.672202 46.32480762 7.36023502 4.813476562 0.894676
2019-11-05T04:31:30.415489 46.32480762 7.36023502 4.813476562 0.608395 46.32480762 7.36023502 4.813476562 0.831481
2019-11-05T04:33:17.385489 46.32480762 7.36023502 4.813476562 0.733866 46.32480762 7.36023502 4.813476562 0.819647
2019-11-05T04:36:24.710489 46.32480762 7.36023502 4.813476562 0.705958 46.32480762 7.36023502 4.813476562 0.872490
2019-11-05T04:37:27.165489 46.32480762 7.36023502 4.813476562 0.577804 46.32480762 7.36023502 4.813476562 0.792940
2019-11-05T04:40:13.870489 46.32480762 7.36023502 4.813476562 0.820884 46.32480762 7.36023502 4.813476562 0.913295
2019-11-05T04:41:29.070489 46.32480762 7.36023502 4.813476562 0.848254 46.32480762 7.36023502 4.813476562 0.885349
2019-11-05T04:43:55.615489 46.32480762 7.36023502 4.813476562 0.670620 46.32480762 7.36023502 4.813476562 0.741754
2019-11-05T04:45:32.535489 46.32480762 7.36023502 4.813476562 0.556103 46.32480762 7.36023502 4.813476562 0.702238
2019-11-05T04:45:57.395489 46.32480762 7.36023502 4.813476562 0.651538 46.32480762 7.36023502 4.813476562 0.690662
2019-11-05T04:50:58.145489 46.32480762 7.36023502 4.813476562 0.775016 46.32480762 7.36023502 4.813476562 0.911228
2019-11-05T04:53:09.545489 46.32480762 7.36023502 4.813476562 0.554357 46.32480762 7.36023502 4.813476562 0.825796
2019-11-05T04:59:13.770489 46.32480762 7.36023502 4.813476562 0.745623 46.32480762 7.36023502 4.813476562 0.883018
When performing the test on both streams (i.e. HHN
and HHE
) the "arrivalOffsetThreshold"
was set to 1
(second). This is comparably large, since the goal is to associate arrivals on a single sensor location. Instead, I would rather set the value to a few samples offset (given the data is sampled with 200Hz
). Are you aware of this issue?
Besides, I observed that the number of results mostly depends on the value set for the "triggerDuration"
configuration parameter.
The implementation is rather clumsy: https://github.com/damb/scdetect/blob/52d2d780806a0b122fc08a3f6baf474a0f50804f/src/apps/cc/detector/detector.cpp#L315-L318
In order to make it independent from whether the data is processed in real-time-mode or whether in offline-mode the trigger endtime is compared to the endtime of the most recent record processed from the underlying so-called triggering template waveform processor. Due to the fact that data is fed record-wise it's resolution is basically bound to the
For sure, this is not ideal, though, I didn't come up with a better solution, yet. Do you have a better idea, @mmesim?
@mmesim, how do you determine the expected results? Do you use a threshold (if yes, how large)?
To determine the "expected results" I compare the origin times of the two catalogs. I don't use a threshold, instead I compare origin times to one decimal place for msec, e.g. 2019-11-05T04:01:11.8. I use intersection
command in python to compare the two lists.
When performing the test on both streams (i.e.
HHN
andHHE
) the"arrivalOffsetThreshold"
was set to1
(second). This is comparably large, since the goal is to associate arrivals on a single sensor location. Instead, I would rather set the value to a few samples offset (given the data is sampled with200Hz
). Are you aware of this issue?
Yes, I'm aware. It might be too large, however the problem is not anymore the number of detections. The number of detections has been improved after you fixed #71. The problem now is that the origin times, after joint detection, do not make any sense.
I use
intersection
command in python to compare the two lists.
You mean https://docs.python.org/3/library/stdtypes.html#frozenset.intersection?
Yes
Ok, here an update (tests were performed based on #85):
Configuration | "arrivalOffsetThreshold" |
# detections |
---|---|---|
single stream: HHN , only |
- | 226 |
single stream: HHE , only |
- | 350 |
multi stream: HHN and HHE |
default |
62 |
multi stream: HHN and HHE |
0.0049 (slightly less than a sample) |
62 |
multi stream: HHN and HHE |
0.0051 (slightly more than a sample) |
151 |
multi stream: HHN and HHE |
0.0101 (slightly more than two samples) |
183 |
multi stream: HHN and HHE |
0.0151 (slightly more than three samples) |
197 |
multi stream: HHN and HHE |
0.0201 (slightly more than four samples) |
204 |
multi stream: HHN and HHE |
>= 0.0251 (slightly more than five samples) |
211 |
For sure, this is not ideal, though, I didn't come up with a better solution, yet. Do you have a better idea, @mmesim?
This should be solved with #85, as well.
I confirm the number of detections. "Expected" results and results from multi-channel detections agree! It works!!!
Also, I checked the detections by plotting the waveforms. Everything is fine.
I'm in the process of checking #66.
14.test.zip
The zip file includes the the files I used to run #14.test. The test uses one earthquake as template and the S-phase on the two horizontal components of the same station. It returns 6 detections. However, when the
scdetect
is applied on each channel separately I get 122, for the HHE, and 90 for the HHN. To my understanding the number of detections should be 90 or a little bit less than that.Note: I did not use the attribute
arrivalOffsetThreshold
. It might be a good opportunity to explore that, i.e how should we tune thearrivalOffsetThreshold
to get the right number of detections .