swiss-seismological-service / scdetect

A computationally efficient earthquake detection module for SeisComP
https://scdetect.readthedocs.io
GNU Affero General Public License v3.0

Implement `MLx` integration tests #75

Closed: damb closed this issue 2 years ago

damb commented 2 years ago

Cover both the MLx amplitude and the MLx magnitude implementation via a basic set of integration tests.

Coverage must at least include:

Note that this task requires #74 to be merged.

damb commented 2 years ago

@mmesim, would you like to implement these tests? You can follow the approach for the MRelative tests.

If you have questions, let me know.

mmesim commented 2 years ago

I will post the files here.

I was trying to do

Single detector, single stream

and got this warning and no amplitudes

13:00:33 [warning] [detector-01::4e60605b-503d-44e3-9dbe-037ff2248d9f] [8D.RAW2..HH] failed to look up deconvolution configuration related bindings (channel code: "HHE") required for amplitude processor configuration (lookup failed (chaCode=HHE)); using fallback configuration, instead: "enabled: 1, responseTaperLength: 5.000000s, minimumResponseTaperFrequency: 0.008333Hz, maximumResponseTaperFrequency: 0.000000Hz"

What is this now?? :roll_eyes: :eyes: :eyes: :eyes:

02.MLx.singleTemplate.singleStream.zip

Also expected 2 detections but returned 1.

2019-11-05T04:00:37.625489Z  46.3248    7.3602    4.81 0.57 -- detected
2019-11-05T04:00:48.530489Z  46.3248    7.3602    4.81 0.58 -- missed

Perhaps this is similar to #87?

Note: I was able to get MRelative Amplitudes & Magnitudes. Not MLx though.

damb commented 2 years ago

> Also expected 2 detections but returned 1.

Yes. Most probably.

damb commented 2 years ago

> What is this now??

Recall that MLx requires both horizontal components to be available. However, only waveform data for 8D.RAW2..HHN is provided.
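
A quick way to catch this before running a test is to check the stream ids present in the waveform file. The following is only a minimal sketch: `missing_horizontals` is a hypothetical helper (not part of scdetect), and it assumes the horizontal components are named `E` and `N`, as with `HHE`/`HHN` in this thread.

```python
from collections import defaultdict


def missing_horizontals(stream_ids, required=("E", "N")):
    """Return {sensor_location_id: [missing component codes]}.

    Assumption: horizontal components are identified by the last
    character of the channel code ("E" and "N").
    """
    seen = defaultdict(set)
    for sid in stream_ids:
        net, sta, loc, cha = sid.split(".")
        # Sensor location id = stream id without the component code,
        # e.g. "8D.RAW2..HHN" -> "8D.RAW2..HH".
        seen[f"{net}.{sta}.{loc}.{cha[:-1]}"].add(cha[-1])
    return {
        sensor: [c for c in required if c not in comps]
        for sensor, comps in seen.items()
        if any(c not in comps for c in required)
    }


print(missing_horizontals(["8D.RAW2..HHN"]))
```

An empty result means every sensor location in the list has both horizontal components.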

mmesim commented 2 years ago

Same error even though I have both channels in the mseed 02.MLx.singleTemplate.singleStream.zip.

damb commented 2 years ago

> Same error even though I have both channels in the mseed 02.MLx.singleTemplate.singleStream.zip.

Do you provide the template waveform data (horizontal components) for the sensor location 8D.RAW2..HH (Sg phase) and the origin with the id smi:ch.ethz.sed/sc3a/origin/NLL.20191105125505.255283.1897990? I assume not.

mmesim commented 2 years ago

Nope, sorry. I let scdetect-cc create the template.

mmesim commented 2 years ago

@damb Question: does this (https://github.com/damb/scdetect/issues/75#issuecomment-1041446869) work for you if you include the template? I tried:

$ scdetect-cc \
    --templates-json path/to/templates.json \
    --inventory-db file:///absolute/path/to/inventory.scml \
    --event-db file:///absolute/path/to/catalog.scml \
    --record-url fdsnws://eida-federator.ethz.ch/fdsnws/dataselect/1/query \
    --offline \
    --templates-prepare

But still do not get amplitudes for this test.

damb commented 2 years ago

AFAIK data for 8D.RAW2.*.* is not publicly available:

$ curl -v -o - "http://eida-federator.ethz.ch/fdsnws/station/1/query?net=8D&sta=RAW2&format=text"
*   Trying 129.132.148.115:80...
* Connected to eida-federator.ethz.ch (129.132.148.115) port 80 (#0)
> GET /fdsnws/station/1/query?net=8D&sta=RAW2&format=text HTTP/1.1
> Host: eida-federator.ethz.ch
> User-Agent: curl/7.76.1
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 204 No Content
< Server: nginx/1.18.0 (Ubuntu)
< Date: Wed, 16 Feb 2022 16:01:07 GMT
< Content-Type: text/plain; charset=utf-8
< Connection: keep-alive
< 
* Connection #0 to host eida-federator.ethz.ch left intact

If you'd like to use fdsnws-dataselect you're forced to use arclink.ethz.ch from within the ETH network.
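
The `204 No Content` in the curl output above is the FDSN web service way of saying that nothing matched the request. As a rough sketch, the same check can be scripted. `station_query_url` and `describe_status` are hypothetical helper names, and no network request is made here; the status code has to come from whatever HTTP client you use.

```python
from urllib.parse import urlencode


def station_query_url(base, net, sta):
    """Build an fdsnws-station query URL like the one used with curl above."""
    params = {"net": net, "sta": sta, "format": "text"}
    return f"{base}/fdsnws/station/1/query?{urlencode(params)}"


def describe_status(code):
    """Map an FDSN web service HTTP status code to a short explanation."""
    if code == 200:
        return "data available"
    if code in (204, 404):
        # 204 is the default "no data" response; 404 is used instead
        # when the request is sent with nodata=404.
        return "no data matched the request"
    if code in (401, 403):
        return "authentication required or access denied"
    return f"unexpected status {code}"


print(station_query_url("http://eida-federator.ethz.ch", "8D", "RAW2"))
print(describe_status(204))
```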

damb commented 2 years ago

What is the log output if you run your command from above with the --debug flag enabled? You should see at least some warnings.

mmesim commented 2 years ago

Yes, yes. I forgot that RAW2 is not public.

mmesim commented 2 years ago

:heart:

mmesim commented 2 years ago

Ok, two things.

(1) I still do not get amplitudes even though I downloaded the templates. I confirm that the templates are real, plotted the waveforms.

(2) It seems that

> Also expected 2 detections but returned 1.

is fixed. I get 3 detections. I don't know if this was related to the template being a ghost!

damb commented 2 years ago

> (1) I still do not get amplitudes even though I downloaded the templates. I confirm that the templates are real, plotted the waveforms.

I'll look into this tomorrow.

damb commented 2 years ago

> (1) I still do not get amplitudes even though I downloaded the templates. I confirm that the templates are real, plotted the waveforms

Ok, I've got it: the cause is the filter configuration of the MLx amplitude processor. You configured a filter initialization time of 60s (which is also the default). This means that, for the detection, the required amplitude processor time window is:

However, the data chunk you provided does not include the data for this time window.

If you check the logs you should get something like:

17:18:41 [debug] [8D.RAW2..HHE] Removing time window processor: id=detector-01::7763d735-605a-428a-bc1d-7d5822077fc0, status=101, status_value=0.000000
17:18:41 [debug] Current time window processor count: 1
17:18:41 [debug] [8D.RAW2..HHN] Removing time window processor: id=detector-01::7763d735-605a-428a-bc1d-7d5822077fc0, status=101, status_value=0.000000
17:18:41 [debug] Current time window processor count: 0

A status of 101 implies that something went wrong. (I'm aware that this is not very intuitive; this should be improved.)
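
Until the logging is improved, lines like these can be picked out of a `--debug` log with a small script. This is only a sketch based on the log format shown above; `removed_processors` is a hypothetical helper, and nothing is assumed about status values other than 101.

```python
import re

# Matches the "Removing time window processor" debug lines shown above.
PATTERN = re.compile(
    r"\[(?P<stream>[^\]]+)\] Removing time window processor: "
    r"id=(?P<id>[^,]+), status=(?P<status>\d+), "
    r"status_value=(?P<value>[-\d.]+)"
)


def removed_processors(log_lines):
    """Return (stream id, processor id, status) for each removal line."""
    found = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            found.append((match["stream"], match["id"], int(match["status"])))
    return found


log = [
    "17:18:41 [debug] [8D.RAW2..HHE] Removing time window processor: "
    "id=detector-01::7763d735-605a-428a-bc1d-7d5822077fc0, status=101, "
    "status_value=0.000000",
    "17:18:41 [debug] Current time window processor count: 1",
]
for stream, proc_id, status in removed_processors(log):
    if status == 101:
        print(f"{stream}: processor {proc_id} was removed with status 101")
```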

To conclude: Provide a shorter filter initialization time (e.g. 10s). Then it should work.
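
To make the reasoning concrete, here is a rough sketch of the coverage check. It assumes, purely for illustration, that the processor needs the filter initialization time as extra data before the start of the amplitude window; the exact rule scdetect applies is not spelled out in this thread. The data chunk limits and the window length below are hypothetical values, and `covers` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def covers(data_start, data_end, window_start, window_end, init_time_s):
    """True if the data spans the window plus the filter warm-up before it.

    Assumption: the warm-up is simply prepended to the amplitude window.
    """
    required_start = window_start - timedelta(seconds=init_time_s)
    return data_start <= required_start and window_end <= data_end


# Hypothetical data chunk and amplitude window around the detection.
data_start = datetime(2019, 11, 5, 4, 0, 0)
data_end = datetime(2019, 11, 5, 4, 2, 0)
window_start = datetime(2019, 11, 5, 4, 0, 37)
window_end = window_start + timedelta(seconds=10)

for init_time_s in (60, 10):
    ok = covers(data_start, data_end, window_start, window_end, init_time_s)
    print(f"init time {init_time_s}s: data sufficient = {ok}")
```

With these values the 60s setting needs data from before the chunk starts, while 10s fits.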

mmesim commented 2 years ago

@damb Thanks!! It works. You can use that example for the testing suite.

Question: Is it ok if I upload two more examples?

Single detector, multi stream (referring to a single sensor location)

Single detector, multi stream (multiple sensor locations)

damb commented 2 years ago

> Question: Is it ok if I upload two more examples?

Of course.

mmesim commented 2 years ago

Single detector, single stream

02.MLx.singleTemplate.singleStream.zip

Single detector, multi stream (referring to a single sensor location)

03.MLx.singleTemplate.multiStream.zip

Single detector, multi stream (multiple sensor locations)

04.MLx.singleTemplate.multiStream.Loc.zip

These are successful tests. However, when I used a multi-template family configuration, it couldn't calculate magnitudes. I don't know why this happened. The same files worked for a different case (1hr data).

Anyways, let's not talk again about MLx. :D

damb commented 2 years ago

@mmesim, make sure you read the logs. Apparently, something went wrong during the template family initialization.

damb commented 2 years ago

Ok, with #92 you should see that template initialization failed due to missing waveform data.

damb commented 2 years ago

@mmesim, an additional note regarding the test configurations. Please make sure that the configurations contain only the data required; not more.

This implies:

This is because:

Also, please format the data properly. I'm fine if you'd like to use a compact format for JSON and XML data. This saves additional space.
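
For the JSON files, a compact form can be produced with Python's standard library alone. A minimal sketch follows; `compact_json` is a hypothetical helper and the keys in the example are placeholders, not the actual scdetect template schema.

```python
import json


def compact_json(text):
    """Re-serialize JSON text without any insignificant whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"))


# Placeholder document; the keys are illustrative only.
pretty = """
{
    "a": [1, 2],
    "b": {"c": "d"}
}
"""
print(compact_json(pretty))
```

Note that `json.dumps` escapes non-ASCII characters by default; pass `ensure_ascii=False` if they should be kept as-is.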

mmesim commented 2 years ago

> @mmesim, make sure you read the logs. Apparently, something went wrong during the template family initialization.

Yes, I saw the message.

> Ok, with https://github.com/damb/scdetect/pull/92 you should see that template initialization failed due to missing waveform data.

I do not understand why.

mmesim commented 2 years ago

Could you please do the cleaning? I'll cook!

damb commented 2 years ago

> I do not understand why.

Recall that once you configure template families with third-party origins, the waveform data needs to be available in order to create the regression samples. This is not the case here.

mmesim commented 2 years ago

Oh, I thought it would get the amplitudes from template.scml. :confused:

damb commented 2 years ago

Nope. Instead, it computes MLx amplitude regression samples. This way, it is guaranteed that all regression samples are computed the same way. We shouldn't rely on the catalog.

mmesim commented 2 years ago

> This way, it is guaranteed that all regression samples are computed the same way.

Great!!

damb commented 2 years ago

@mmesim, are you still going to fix the test data?

mmesim commented 2 years ago

I don't think so.