eqcorrscan / EQcorrscan

Earthquake detection and analysis in Python.
https://eqcorrscan.readthedocs.io/en/latest/

filter_picks bug and example mistake, section 4.3.4 #450

Closed dr-glenn closed 3 years ago

dr-glenn commented 3 years ago

In the tutorial example in section 4.3.4 at https://eqcorrscan.readthedocs.io/en/latest/tutorials/template-creation.html, the function mktemplates should have the kwarg plot=True as the last arg, not the second; otherwise the suggested command line fails: python template_creation.py NCEDC 72572665. This is a trivial problem.

There appears to be a real bug in filter_picks, arising with the parameter evaluation_mode.

To Reproduce

When run with the NCEDC example, I see 10 traces displayed, but there are only 5 traces and each is displayed twice. The reason for this is the unspecified default parameter of filter_picks, evaluation_mode='all'. If the arg is specified as evaluation_mode='automatic', only one trace per pick appears. But if it is specified as evaluation_mode='manual', no traces are plotted.

calum-chamberlain commented 3 years ago

Thanks for that catch on the tutorial @dr-glenn - I will fix that and get it out asap.

Regarding the filter_picks issue you are running into: this appears to be working as expected because there are both manual and automatic picks in the Event. From the docs, filter_picks takes:

top_n_picks (int) – Filter only the top N most used station-channel pairs.

When both automatic and manual picks are present and evaluation_mode is set to all (the default, as you point out), all are retained. A template is then created using all the picks. The issue you ran into when changing the evaluation_mode and ending up with no picks for evaluation_mode="manual" appears to be because you overwrote the catalog with one containing only automatic picks, then tried to extract manual picks from that. If you run:

original = catalog.copy()
catalog_manual = filter_picks(original.copy(), top_n_picks=5, evaluation_mode="manual")
catalog_automatic = filter_picks(original.copy(), top_n_picks=5, evaluation_mode="automatic")

you should find that both events contain five picks.

I don't think this is a bug, but if you have any suggestions to better document this expected behaviour then I would be happy to incorporate them!
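The point about overwriting the catalog can be illustrated with a toy model. This is a minimal sketch in plain Python, not EQcorrscan's implementation: the `Pick` class, `filter_picks_sketch`, the station names and the alphabetical tie-break are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Pick:
    station: str
    channel: str
    evaluation_mode: str  # "manual" or "automatic"


def filter_picks_sketch(picks, top_n, evaluation_mode="all"):
    """Toy stand-in for filter_picks; NOT the library's code."""
    if evaluation_mode != "all":
        picks = [p for p in picks if p.evaluation_mode == evaluation_mode]
    counts = Counter((p.station, p.channel) for p in picks)
    # Assumption: rank pairs by pick count, break ties alphabetically.
    ranked = sorted(counts, key=lambda sc: (-counts[sc], sc))
    keep = set(ranked[:top_n])
    return [p for p in picks if (p.station, p.channel) in keep]


# Made-up picks: two stations carry both an automatic and a manual pick.
original = [
    Pick("STA1", "EHZ", "automatic"),
    Pick("STA1", "EHZ", "manual"),
    Pick("STA2", "EHZ", "automatic"),
    Pick("STA2", "EHZ", "manual"),
    Pick("STA3", "EHZ", "automatic"),
    Pick("STA4", "HNZ", "manual"),
]

# Filtering the untouched original each time gives picks in every mode.
all_picks = filter_picks_sketch(original, top_n=2, evaluation_mode="all")
auto_picks = filter_picks_sketch(original, top_n=2, evaluation_mode="automatic")
manual_picks = filter_picks_sketch(original, top_n=2, evaluation_mode="manual")

# Filtering an already-filtered (automatic-only) result for manual picks
# leaves nothing, which is the overwrite scenario described above.
manual_after_overwrite = filter_picks_sketch(
    auto_picks, top_n=2, evaluation_mode="manual")
```

In this toy data, the "all" mode keeps two picks for each retained station-channel pair (one automatic, one manual), which is consistent with each trace appearing twice in the plot.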

dr-glenn commented 3 years ago

No, there really does seem to be a bug. I'm seeing inexplicable behavior. I run with:

catalog = filter_picks(catalog, top_n_picks=5, evaluation_mode='all')

I see 2 identical traces for each of 5 stations: B039.EHZ, B040.EHZ, B045.EHZ, B046.EHZ, B047.EHZ.

Change that one line to:

catalog = filter_picks(catalog, top_n_picks=5, evaluation_mode='automatic')

I am not performing a filter from the previous filter; I'm just changing the code, saving and rerunning. I see 1 trace for each of 5 stations: AOH.EHZ, B039.EHZ, B040.EHZ, B045.EHZ, B046.EHZ.

Change evaluation_mode to 'manual' and get none.

Furthermore, something very strange happened when I made another simple change:

manual_catalog = filter_picks(catalog, top_n_picks=5, evaluation_mode='manual')

This causes an apparently infinite loop, continually writing this message:

2021-04-05 14:59:41,300 eqcorrscan.core.template_gen INFO Downloading for start-time: 2016-01-02T05:10:16.620000Z end-time: 2016-01-02T06:10:16.620000Z

That's so strange - I've just changed the assignment of the return value of filter_picks.

calum-chamberlain commented 3 years ago

Kia Ora,

If all that you have changed is the line:

catalog = filter_picks(catalog, top_n_picks=5)

to:

manual_catalog = filter_picks(catalog, top_n_picks=5, evaluation_mode='manual')

without also changing:

templates = template_gen.template_gen(
        method='from_client', catalog=catalog, client_id=network_code,
        lowcut=2.0, highcut=9.0, samp_rate=20.0, filt_order=4, length=3.0,
        prepick=0.15, swin='all', process_len=3600, plot=plot)

to:

templates = template_gen.template_gen(
        method='from_client', catalog=manual_catalog, client_id=network_code,
        lowcut=2.0, highcut=9.0, samp_rate=20.0, filt_order=4, length=3.0,
        prepick=0.15, swin='all', process_len=3600, plot=plot)

(note the change in variable name to cope with the change from catalog to manual_catalog)

then this will try to download data for all 281 picks. While this is not infinite, I imagine it would seem infinite until you reach pick 281. It would be worth checking that you also updated the argument in the call to template_gen. If you are just editing and re-running the code, you do not need to change the variable name catalog to manual_catalog.

dr-glenn commented 3 years ago

Oops, that was stupid of me. However, I changed the return from catalog to manual_catalog only after failing as I initially reported. So let's back up so I can understand what's happening here. One line in the original example:

catalog = filter_picks(catalog, top_n_picks=5, evaluation_mode='all')

Run, then change to 'automatic' and 'manual' and rerun each time. 'all' results in 10 traces, 2 copies for each of 5 stations. 'automatic' results in 5 traces and some different stations than the 'all' case. I think I understand these 2 results. But why does 'manual' return none?

calum-chamberlain commented 3 years ago

filter_picks does return five picks; however, the manual picks returned happen to be on network NP, stations 1584, 1023, 1582 and 1581. It looks like these stations might be triggered.

The output I get is:

2021-04-06 18:46:04,415 eqcorrscan.core.template_gen    INFO    Downloading data
2021-04-06 18:46:04,415 eqcorrscan.core.template_gen    INFO    Downloading for start-time: 2016-01-02T05:10:16.620000Z end-time: 2016-01-02T06:10:16.620000Z
2021-04-06 18:46:05,188 eqcorrscan.core.template_gen    INFO    Downloading for start-time: 2016-01-02T05:10:16.620000Z end-time: 2016-01-02T06:10:16.620000Z
2021-04-06 18:46:05,891 eqcorrscan.core.template_gen    INFO    Downloading for start-time: 2016-01-02T05:10:16.620000Z end-time: 2016-01-02T06:10:16.620000Z
2021-04-06 18:46:06,597 eqcorrscan.core.template_gen    INFO    Downloading for start-time: 2016-01-02T05:10:16.620000Z end-time: 2016-01-02T06:10:16.620000Z
2021-04-06 18:46:07,347 eqcorrscan.core.template_gen    INFO    Downloading for start-time: 2016-01-02T05:10:16.620000Z end-time: 2016-01-02T06:10:16.620000Z
2021-04-06 18:46:08,101 eqcorrscan.core.template_gen    WARNING Data for 1023.HNZ is 0.06290416666666666 hours long, which is less than 80 percent of the desired length, will not use
2021-04-06 18:46:08,101 eqcorrscan.core.template_gen    WARNING Data for 1581.HNZ is 0.062219444444444445 hours long, which is less than 80 percent of the desired length, will not use
2021-04-06 18:46:08,102 eqcorrscan.core.template_gen    WARNING Data for 1582.HNZ is 0.062325 hours long, which is less than 80 percent of the desired length, will not use
2021-04-06 18:46:08,102 eqcorrscan.core.template_gen    WARNING Data for 1584.HNN is 0.06320555555555556 hours long, which is less than 80 percent of the desired length, will not use
2021-04-06 18:46:08,102 eqcorrscan.core.template_gen    WARNING Data for 1584.HNZ is 0.06320555555555556 hours long, which is less than 80 percent of the desired length, will not use
2021-04-06 18:46:08,102 eqcorrscan.core.template_gen    INFO    Pre-processing data
2021-04-06 18:46:08,102 eqcorrscan.core.template_gen    INFO    No data

which suggests that filter_picks correctly returns these picks, but there is insufficient data available for them - perhaps these stations are triggered stations?
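The warnings in the log can be checked with simple arithmetic. The sketch below assumes the "desired length" in the warning is the process_len=3600 passed to template_gen earlier in this thread, and that the cut-off is 80 percent of it, as the warning text implies; the variable names are illustrative only.

```python
# Data lengths (hours) copied from the WARNING lines in the log above.
lengths_hours = {
    "1023.HNZ": 0.06290416666666666,
    "1581.HNZ": 0.062219444444444445,
    "1582.HNZ": 0.062325,
    "1584.HNN": 0.06320555555555556,
    "1584.HNZ": 0.06320555555555556,
}

process_len = 3600  # seconds, as passed to template_gen in this thread

# Assumption: a channel is used only if it covers at least 80 percent
# of process_len.
min_hours = 0.8 * process_len / 3600

usable = {chan: hours >= min_hours for chan, hours in lengths_hours.items()}
longest_seconds = max(lengths_hours.values()) * 3600

print(f"Required: {min_hours:.2f} h, longest available: {longest_seconds:.0f} s")
print(usable)
```

Under these assumptions every channel holds only a few minutes of data against a required 0.8 hours, so none is usable and template_gen reports "No data".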

So filter_picks seems to be working correctly. Its behaviour can be a little off when only one event is included in the catalog and top_n_picks is used, because every station and channel combination is likely to be picked only once.
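A rough illustration of that last point, using made-up (station, channel) lists rather than real catalog data: with a single event every pair is counted once, so a "top N" ranking has nothing to rank on, whereas across several events the counts differ.

```python
from collections import Counter

# Hypothetical (station, channel) pairs, one list per event.
single_event = [
    [("AOH", "EHZ"), ("B039", "EHZ"), ("B040", "EHZ")],
]
three_events = [
    [("AOH", "EHZ"), ("B039", "EHZ"), ("B040", "EHZ")],
    [("B039", "EHZ"), ("B040", "EHZ")],
    [("B040", "EHZ")],
]


def pair_counts(events):
    """Count how often each station-channel pair is picked across events."""
    return Counter(pair for event in events for pair in event)


single_counts = pair_counts(single_event)
multi_counts = pair_counts(three_events)

print(single_counts.most_common())  # every pair has the same count
print(multi_counts.most_common())   # counts now separate the pairs
```

With one event, which pairs survive a top-N cut depends entirely on how ties are broken, not on how well used each station is.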