GeoNet / help

An issues repo for technical help questions.

Incomplete waveform data returned from obspy FDSN bulk query #120

Open calum-chamberlain opened 7 months ago

calum-chamberlain commented 7 months ago

Kia ora team, not totally sure that this is a GeoNet issue rather than an obspy one, and I would appreciate an expert eye over it.

I am trying to get data from a few stations through the FDSN client, using the obspy implementation and its get_waveforms_bulk method. I noticed that some data I expected were not returned. Example below:

from obspy import UTCDateTime
from obspy.clients.fdsn import Client

client = Client("GEONET", debug=True)
stime = UTCDateTime('2016-11-13T11:05:04.415558Z')
etime = UTCDateTime('2016-11-13T11:06:34.415558Z')
bulk = [
    ('NZ', 'HSES', '*', '*', stime, etime),
    ('NZ', 'WIGC', '*', '*', stime, etime),
    ('NZ', 'CULC', '*', '*', stime, etime),
    ('NZ', 'CECS', '*', '*', stime, etime),
    ('NZ', 'GVZ', '*', '*', stime, etime),
    ('NZ', 'KHZ', '*', '*', stime, etime),
    ('NZ', 'WAKC', '*', '*', stime, etime),
    ('NZ', 'KIKS', '*', '*', stime, etime),
    ('NZ', 'MOLS', '*', '*', stime, etime),
    ('NZ', 'LTZ', '*', '*', stime, etime),
    ('NZ', 'AMCZ', '*', '*', stime, etime),
    ('NZ', 'SJFS', '*', '*', stime, etime),
    ('NZ', 'ASHS', '*', '*', stime, etime),
    ('NZ', 'SMHS', '*', '*', stime, etime),
    ('NZ', 'THZ', '*', '*', stime, etime),
    ('NZ', 'KEKS', '*', '*', stime, etime),
    ('NZ', 'WVFS', '*', '*', stime, etime),
    ('NZ', 'OXZ', '*', '*', stime, etime),
    ('NZ', 'BSWZ', '*', '*', stime, etime),
    ('NZ', 'WDFS', '*', '*', stime, etime),
    ('NZ', 'APPS', '*', '*', stime, etime),
    ('NZ', 'CSHS', '*', '*', stime, etime),
    ('NZ', 'SEDS', '*', '*', stime, etime),
    ('NZ', 'CMWZ', '*', '*', stime, etime),
    ('NZ', 'BTWS', '*', '*', stime, etime)]

st = client.get_waveforms_bulk(bulk)

assert "BTWS" in {tr.stats.station for tr in st}
# Should raise an assertion error because BTWS is not in the stream

st2 = client.get_waveforms_bulk(bulk[-2:])
assert "BTWS" in {tr.stats.station for tr in st2}
# Should pass

Excuse the length. For some reason, some stations are not returned with the larger bulk request but are available with smaller ones. No errors are raised. Apologies if this is an obspy issue; I'm not sure how to replicate this without obspy for the bulk methods.
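
Because no error is raised, the gap is easy to miss. A minimal sketch of a check that lists every requested station with no trace in the result is below. The helper name `missing_stations` and the stand-in objects are hypothetical (they are not part of obspy); the only assumption about the stream is that each trace exposes `.stats.station`, as in the asserts above.

```python
from types import SimpleNamespace


def missing_stations(bulk, stream):
    """Return requested station codes that have no trace in `stream`.

    `bulk` is a list of (network, station, location, channel, start, end)
    tuples, as passed to get_waveforms_bulk. `stream` is any iterable of
    objects exposing `.stats.station` (an obspy Stream qualifies).
    """
    requested = {entry[1] for entry in bulk}
    returned = {tr.stats.station for tr in stream}
    return sorted(requested - returned)


# Stand-in objects so the sketch runs without a network connection.
fake_bulk = [("NZ", "CMWZ", "*", "*", None, None),
             ("NZ", "BTWS", "*", "*", None, None)]
fake_stream = [SimpleNamespace(stats=SimpleNamespace(station="CMWZ"))]
print(missing_stations(fake_bulk, fake_stream))
```

With the real objects this would be called as `missing_stations(bulk, st)`.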

Thomas-Benson commented 7 months ago

Hi Calum, from a quick look, the issue appears to be related to the strong motion channels from KHZ. There shouldn't be any data for these streams in the time range you are selecting, but FDSN appears to be returning an empty miniSEED record rather than a no-data code. We'll look into why this is happening, but for now you should be able to get the rest of the data you need by excluding these streams.
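
Since the request uses wildcards, excluding only the affected KHZ channels is awkward. One alternative, sketched here as an assumption rather than a recommended fix, is to send each bulk entry as its own request so that a problem response for one station cannot hide data for the others. The helper `fetch_individually` is hypothetical; `fetch` stands for any callable with the same shape as `client.get_waveforms_bulk`.

```python
def fetch_individually(fetch, bulk):
    """Call `fetch` once per bulk entry and collect the results.

    Returns (traces, failed): `traces` is a flat list of whatever `fetch`
    yields, `failed` is a list of (entry, exception) pairs for requests
    that raised (for example a no-data error from the client).
    """
    traces, failed = [], []
    for entry in bulk:
        try:
            traces.extend(fetch([entry]))
        except Exception as exc:  # keep going; record what went wrong
            failed.append((entry, exc))
    return traces, failed


# Stand-in fetch function so the sketch runs offline: it returns one
# "trace" per station and raises for a station with no data.
def fake_fetch(entries):
    station = entries[0][1]
    if station == "KHZ":
        raise ValueError("no data")
    return [station]


demo_bulk = [("NZ", "KHZ", "*", "*", None, None),
             ("NZ", "BTWS", "*", "*", None, None)]
demo_traces, demo_failed = fetch_individually(fake_fetch, demo_bulk)
print(demo_traces, [entry[1] for entry, _ in demo_failed])
```

With obspy this would be `fetch_individually(client.get_waveforms_bulk, bulk)`, at the cost of one HTTP request per entry instead of a single bulk request.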

Thomas-Benson commented 7 months ago

For context, on 2016-11-13 this is the strong motion data that should be available for KHZ:

Source            Start sample              End sample                Hz   Samples
NZ_KHZ_20_HNZ     2016,318,22:00:40.984536  2016,318,22:02:20.979536  200  20000
NZ_KHZ_20_HNZ     2016,318,23:15:29.984538  2016,319,00:00:01.974536  200  534399
Total: 1 trace(s) with 2 segment(s)
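
To make the point concrete, the sketch below checks the requested window against the two segments listed above. It assumes the listing's timestamps are year, day-of-year, then time of day in UTC; the function name `overlaps` is only illustrative.

```python
from datetime import datetime

FMT = "%Y,%j,%H:%M:%S.%f"  # assumed format: year, day-of-year, time

# Segments copied from the listing above.
segments = [
    ("2016,318,22:00:40.984536", "2016,318,22:02:20.979536"),
    ("2016,318,23:15:29.984538", "2016,319,00:00:01.974536"),
]

# Requested window from the original example.
window_start = datetime(2016, 11, 13, 11, 5, 4, 415558)
window_end = datetime(2016, 11, 13, 11, 6, 34, 415558)


def overlaps(seg_start, seg_end, start, end):
    """True if the segment [seg_start, seg_end] intersects [start, end]."""
    return seg_start <= end and start <= seg_end


for raw_start, raw_end in segments:
    seg_start = datetime.strptime(raw_start, FMT)
    seg_end = datetime.strptime(raw_end, FMT)
    print(seg_start, seg_end,
          overlaps(seg_start, seg_end, window_start, window_end))
```

Day 318 of 2016 is 13 November, and both segments start well after the requested window ends, so neither overlaps it.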

calum-chamberlain commented 7 months ago

Thanks for that @Thomas-Benson - I assume that obspy is then truncating the record at the first empty record and skipping the remainder of the data?