usgs / groundmotion-processing

Parsing and processing ground motion data

Layout of Auxiliary Data in ASDF file #78

Closed baagaard-usgs closed 5 years ago

baagaard-usgs commented 5 years ago

Format of auxiliary data (XML vs JSON)

XML

Pros

Cons

JSON

Pros

Cons

I strongly prefer XML over JSON for consistency in the ASDF layout, even if it adds some complication to reading/writing. In the long term, it is much easier to change software interfaces than to migrate data files to new formats.

Note: Even if we don't support units in the Python code, we can hardcode the units when writing and validate when reading.

Station Metrics

StationMetrics (group) -> NET.STA (group) -> NET.STA__EVENTID__TAG (dataset)

NET: FDSN network code (or equivalent)
STA: Station code
EVENTID: ComCat event id (or equivalent)
TAG: Tag associated with processing to compute metrics
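As a minimal sketch of how the group/dataset names above could be assembled, the helper below builds the path from its parts. The function name, the slash-joined path, the sample codes, and the double-underscore separator between fields are all assumptions for illustration; they are not taken from the project's code.

```python
def station_metrics_path(net, sta, eventid, tag, sep="__"):
    """Build 'StationMetrics/NET.STA/NET.STA<sep>EVENTID<sep>TAG'.

    The separator and the slash-joined layout are assumptions.
    """
    station = f"{net}.{sta}"
    dataset = sep.join([station, eventid, tag])
    return f"StationMetrics/{station}/{dataset}"


# Hypothetical network, station, event id, and tag.
print(station_metrics_path("NC", "J051", "nc12345678", "default"))
```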

Store the station metrics as key/value pairs with units as attributes.

<station_metrics>
  <hypocentral_distance units="km">10.2</hypocentral_distance>
  <epicentral_distance units="km">2.3</epicentral_distance>
</station_metrics>
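
The note about hardcoding units on write and validating them on read could look roughly like the sketch below, using the standard library's xml.etree.ElementTree. The function names and the STATION_METRIC_UNITS table are hypothetical; only the element names and the "km" units come from the example above.

```python
import xml.etree.ElementTree as ET

# Units are hardcoded per metric (hypothetical table; "km" is from the example).
STATION_METRIC_UNITS = {
    "hypocentral_distance": "km",
    "epicentral_distance": "km",
}


def station_metrics_to_xml(metrics):
    """Serialize {name: value} to XML, attaching the hardcoded units."""
    root = ET.Element("station_metrics")
    for name, value in metrics.items():
        child = ET.SubElement(root, name, units=STATION_METRIC_UNITS[name])
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def station_metrics_from_xml(text):
    """Parse the XML back to {name: float}, rejecting unexpected units."""
    metrics = {}
    for child in ET.fromstring(text):
        expected = STATION_METRIC_UNITS.get(child.tag)
        found = child.get("units")
        if expected is None or found != expected:
            raise ValueError(
                f"{child.tag}: expected units {expected!r}, found {found!r}"
            )
        metrics[child.tag] = float(child.text)
    return metrics


xml_text = station_metrics_to_xml(
    {"hypocentral_distance": 10.2, "epicentral_distance": 2.3}
)
print(station_metrics_from_xml(xml_text))
```

With this split, the Python side only ever handles plain floats, while the file still records the units explicitly.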

Waveform Metrics

WaveformMetrics (group) -> NET.STA (group) -> NET.STA.LOC__START__END__WTAG__TAG (dataset)

NET: FDSN network code (or equivalent)
STA: Station code
LOC: Location code
START__END: Time history start/end tags from Waveforms dataset
WTAG: Tag associated with waveform processing
TAG: Tag associated with computing metrics
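
To illustrate how a reader might recover the fields from a waveform metrics dataset name, here is a small sketch. It assumes the fields after NET.STA.LOC are joined by double underscores (only START__END is spelled out that way above) and that no tag itself contains a double underscore. The class name, function name, and sample name are hypothetical.

```python
from typing import NamedTuple


class WaveformMetricsName(NamedTuple):
    net: str
    sta: str
    loc: str
    start: str
    end: str
    wtag: str
    tag: str


def parse_waveform_metrics_name(name, sep="__"):
    """Split 'NET.STA.LOC<sep>START<sep>END<sep>WTAG<sep>TAG' into fields."""
    station_part, start, end, wtag, tag = name.split(sep)
    net, sta, loc = station_part.split(".")
    return WaveformMetricsName(net, sta, loc, start, end, wtag, tag)


# Hypothetical dataset name.
name = "NC.J051.01__2019-07-06T03:19:53__2019-07-06T03:22:53__processed__default"
print(parse_waveform_metrics_name(name))
```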

We do not include the channel code, because many metrics involve multiple channels (horizontal components). Instead, the components are included in the metrics as attributes.

Store the waveform metrics as key/value pairs, grouped first by intensity metric type (for example, RotD50) and then by intensity metric (for example, PGA). Include units as attributes.

<waveform_metrics>
  <rot_d50>
    <pga units="m/s**2">0.45</pga>
    <sa percent_damping="5.0" units="g">
      <value period="2.0">0.2</value>
    </sa>
  </rot_d50>
  <maximum_component>
  </maximum_component>
</waveform_metrics>
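
A rough sketch of reading this layout into nested dictionaries with xml.etree.ElementTree follows. The XML string repeats the example above with the sa element closed explicitly so that it parses; the function name and the shape of the returned dictionary are assumptions, not the project's API.

```python
import xml.etree.ElementTree as ET

WAVEFORM_METRICS_XML = """
<waveform_metrics>
  <rot_d50>
    <pga units="m/s**2">0.45</pga>
    <sa percent_damping="5.0" units="g">
      <value period="2.0">0.2</value>
    </sa>
  </rot_d50>
  <maximum_component>
  </maximum_component>
</waveform_metrics>
"""


def parse_waveform_metrics(text):
    """Return {metric_type: {metric: {attributes..., value or values}}}."""
    result = {}
    for metric_type in ET.fromstring(text):
        metrics = {}
        for metric in metric_type:
            entry = dict(metric.attrib)  # units, percent_damping, ...
            values = metric.findall("value")
            if values:
                # Period-dependent metric: map period -> amplitude.
                entry["values"] = {
                    float(v.get("period")): float(v.text) for v in values
                }
            else:
                # Scalar metric such as PGA.
                entry["value"] = float(metric.text)
            metrics[metric.tag] = entry
        result[metric_type.tag] = metrics
    return result


parsed = parse_waveform_metrics(WAVEFORM_METRICS_XML)
print(parsed["rot_d50"]["pga"])
print(parsed["rot_d50"]["sa"]["values"][2.0])
```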

An alternative "array format" would be:

<value><period>2.0</period><amplitude>0.2</amplitude></value>

It is easier to pull out the amplitude for a specific period if it is stored as an attribute (first case).
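
The difference in lookup effort can be sketched with xml.etree.ElementTree. In the attribute format, a single path expression selects the value; in the array format, the children have to be scanned. The function names are hypothetical, and the second period/amplitude pair is an illustrative value that is not from the issue.

```python
import xml.etree.ElementTree as ET

# The 3.0 / 0.1 entries are made-up illustrative values.
ATTRIBUTE_FORMAT = """<sa percent_damping="5.0" units="g">
  <value period="2.0">0.2</value>
  <value period="3.0">0.1</value>
</sa>"""

ARRAY_FORMAT = """<sa percent_damping="5.0" units="g">
  <value><period>2.0</period><amplitude>0.2</amplitude></value>
  <value><period>3.0</period><amplitude>0.1</amplitude></value>
</sa>"""


def amplitude_from_attributes(sa, period):
    """Select by attribute; `period` must match the stored string exactly."""
    node = sa.find(f"value[@period='{period}']")
    return None if node is None else float(node.text)


def amplitude_from_array(sa, period):
    """Scan each <value> and compare its <period> child numerically."""
    for node in sa.findall("value"):
        if float(node.findtext("period")) == period:
            return float(node.findtext("amplitude"))
    return None


print(amplitude_from_attributes(ET.fromstring(ATTRIBUTE_FORMAT), "2.0"))
print(amplitude_from_array(ET.fromstring(ARRAY_FORMAT), 2.0))
```

Note that the attribute lookup is a string comparison, so a writer would need to format periods consistently (for example, always "2.0" rather than "2").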

mhearne-usgs commented 5 years ago

With the addition of station metrics in PR #293, this issue has been addressed. We will still need to implement other distance metrics in 1.1.