Channel codes for Synthetics

krischer commented 7 years ago

Discussion branched off #2. Concerns DRAFT20170622.

@krischer

Section 6: Definition of channel codes

I would like to define the X band code for any synthetic data. Currently X is only defined as an instrument code, which limits it to translational synthetic data. There is currently no way to specify, for example, synthetic rotational or strain channels. Here are some of my notes on how we currently deal with this in a new code of ours:

- Always use “X” as the band code - it’s currently not used and
we just claim it for “synthetics”. The current FDSN conventions
just don’t really work for synthetics so this should be fine.

# Acoustic - final “A” would stand for acoustic

* displacement: XDA + tag “displacement”
* velocity: XVA + tag “velocity”
* acceleration: XAA + tag “acceleration”

# Elastic

## Receivers with rotation matrix:

* displacement: XD[ZNE] + tag “displacement”
* velocity: XV[ZNE] + tag “velocity”
* acceleration: XA[ZNE] + tag “acceleration”
* strain: XS[0-5] in Voigt notation + tag “strain”
* gradient: X[ZZ-EE] + tag “gradient”

## Receivers without rotation matrix:

* displacement: XD[012] + tag “displacement”
* velocity: XV[012] + tag “velocity”
* acceleration: XA[012] + tag “acceleration”
* strain: XS[0-5] in Voigt notation + tag “strain” # Not unique and the
# same as for receivers with rotation matrix - not sure what to do here.
* gradient: X[00-22] + tag “gradient”
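
To make the scheme in the notes above concrete, here is a minimal Python sketch that enumerates the proposed codes. The function name `synthetic_channel_codes` and its parameters are hypothetical, and reading `X[ZZ-EE]` and `X[00-22]` as "all nine pairs of axis characters" is an assumption about the notation, not something the notes spell out.

```python
from itertools import product

# Second character per the notes above: D, V, A for the translational
# quantities and S for strain.
QUANTITY = {
    "displacement": "D",
    "velocity": "V",
    "acceleration": "A",
    "strain": "S",
}


def synthetic_channel_codes(tag, medium="elastic", rotated=True):
    """List the channel codes proposed in the notes for one tag.

    Hypothetical helper: `rotated` selects receivers with a rotation
    matrix (ZNE) or without one (012).
    """
    axes = "ZNE" if rotated else "012"
    if medium == "acoustic":
        if tag not in ("displacement", "velocity", "acceleration"):
            raise ValueError("no acoustic code proposed for tag %r" % tag)
        # A single channel; the final "A" stands for acoustic.
        return ["X" + QUANTITY[tag] + "A"]
    if tag == "strain":
        # Voigt notation: six components, identical with and without a
        # rotation matrix (the ambiguity flagged in the notes).
        return ["XS%d" % i for i in range(6)]
    if tag == "gradient":
        # Assumption: one code per ordered pair of axis characters.
        return ["X" + a + b for a, b in product(axes, repeat=2)]
    return ["X" + QUANTITY[tag] + axis for axis in axes]
```

For example, `synthetic_channel_codes("velocity")` gives `XVZ`, `XVN`, `XVE`, while `synthetic_channel_codes("gradient", rotated=False)` gives `X00` through `X22`.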

@chad-iris

Sounds like a decent proposal to me. Using an X band eliminates the ability to denote the band, but it's a coarse definition anyway.

Since that is a completely new channel definition I suggest this goes to FDSN WG II as a proposal and not something we conflate with the format specification. There is already a lot of new format layout conflated with new format semantics, I suggest separating what we can.

@crotwell

For synthetic data, we now have the option of longer codes, so "real" data channel codes should be limited to 3-4 characters, but synthetic or other codes can be longer, prefixed with X, so XBHZ or XLSN. Then even a new instrument code that was "JK" could be synthetic, with BJKN mapping to XBJKN? This kind of matches Lion's idea, except it makes it explicit that a band of X means that the data is synthetic and that the rest of the channel code can be interpreted along standard channel naming conventions, or is undefined by the spec? The restriction to short codes maybe makes sense for "real" data, but maybe should be relaxed for synthetic or highly processed data; I'm thinking of miniseed of stacked data, for example.
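
A minimal sketch of the prefixing idea in this comment, in Python. The helper names `to_synthetic` and `from_synthetic` are hypothetical, and whether the remainder after the X must follow the standard naming conventions is exactly the open question raised above, so the code does not check it.

```python
def to_synthetic(code):
    """Mark a conventional channel code as synthetic by prefixing X."""
    if not code:
        raise ValueError("empty channel code")
    return "X" + code


def from_synthetic(code):
    """Return the part after the X prefix, or None if not X-prefixed."""
    if len(code) > 1 and code.startswith("X"):
        return code[1:]
    return None
```

With these helpers, `to_synthetic("BJKN")` yields `XBJKN` and `from_synthetic("XLSN")` yields `LSN`.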

krischer commented 7 years ago

> For synthetic data, we now have the option of longer codes, so "real" data channel codes should be limited to 3-4 characters, but synthetic or other codes can be longer, prefixed with X, so XBHZ or XLSN. Then even a new instrument code that was "JK" could be synthetic, with BJKN mapping to XBJKN? This kind of matches Lion's idea, except it makes it explicit that a band of X means that the data is synthetic and that the rest of the channel code can be interpreted along standard channel naming conventions, or is undefined by the spec? The restriction to short codes maybe makes sense for "real" data, but maybe should be relaxed for synthetic or highly processed data; I'm thinking of miniseed of stacked data, for example.

This might work, but it also might be a bit impractical. The main reason is that synthetic data usually has no instrument or band per se (and the sampling rate is usually very disconnected from the frequency content); the simulation just directly generates data in some unit. Your proposal, for example, would make it very hard to distinguish displacement from velocity translational waveforms.

My proposal could probably be condensed to: if the channel code starts with X, it is synthetic data. The other characters are application defined, with a recommendation to use the last one as the orientation code if applicable.
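
The condensed rule is simple enough to sketch in a few lines of Python. The function name `parse_channel_code` and the dictionary keys are hypothetical; the orientation is only reported as a hint, since the proposal makes it a recommendation rather than a requirement.

```python
def parse_channel_code(code):
    """Apply the condensed rule: a leading X means synthetic data."""
    if not code:
        raise ValueError("empty channel code")
    if code[0] != "X":
        return {"synthetic": False, "code": code}
    body = code[1:]
    return {
        "synthetic": True,
        "code": code,
        # Everything after the X is application defined.
        "application_defined": body,
        # Only a recommendation: the last character may be an orientation.
        "orientation_hint": body[-1] if body else None,
    }
```

For instance, `parse_channel_code("XDZ")` reports synthetic data with `D` + `Z` left to the application and `Z` as the orientation hint, while `parse_channel_code("BHZ")` is reported as not synthetic.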

krischer commented 7 years ago

> Since that is a completely new channel definition I suggest this goes to FDSN WG II as a proposal and not something we conflate with the format specification. There is already a lot of new format layout conflated with new format semantics, I suggest separating what we can.

This is then probably true for the full section with the FDSN identifiers as well as the channel codes? Should we refactor this into a separate document?

crotwell commented 7 years ago

Yep, I agree that "X means synthetic and the remainder of the channel code is application defined" makes sense. There is no harm in calling it XBHZ if the user wants, but they can also call it X123 if they want.

Another synthetic "not a real channel" case is Green's functions, where you want to name the channel according to the moment tensor components. Given the complexity, having an X and then being on your own seems best.

chad-earthscope commented 7 years ago

> This is then probably true for the full section with the FDSN identifiers as well as the channel codes? Should we refactor this into a separate document?

Yes, that is the plan; all of the identifier bits need to be shared with a StationXML specification. I'll break it up in the next draft.

crotwell commented 7 years ago

Just an idea, should the identifier string for synthetic data be different and not force the whole net/sta/loc/chan structure? So FDSN:..[:] is a real channel, and there is another for synthetic, say SYN:, perhaps with some usage guidelines?

Might cause more trouble than it is worth, but lots of synthetic data doesn't really match the net.sta.chan idea, and so this could be an opportunity to make use of the urn flexibility and stop trying to put the square peg in the round hole?

krischer commented 7 years ago

> Just an idea, should the identifier string for synthetic data be different and not force the whole net/sta/loc/chan structure? So FDSN:..[:] is a real channel, and there is another for synthetic, say SYN:, perhaps with some usage guidelines?
>
> Might cause more trouble than it is worth, but lots of synthetic data doesn't really match the net.sta.chan idea, and so this could be an opportunity to make use of the urn flexibility and stop trying to put the square peg in the round hole?

Seems like a good idea, but it might be a bit impractical, at least for the foreseeable future, as people will need some time to get used to the idea of the URNs. Many people calculate synthetics to compare them to observed data at some point. Giving both the same name makes it quite simple to figure out what should be compared with what.
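
As a rough illustration of why shared naming helps, the following Python sketch groups observed and synthetic channels by their common network, station, and location. It assumes dotted `NET.STA.LOC.CHA` identifier strings, and both the function name `pair_by_station` and the example identifiers are made up for this sketch.

```python
from collections import defaultdict


def pair_by_station(observed_ids, synthetic_ids):
    """Group channel codes by (network, station, location).

    Assumes IDs of the form "NET.STA.LOC.CHA"; only stations that
    have both observed and synthetic channels are returned.
    """
    groups = defaultdict(lambda: {"observed": [], "synthetic": []})
    for kind, ids in (("observed", observed_ids), ("synthetic", synthetic_ids)):
        for full_id in ids:
            net, sta, loc, cha = full_id.split(".")
            groups[(net, sta, loc)][kind].append(cha)
    # Drop stations that lack either kind of data.
    return {
        key: value
        for key, value in groups.items()
        if value["observed"] and value["synthetic"]
    }
```

Calling `pair_by_station(["XX.ABC.00.BHZ"], ["XX.ABC.00.XDZ"])` would place `BHZ` and `XDZ` under the same `("XX", "ABC", "00")` key, ready for comparison.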

I think the X prefix solves most of the current issues and any further definition is bound to fail as there are just too many possibilities for synthetic data.

iris-edu / mseed3-evaluation

Channel codes for Synthetics #7