mSEED record 512 bytes hardcoded

EIDA / mediatorws

EIDA NG Mediator/Federator web services

GNU General Public License v3.0


mSEED record 512 bytes hardcoded #72

Status: Closed (Jollyfant closed this issue 3 years ago)

Jollyfant commented 5 years ago

Browsing through the request handlers I found this: https://github.com/EIDA/mediatorws/blob/e55d335b92124c70a21a74f0c864ac9856e488a1/eidangservices/federator/server/task.py#L593

mSEED records do not have to be 512 bytes long. I'm not sure how critical this assumption is in the code.

damb commented 5 years ago

Thanks @Jollyfant for the review and pointing this out.

At the moment, the assumption only matters when eida-federator splits and aligns mSEED data. A more robust solution would read the record size from the first record of the subsequent stream epoch and compare that record with the last record of the previous stream epoch (assuming that an overlap of only a single mSEED record can occur).
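The comparison described above could look roughly like the following. This is only a sketch and not the federator's actual code: `join_epochs` is a hypothetical helper, the record size is passed in rather than hardcoded, and a byte-for-byte comparison of the two records is an assumption about what "comparing" means here.

```python
def join_epochs(previous: bytes, following: bytes, record_size: int) -> bytes:
    """Concatenate two stream epochs, dropping one duplicated seam record.

    Hypothetical helper: assumes that at most one record overlaps and that
    an overlapping record is shipped byte-identically in both responses.
    """
    if record_size <= 0:
        raise ValueError("record_size must be positive")

    last_record = previous[-record_size:]
    first_record = following[:record_size]

    # Only treat the seam as an overlap if both sides hold a full record.
    if len(last_record) == record_size and last_record == first_record:
        following = following[record_size:]

    return previous + following
```

With a wrong record size the tail of one record is compared against the head of another, so the duplicate goes unnoticed and is kept.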

That said, I don't know how relevant handling a dynamic mSEED record size is in practice within EIDA. Maybe @andres-h can share some experience here. Thanks in advance.

kaestli commented 5 years ago

I am not sure how widespread non-512-byte mSEED records are. The current situation is as follows:

if a stream is shipped in non-512-byte records, and
if the user requests data from the federator, including an amount of data from this single stream that exceeds the request size limits of the EIDA endpoint, and
if requesting two adjacent time windows from this stream triggers the endpoint to actually ship overlapping time windows (one or multiple records available in the answer for both requests), then the federator will not detect and correct this overlap.

Jollyfant commented 5 years ago

> i am not sure how widespread non-512 byte mseed records are.

We are considering repacking all records to 4096 bytes for archiving; this is commonly done and recommended by IRIS. It saves roughly 10% of storage space.

If the worst thing that happens is that some overlap is introduced in the response, that is fine TBH. I was just afraid it would crash. Maybe an enhancement would be to jump back 4096 bytes (or a higher power of 2) from the end of the stream, so that at least data UP TO that record length are handled correctly.
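One possible reading of this suggestion, sketched below, is to try several power-of-two record lengths at the seam, largest first. `trim_overlap` and its exponent bounds (256 to 4096 bytes by default) are assumptions made for illustration, as is the byte-for-byte comparison.

```python
def trim_overlap(previous: bytes, following: bytes,
                 max_exponent: int = 12, min_exponent: int = 8) -> bytes:
    """Return `following` without a record duplicated at the seam.

    Hypothetical helper: tries record lengths 2**max_exponent down to
    2**min_exponent and removes the first candidate for which the tail of
    `previous` equals the head of `following`.
    """
    for exponent in range(max_exponent, min_exponent - 1, -1):
        size = 2 ** exponent
        if len(previous) < size or len(following) < size:
            continue  # not enough data for a record of this length
        if previous[-size:] == following[:size]:
            return following[size:]
    return following  # no overlap detected for any candidate length
```

Records longer than `2 ** max_exponent` bytes would still slip through, which matches the "UP TO that record length" caveat.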

kaestli commented 5 years ago

Sure, this is something to fix. I was just wondering about the urgency. So we'll probably do it after the middle of next week :-)

damb commented 3 years ago

Fixed at https://github.com/damb/eidaws. The record size is determined from the very first mSEED record received. Subsequent records are assumed to have the same record size.
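For illustration, the record length of a miniSEED 2 record can be read from its Blockette 1000, which stores the length as a power-of-two exponent. The sketch below is not taken from eidaws; `record_length` is a hypothetical helper, and trying big-endian first and then little-endian is a simplification of proper byte-order detection.

```python
import struct


def record_length(header: bytes) -> int:
    """Return the record length in bytes declared by Blockette 1000.

    Hypothetical helper: `header` must hold the start of one miniSEED 2
    record (fixed header plus blockettes, typically the first 64 bytes
    or more).
    """
    if len(header) < 56:
        raise ValueError("need at least 56 bytes of the record")

    for endian in (">", "<"):
        # Bytes 46-47 of the fixed header: offset of the first blockette.
        (offset,) = struct.unpack_from(endian + "H", header, 46)
        visited = 0
        # Follow the blockette chain; the cap guards against cycles.
        while 48 <= offset <= len(header) - 8 and visited < 16:
            blockette_type, next_offset = struct.unpack_from(
                endian + "HH", header, offset)
            if blockette_type == 1000:
                # Byte 6 of Blockette 1000: record length exponent.
                return 2 ** header[offset + 6]
            offset = next_offset
            visited += 1

    raise ValueError("no Blockette 1000 found")
```

The value obtained from the first record could then be used in place of a hardcoded 512 when splitting the rest of the stream, under the same-size assumption stated above.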