mspass-team / mspass

Massive Parallel Analysis System for Seismologists
https://mspass.org
BSD 3-Clause "New" or "Revised" License

Inconsistency in handling data read errors in _read_data_from_dfile #549

Open wangyinz opened 3 months ago

wangyinz commented 3 months ago

Read the code below first. https://github.com/mspass-team/mspass/blob/bf7e2f05609888b56a238095ff2aa3527e304694/python/mspasspy/db/database.py#L4288-L4324 Basically, the datum is killed when the endtime expected from the MongoDB document doesn't match the endtime set by the obspy reader, and the elog message says that such data will be killed. However, the datum is later set live again because npts is greater than 0, which is almost always true because otherwise the endtime could not have been calculated. Inside the database, the corresponding elog entry looks like this:

{
   "_id":{
      "$oid":"6670faebf7b0b388f4b5bcd7"
   },
   "logdata":[
      {
         "job_id":{
            "$numberInt":"0"
         },
         "algorithm":"Database._read_data_from_dfile:  ",
         "badness":"ErrorSeverity.Invalid",
         "error_message":"Inconsistent endtimes detected\nEndtime expected from MongoDB document = 2011-04-07T15:17:43.975000Z\nEndtime set by obspy reader = 2011-04-07T15:17:43.975000Z\nEndtime is derived in mspass and should have been repaired - cannot recover this datum so it was killed",
         "process_id":{
            "$numberInt":"3266"
         }
      }
   ],
   "data_tag":"serial_preprocessed",
   "wf_TimeSeries_id":{
      "$oid":"6670f5c8131a86997a8e1d1e"
   }
}

Note that this record is from the earthscope2024 notebook here: https://github.com/mspass-team/mspass_tutorial/blob/master/Earthscope2024/Session1.ipynb

I think probably all we need to do is change the wording of the elog message to reflect that the error is potentially harmless. Or, we should refine the check here so that mismatches within a certain threshold are accepted (which is the case above, where the two endtimes print identically).
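The second option could look something like the sketch below. This is not the code in `database.py`; the function name `endtimes_consistent`, the `tolerance_fraction` parameter, and the half-sample default are assumptions made for illustration, and times are taken to be epoch seconds stored as floats. The sample interval used in the example calls is arbitrary.

```python
def endtimes_consistent(expected_endtime, actual_endtime, dt, tolerance_fraction=0.5):
    """Return True when two end times agree to within a fraction of one sample.

    All times are epoch seconds (float); dt is the sample interval in seconds.
    tolerance_fraction is a hypothetical tuning parameter, not an MsPASS option.
    """
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    return abs(expected_endtime - actual_endtime) <= tolerance_fraction * dt


# Round-off sized difference: treated as consistent.
t_end = 1302189463.975
print(endtimes_consistent(t_end, t_end + 1.0e-6, dt=0.025))   # True
# A full missing sample is still flagged.
print(endtimes_consistent(t_end, t_end - 0.025, dt=0.025))    # False
```

With a check like this, a datum would only be killed (and the "killed" message logged) when the mismatch is at least a sizeable fraction of a sample, so the log message and the final live/dead state would agree.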

pavlis commented 3 months ago

Good catch. How did you detect that? I see you checked in revised notebooks but is there a fix to database?

wangyinz commented 3 months ago

I found it by looking at the records in the elog collection in the database. This mismatch error actually happened pretty frequently in that small dataset. Anyway, I don't think this needs immediate attention, but we do need to fix it. I am just opening the issue here as a bookmark.
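For anyone wanting to repeat that check, a rough sketch of the tally is below. It works on a plain list of documents shaped like the elog record quoted earlier; in practice the list would come from a query on the elog collection, which is left out here so the snippet runs without a database. The function name `count_elog_messages` and the toy documents are made up for this example.

```python
from collections import Counter


def count_elog_messages(elog_docs):
    """Count elog entries by (algorithm, first line of error_message)."""
    counts = Counter()
    for doc in elog_docs:
        for entry in doc.get("logdata", []):
            algorithm = entry.get("algorithm", "").strip()
            message = entry.get("error_message", "")
            # Keep only the first line so per-datum details do not split the tally.
            first_line = message.split("\n", 1)[0]
            counts[(algorithm, first_line)] += 1
    return counts


# Two toy documents mimicking the structure of the record above.
docs = [
    {"logdata": [{"algorithm": "Database._read_data_from_dfile:  ",
                  "error_message": "Inconsistent endtimes detected\nmore text"}]},
    {"logdata": [{"algorithm": "Database._read_data_from_dfile:  ",
                  "error_message": "Inconsistent endtimes detected\nother text"}]},
]
for key, n in count_elog_messages(docs).most_common():
    print(n, key)
```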

pavlis commented 3 months ago

I think on the test data set it is related to an error that was thrown but handled on one of the files. This problem is fundamentally created by the fact that we use obspy's reader to crack miniseed files when running the read_data method of Database, but the index read_data uses is created by our custom reader, which only indexes the files. Since both use the same underlying miniseed library, a working hypothesis is that some packets in the files have intact headers but compressed data sections with errors that make them impossible to decompress. I think obspy's reader will truncate any trace with such an error at the end of the previous packet. If my guess is correct, that would explain the behavior.
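If that hypothesis holds, one would expect the two endtimes to differ by roughly a whole number of samples (the samples in the dropped packets), whereas a pure round-off mismatch would be far smaller than one sample. A rough way to tell the cases apart is sketched below, assuming endtime is computed as `t0 + (npts - 1) * dt`. The helper name `missing_samples` and all the numbers are invented for illustration.

```python
def missing_samples(expected_endtime, actual_endtime, dt):
    """Estimate how many samples short the data read from file is.

    Times are epoch seconds (float); dt is the sample interval in seconds.
    Returns 0 when the difference is under half a sample (round-off scale).
    """
    if dt <= 0.0:
        raise ValueError("dt must be positive")
    return int(round((expected_endtime - actual_endtime) / dt))


t0, dt, npts_indexed = 0.0, 0.025, 4000
expected = t0 + (npts_indexed - 1) * dt

# Reader dropped the last 400 samples, e.g. one undecodable packet.
truncated = t0 + (npts_indexed - 400 - 1) * dt
print(missing_samples(expected, truncated, dt))        # 400
# Difference far below one sample: nothing is missing.
print(missing_samples(expected, expected - 1e-7, dt))  # 0
```

A nonzero result would point toward truncation by the reader; a zero result with a logged mismatch would point toward a round-off comparison problem instead.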