jpjones76 / SeisIO.jl

Julia language support for geophysical time series data
http://seisio.readthedocs.org

ungap! doesn't fix a gap before the last sample in S.x[i] if size(S.t[i]) == (2,2) #74

Closed: tclements closed this issue 3 years ago

tclements commented 3 years ago

Here's a data-reading problem. I'm not sure whether this file is corrupted or is being read incorrectly.

I came across an mseed file where SeisIO reports a gap in the file, but there is no data following that gap:

julia> S = read_data("mseed","/home/timclements/CIBAK__LHZ___2000141.ms")
SeisData with 1 channels (1 shown)
    ID: CI.BAK..LHZ                        
  NAME: CI.BAK..LHZ                        
   LOC: 0.0 N, 0.0 E, 0.0 m                
    FS: 1.0                                
  GAIN: 1.0                                
  RESP: a0 1.0, f0 1.0, 0z, 0p             
 UNITS:                                    
   SRC: /home/timclements/CIBAK__LHZ___20… 
  MISC: 0 entries                          
 NOTES: 1 entries                          
     T: 2000-05-20T00:00:00 (1 gaps)       
     X: +3.900e+01                         
        -3.100e+01                         
            ...                            
        -8.150e+02                         
        (nx = 32860)                       
     C: 0 open, 0 total

julia> S.t[1]
2×2 Array{Int64,2}:
     1  958780800925100
 32860      53539997700

This claims there is a gap of 53539997700 μs (about 14.9 hours), but there isn't any data afterwards.

When I try to ungap these data, the gap is still there:

julia> ungap(S)
SeisData with 1 channels (1 shown)
    ID: CI.BAK..LHZ                        
  NAME: CI.BAK..LHZ                        
   LOC: 0.0 N, 0.0 E, 0.0 m                
    FS: 1.0                                
  GAIN: 1.0                                
  RESP: a0 1.0, f0 1.0, 0z, 0p             
 UNITS:                                    
   SRC: /home/timclements/CIBAK__LHZ___20… 
  MISC: 0 entries                          
 NOTES: 2 entries                          
     T: 2000-05-20T00:00:00 (1 gaps)       
     X: +3.900e+01                         
        -3.100e+01                         
            ...                            
        -8.150e+02                         
        (nx = 32860)                       
     C: 0 open, 0 total

This happens because ungap! only checks whether the first dimension of the time matrix has size 2 (i.e., the matrix has two rows), so this case isn't caught.

This could create a problem when trying to determine the end time of the file:

julia> SeisIO.endtime(S.t[1],S.fs[1]) * SeisIO.μs |> u2d
2000-05-20T23:59:59.923

whereas I would expect the end time obtained after zeroing the gap:

julia> S.t[1][2,2] = 0 
0

julia> SeisIO.endtime(S.t[1],S.fs[1]) * SeisIO.μs |> u2d
2000-05-20T09:07:39.925
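
The two end times above can be reproduced without SeisIO. The following is a minimal sketch, assuming the time matrix holds sample indices in column 1 and, in column 2, the start time (first row) and time gaps (later rows), all in microseconds. The helper `end_time_μs` is hypothetical and is not SeisIO.endtime itself.

```julia
using Dates

# Hypothetical helper (not part of SeisIO): time of the last sample in μs
# since the Unix epoch, under the assumed time-matrix layout described above.
function end_time_μs(t::Matrix{Int64}, fs::Float64)
    Δ = round(Int64, 1.0e6 / fs)   # sampling interval in μs
    nx = t[end, 1]                 # index of the last sample
    gaps = sum(t[2:end, 2])        # total gap time in μs
    return t[1, 2] + gaps + (nx - 1) * Δ
end

# Time matrix from the issue
t = Int64[1 958780800925100; 32860 53539997700]

# Same matrix with the gap in the second row zeroed
t_nogap = copy(t)
t_nogap[2, 2] = 0

with_gap = unix2datetime(end_time_μs(t, 1.0) * 1.0e-6)
without_gap = unix2datetime(end_time_μs(t_nogap, 1.0) * 1.0e-6)

println(with_gap)     # 2000-05-20T23:59:59.923
println(without_gap)  # 2000-05-20T09:07:39.925
```

Under these assumptions the gap accounts for the entire difference between the two end times.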

We might want to check that t[2,2] == 0 when ungapping to catch this sort of thing. Here is the file:

mseed-file.zip
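
The suggested check could look roughly like the sketch below. It assumes a two-row time matrix whose second row stores the gap before the last sample in column 2. The function name `has_trailing_gap` is made up for illustration and does not exist in SeisIO.

```julia
# Hypothetical check (not SeisIO code): a two-row time matrix normally means
# "no gaps", unless the second row carries a nonzero gap value.
has_trailing_gap(t::Matrix{Int64}) = size(t, 1) == 2 && t[2, 2] != 0

# Time matrix from the issue: two rows, but with a nonzero gap
t_issue = Int64[1 958780800925100; 32860 53539997700]

# A gapless channel with the same start time and length
t_clean = Int64[1 958780800925100; 32860 0]

println(has_trailing_gap(t_issue))  # true
println(has_trailing_gap(t_clean))  # false
```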

jpjones76 commented 3 years ago

Working on this now. I don't think there's an error with the mini-SEED reader; everything I've used to analyze the file suggests that the last packet has one sample.

I can guess why the one-sample case occurs. The channel went offline at around 9 AM that day; it looks like data transmission resumed exactly at midnight (presumably the network did the latter manually). So, why one sample? Well, I'd bet that there's a small time correction by the seismic network. At fs = 1.0 Hz, if the sample times are corrected by at least 0.076 s, you'll have exactly one sample on the previous day ... and subsequently, it'll be included in a data request that ends at midnight.
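
As a rough consistency check on the one-sample reading, the sketch below fills a gap that sits before the final sample. It assumes the same two-row time matrix as in the report, with the gap in microseconds, and uses NaN as the fill value purely for illustration. `fill_trailing_gap` is a hypothetical helper, not SeisIO's ungap! implementation.

```julia
# Hypothetical helper (not SeisIO's ungap!): insert fill values between the
# second-to-last and last samples of x, then zero the gap in the time matrix.
function fill_trailing_gap(x::Vector{Float64}, t::Matrix{Int64}, fs::Float64)
    size(t) == (2, 2) || error("expected a 2×2 time matrix")
    t0 = t[1, 2]                                  # start time in μs
    gap_μs = t[2, 2]                              # gap before the last sample
    n_fill = round(Int, gap_μs * 1.0e-6 * fs)     # whole samples spanned by the gap
    # Any sub-sample remainder of the gap is dropped in this sketch.
    x_new = vcat(x[1:end-1], fill(NaN, n_fill), x[end])
    t_new = Int64[1 t0; length(x_new) 0]
    return x_new, t_new
end

# Placeholder data with the same length as the channel in the issue
x = zeros(32860)
t = Int64[1 958780800925100; 32860 53539997700]

x_new, t_new = fill_trailing_gap(x, t, 1.0)
println(length(x_new))  # 86400
```

With the values from the report, the filled channel has 86400 samples, which is one full day at 1 Hz.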

jpjones76 commented 3 years ago

I think the most recent commits fix this. Check to verify; if so, I can close this.

jpjones76 commented 3 years ago

Fixed in SeisIO v1.2.0.