jpjones76 / SeisIO.jl

Julia language support for geophysical time series data
http://seisio.readthedocs.org
Other
47 stars 21 forks source link

writesac writes incorrect time #30

Closed tclements closed 4 years ago

tclements commented 4 years ago

Here is a SeisData that starts at 00:48:59.84

S = read_data("sac",files[ii])
SeisData with 1 channels (1 shown)
    ID: BP.SCYB..BP1                       
  NAME:                                    
   LOC: 36.0094 N, -120.537 E, 695.0 m     
    FS: 20.0                               
  GAIN: 7.13026e10                         
  RESP: a0 1.0, f0 1.0, 0z, 0p             
 UNITS:                                    
   SRC: /home/timclements/BPtest/SAC/2004… 
  MISC: 0 entries                          
 NOTES: 1 entries                          
     T: 2004-01-01T00:48:59.084 (0 gaps)   
     X: -5.628e+03                         
        -5.624e+03                         
            ...                            
        -5.404e+03                         
        (nx = 1209785)                     
     C: 0 open, 0 total

S[1].t
2×2 Array{Int64,2}:
       1  1072918139083600
 1209785                 0

If I then use writesac and read the new file, the new timing is off by a millisecond:

writesac(S)
Snew = read_data("sac","2004.001.00.48.59.084.BP.SCYB..BP1.R.SAC")
SeisData with 1 channels (1 shown)
    ID: BP.SCYB..BP1                       
  NAME:                                    
   LOC: 36.0094 N, -120.537 E, 695.0 m     
    FS: 20.0                               
  GAIN: 7.13026e10                         
  RESP: a0 1.0, f0 1.0, 0z, 0p             
 UNITS:                                    
   SRC: 2004.001.00.48.59.084.BP.SCYB..BP… 
  MISC: 0 entries                          
 NOTES: 1 entries                          
     T: 2004-01-01T00:48:59.085 (0 gaps)   
     X: -5.628e+03                         
        -5.624e+03                         
            ...                            
        -5.404e+03                         
        (nx = 1209785)                     
     C: 0 open, 0 total
Snew[1].t
2×2 Array{Int64,2}:
       1  1072918139084600
 1209785                 0

The snippet string(u2d(ts*μs)) in the fill_sac function (line 47 of SAC.jl) rounds to the nearest millisecond. The regular expression then pulls the rounded millisecond value. In the above case, this leads to a 1 millisecond offset.

Getting the time using string leads to a problem when the channel (or a gap) starts at a time with less than 3 significant millisecond digits (e.g. 100 or 250 milliseconds). Here is an example where I change the start time to 48:59.1 :

S[1].t[1,2] = 1072918139100000 #round to nearest time on sampling rate (0.1s)
ts = S[1].t[1,2]
μs = 1e-6
split(string(u2d(ts*μs)), r"[\.\:T\-]") # from fill_sac
7-element Array{SubString{String},1}:
 "2004"
 "01"  
 "01"  
 "00"  
 "48"  
 "59"  
 "1" 

Note that the 7th element in the Array is 1 instead of 100.

If I now write the S with the starttime changed to 00:48:59.1, then read

Snew2 = read_data("sac","2004.001.00.48.59.001.BP.SCYB..BP1.R.SAC")
SeisData with 1 channels (1 shown)
    ID: BP.SCYB..BP1                       
  NAME:                                    
   LOC: 36.0094 N, -120.537 E, 695.0 m     
    FS: 20.0                               
  GAIN: 7.13026e10                         
  RESP: a0 1.0, f0 1.0, 0z, 0p             
 UNITS:                                    
   SRC: 2004.001.00.48.59.001.BP.SCYB..BP… 
  MISC: 0 entries                          
 NOTES: 1 entries                          
     T: 2004-01-01T00:48:59.001 (0 gaps)   
     X: -5.628e+03                         
        -5.624e+03                         
            ...                            
        -5.404e+03                         
        (nx = 1209785)                     
     C: 0 open, 0 total

The new file starts at 00:48:59.001. I think the only change needed here is getting the floor of the milliseconds in the start time with something in fill_sac similar to tt[7] = Millisecond(u2d(floor(ts/1000)/1000)).value after line 49 in SAC.jl.

jpjones76 commented 4 years ago

Part of this might be a bug, but it isn't caused by rounding to the nearest millisecond.

The limit of SAC time accuracy is milliseconds. If I switch from round to floor, you'll have the opposite problem with a time remainder in the range 0.5 ms <= t < 1 ms. With round the start time precision is +/-0.5 ms, which is the hard limit of the file format. If I switch to div or floor the start time precision becomes +/-(1-eps(Float32)) ms, or about a factor of two worse.

I'll check into why writesac is introducing a drift with that particular file. My tests check for that and don't encounter that problem.

However, rounding to the nearest ms absolutely isn't a mistake.

jpjones76 commented 4 years ago

Should be fixed on master now. I think the cause was how string(DateTime(t)) has changed since Julia v0.4; my SAC and SEGY functions predate SeisIO by months, so sometimes we find blobs of "legacy" syntax. I'm pretty sure that Julia had versions in which string(DateTime(t)) always returned a three-digit ms substring; evidently, in Julia >= 1.0, it doesn't even print an ms value of 0.

Dates has more breaking inconsistencies than this, by the way. Some (e.g. time zones) are much worse.