PeterMakus / SeisMIC

SeisMIC is a python software suite to monitor velocity changes using ambient seismic noise.
https://petermakus.github.io/SeisMIC/
European Union Public License 1.2

Question about data gaps in computing correlations #71

Closed imme-w closed 1 day ago

imme-w commented 5 days ago

Hi Peter,

I'm trying to compute correlations for the year 2021. In my params file I therefore have read_start: '2021-01-01 00:00:00.0' and read_end: '2022-01-01 00:00:00.0'. The only problem is that one of my stations is missing some days (172 - 175), see below: [image] If I try to compute my correlations, it stops after a while and I get the error message: [image] which makes sense, because day 172 doesn't exist. Is there any way I can continue the correlation computation for 2021 without the missing days?

Thanks a lot!

Imme
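
For readers who want to confirm which days are absent before starting a run, here is a minimal sketch. It assumes the waveform files follow the SDS naming scheme (NET.STA.LOC.CHA.D.YEAR.DOY); the helper name `missing_days` and the example station code are hypothetical and not part of SeisMIC.

```python
import re
from datetime import date

# Assumption: SDS-style file names ending in ".YEAR.DOY",
# e.g. "XX.STA1..HHZ.D.2021.171".
_SDS_NAME = re.compile(r"\.(\d{4})\.(\d{3})$")


def missing_days(filenames, year):
    """Return the sorted day-of-year numbers in `year` that have no file."""
    present = set()
    for name in filenames:
        match = _SDS_NAME.search(name)
        if match and int(match.group(1)) == year:
            present.add(int(match.group(2)))
    # 365 or 366, depending on whether `year` is a leap year
    n_days = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    return [doy for doy in range(1, n_days + 1) if doy not in present]
```

Passing the file names of one channel directory (for example from `os.listdir`) together with the year returns the gaps as a list of day-of-year numbers.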

PeterMakus commented 3 days ago

Hi Imme, which version of SeisMIC are you using? Is it the one from the dev branch? The standard behaviour is that it just skips the missing files, but maybe this was caused by a recent change that I haven't caught yet. Peter

PeterMakus commented 3 days ago

Hi again, I think I answered the question myself. I suppose this was the dev branch? If so, I understand why this error occurred, and it's fixed now (just pull again from GitHub and it should work). Let me know if there are any further issues or if that didn't fix it.

Peter

imme-w commented 3 days ago

Hi Peter, thanks for your answer! However, I am not using the dev branch at the moment; sorry for not mentioning that in my question. I'm not exactly sure which version from the non-dev branch I am using, but I installed it on April 22.

PeterMakus commented 3 days ago

Ok. That's odd. I tried to reproduce the issue by deleting a file in the middle of the tutorial data, but it works as expected and just skips the missing day. Are you just executing Correlator.pxcorr()? Could you send me your params.yaml (specifically, the co (correlation) part of the parameters)?

PeterMakus commented 3 days ago

Also interesting: which version of obspy do you have installed? The error is actually thrown by an obspy function.
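
A quick way to report the installed versions is sketched below. The helper name `installed_version` is hypothetical, and the distribution name "seismic" for SeisMIC is an assumption; adjust it if the package was installed under a different name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "seismic" as the distribution name of SeisMIC is an assumption.
for name in ("obspy", "seismic"):
    print(name, installed_version(name))
```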

imme-w commented 3 days ago

Hi Peter, thanks for thinking along with me! I just tried the tutorial with some deleted data as well, and indeed it works. So the problem is in my setup and not in SeisMIC. Sorry for the confusion. Here is the co part of my params file; do you see anything unusual?

co:
    # subdirectory of 'proj_dir' to store correlation
    # type: string
  subdir: 'corr_test'
    # time sequences to read for clipping or muting on stream basis
    # These should be long enough for the reference (e.g. the standard
    # deviation) to be rather independent of the parts to remove
    # type: string
  read_start: '2021-06-01 00:00:00.0'
  read_end: '2022-06-30 00:00:00.0'
    # type: float [seconds]
    # The longer the faster, but higher RAM usage.
    # Note that this is also the length of the correlation batches
    # that will be written (i.e., length that will be 
    # kept in memory before writing to disk)
    # If you are unsure, keep defaults
  read_len: 86398     #23.999 hour 86398
  read_inc: 86400     #24 hours
    # Neither read_len nor read_inc determines the correlation length.

    # New sampling rate in Hz. Note that it will try to decimate
    # if possible (i.e., there is an integer factor from the
    # native sampling_rate)
  sampling_rate: 125
    # Remove the instrument response, will take substantially more time
  remove_response: false

    # Method to combine different traces
  combination_method: 'betweenStations'    # betweenStations, betweenComponents, autoComponents, allSimpleCombinations, allCombinations
    # If you want only specific combinations to be computed enter them here
    # In the form [Net0-Net1.Stat0-Stat1] or [Net0-Net1.Stat0-Stat1.Cha0-Cha1]  --> nothing about location :(
    # This option will only be considered if combination_method == 'betweenStations'
    # Comment or set == None if not in use
  xcombinations: None

    # preprocessing of the original length time series
    # these functions work on an obspy.Stream object given as first argument
    # and return an obspy.Stream object.
  preProcessing: [{'function': 'seismic.correlate.preprocessing_stream.stream_filter',
      'args': {'ftype': 'bandpass', 'filter_option': {'freqmin': 0.01, 'freqmax': 124.99}}}]
    # subdivision of the read sequences for correlation
    # if this is set the stream processing will happen on the hourly subdivision. This has the
    # advantage that data that already exists will not need to be preprocessed again
    # On the other hand, computing a whole new database might be slower
    # Recommended to be set to True if:
    # a) You update your database and a lot of the data is already available (up to an order of magnitude faster)
    # b) corr_len is close to read_len
    # Is automatically set to False if you are computing a completely new db
  preprocess_subdiv: true
    # type: presence of this key
  subdivision:
        # type: float [seconds]
    corr_inc: 3600
    corr_len: 3600
        # recombine these subdivisions
        # unused at the time
        # type: boolean
    recombine_subdivision: true
        # delete
        # type: boolean
    delete_subdivision: false

    # parameters for correlation preprocessing
    # Standard functions reside in seismic.correlate.preprocessing_td and preprocessing_fd, respectively
  corr_args: {'TDpreProcessing': [{'function': 'seismic.correlate.preprocessing_td.detrend',
        'args': {'type': 'linear'}}, {'function': 'seismic.correlate.preprocessing_td.demean',
        'args': {}}, {'function': 'seismic.correlate.preprocessing_td.clip', 'args': {
          'std_factor': 4}}], 'FDpreProcessing': [{'function': 'seismic.correlate.preprocessing_fd.spectralWhitening',
        'args': {'joint_norm': false}}, {'function': 'seismic.correlate.preprocessing_fd.FDfilter',
        'args': {'flimit': [0.5, 1, 50, 60]}}], 'lengthToSave': 25, 'center_correlation': true,
    'normalize_correlation': true, 'combinations': []}

imme-w commented 3 days ago

Hi Peter, I think I found the source of the problem (not that I completely understand it). It seems to have something to do with the softlinks I am using (for the different location combinations): if I do the same run with the actual data directory instead of the directory with the softlinks, it works!
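
One thing that may be worth ruling out in a setup like this is dangling links, i.e. softlinks whose target no longer exists. The thread does not establish that this was the cause, so the following is only a minimal sketch under that assumption; the helper name `find_broken_symlinks` is hypothetical.

```python
import os


def find_broken_symlinks(root):
    """Return a sorted list of symlinks under `root` whose target is missing."""
    broken = []
    # followlinks=False: linked directories are listed but not descended into
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() inspects the link itself; exists() follows it to the target
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```

Running it on the directory that holds the softlinks returns an empty list when every link resolves, in which case the problem lies elsewhere.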

PeterMakus commented 1 day ago

Alright. This seems to be an obspy-related issue, then. If you have time, I would recommend filing a bug report in the obspy repository.

The bug is located in the following function: lclient.get_waveforms