Closed claudiodsf closed 3 months ago
Claudio,
I have an option to write the processed traces and the spectra (as "auxiliary" data) to an ASDF file (which is based on HDF5). This will need to be revised, but I should not otherwise be affected. I will test this tomorrow or beginning of next week.
Actually, it would be nice if obspy itself would support writing to HDF5, including the header information. None of the currently supported formats support preserving the metadata.
I have an option to write the processed traces and the spectra (as "auxiliary" data) to an ASDF file (which is based on HDF5). This will need to be revised, but I should not otherwise be affected. I will test this tomorrow or beginning of next week.
Thanks Kris!
Actually, it would be nice if obspy itself would support writing to HDF5, including the header information. None of the currently supported formats support preserving the metadata.
ObsPy Stream().write()
and Trace().write()
have a PICKLE
option which should preserve everything. But that comes with all the limitations of pickle
, namely dependency on the library versions.
A more portable HDF5
format (lighter than ASDF and built-in) would be definitively nice. Maybe you can take some inspiration from this work and propose a PR to ObsPy 😉?
After a clean switch to the new_spectrum_class
branch, I run into this error:
File "C:\Users\kris\Documents\Python\cloned_repos\sourcespec\sourcespec\ssp_build_spectra.py", line 740, in build_spectra
weight_st = _build_weight_spectral_stream(config, spec_st, specnoise_st)
File "C:\Users\kris\Documents\Python\cloned_repos\sourcespec\sourcespec\ssp_build_spectra.py", line 520, in _build_weight_spectral_stream
weight = _build_weight_from_inv_frequency(spec_h)
File "C:\Users\kris\Documents\Python\cloned_repos\sourcespec\sourcespec\ssp_build_spectra.py", line 443, in _build_weight_from_inv_frequency
snr_fmin = getattr(spec.stats, 'spectral_snratio_fmin', None)
File "C:\Users\kris\Documents\Python\cloned_repos\sourcespec\sourcespec\spectrum.py", line 54, in __getattr__
return self[name]
KeyError: 'spectral_snratio_fmin'
This is very strange, as getattr
should not throw an error if a default value (in this case None
) is provided...
No hurry, as my weekend starts in a few minutes!
No hurry, as my weekend starts in a few minutes!
Will look into that on Tuesday!
Happy Easter! 🐣
I investigated this problem a bit further. It happens for traces with no or too short noise window:
GR.GRA1..BH: No available noise window: a uniform weight will be applied
In that case, the spectrum header fields spectral_snratio_fmin
and spectral_snratio_fmax
are not set in the _check_spectral_sn_ratio
function, and due to the way the __getattr__
method of AttribDict
is implemented, getattr
fails even if a default value is specified. The AttribDict
in obspy does work with a default value.
The solution is to either set spectral_snratio_fmin
and spectral_snratio_fmax
also if there is no noise window, or perhaps revert to using obspy's AttribDict
.
Happy Easter to you as well!
A suggestion for writing to HDF5:
I investigated this problem a bit further. It happens for traces with no or too short noise window:
GR.GRA1..BH: No available noise window: a uniform weight will be applied
In that case, the spectrum header fields
spectral_snratio_fmin
andspectral_snratio_fmax
are not set in the_check_spectral_sn_ratio
function, and due to the way the__getattr__
method ofAttribDict
is implemented,getattr
fails even if a default value is specified. TheAttribDict
in obspy does work with a default value.The solution is to either set
spectral_snratio_fmin
andspectral_snratio_fmax
also if there is no noise window, or perhaps revert to using obspy'sAttribDict
.Happy Easter to you as well!
Hello, this should be fixed with the latest commit.
I prefer to use my implementation of AttributeDict
because I would like the Spectrum()
class to be independent of ObsPy (except for the Spectrum().from_obspy_trace()
method, which lazy-imports obspy.core.trace.Trace
).
Maybe, in the future, I'll use this AttributeDict
class all along the code 😉
A suggestion for writing to HDF5:
- maybe all spectra could be grouped in the group 'Spectra'
You mean something like:
HDF5_file
└──Spectra
├──spectrum_0000
├──spectrum_0001
?
What would be the advantage to have this top-level group? Would there be any other group than Spectra
in the spectrum HDF5 file?
- maybe name the spectra by nslc-code (network-station-location-channel) instead of just numbering them ?
The numbering is meant to reflect the order in the SpectrumStream
object, which behaves like a list: Spectrum
objects are accessed using an index.
In your example, what should happen if two spectra have the same nslc-code?
Force-pushed a better solution. Please verify again 😉 🙏
Claudio,
The problem with the getattr
call in the _check_spectral_sn_ratio
function is solved now.
However, there is another problem. Actually, the traces that triggered the first problem because there was no noise window, do have a noise window. When I run them with the main branch and a fixed window length, the trace plot looks like this:
When I switch to the new_spectral_class
branch without any modification in my script, the trace plot is very different:
Apparently, something goes wrong with the window definitions.
Concerning the suggestions for the HDF5 format:
These are just suggestions, based on my experience with ASDF files. It's of course your decision, but it could promote this HDF5 format as a more general format for seismic data...
That's nasty! The time window has totally changed!
Let's try something first: I'll rebase this branch on main
so that it will include the latest modificaitons on time window.
Rebased and force-pushed.
Could you please force-pull and try again?
The problem is gone after force-pushing. I still get empty noise windows for some stations, but that was also the case with the main branch, so nothing to do with the new spectrum class. I will investigate this later.
These are just suggestions, based on my experience with ASDF files. It's of course your decision, but it could promote this HDF5 format as a more general format for seismic data...
I'm convinced! 👍
See the last commit, where I implemented your idea, with the only difference of using spectrum_NNNNN_NET.STA.LOC.CHAN
for keys 😉
@krisvanneste, ok for you to merge?
Claudio,
I still have a question. I ran my test case again with the option to save the spectra. I then read the saved spectra and compared them to those I collect in my wrapper function (and which come directly from the build_spectra
function. Strangely, they are not the same...
This is the spectrum for station GR.GRA1 written to the HDF5 file:
Spectrum GR.GRA1..BHH | 225 samples, 0.23-5.00 Hz | 0.02 Hz sample interval | 68 samples logspaced, 0.23-5.12 Hz | 0.02 log10([Hz]) sample interval logspaced
This is the one I collect in my wrapper function:
Spectrum GR.GRA1..BHH | 225 samples, 0.23-5.00 Hz | 0.02 Hz sample interval | 68 samples logspaced, 0.23-5.12 Hz | 0.02 log10([Hz]) sample interval logspaced
And this is what it looks like in the sourcespec plot:
I have the impression that the saved spectra have not been scaled yet. Is this the intention?
Very strange!
Spectra are written after all the manipulations and after the inversion (see here).
Looking at your figure, I have the impression that you read the .residuals.hdf5
file (and not the .spectra.hdf5
file).
Is that possible?
Claudio,
Yes, sorry, a question I forgot to ask was: why is the output file called .residuals.hdf5
?
That explains it then. However, I do not see a .spectra.hdf5
file in the output folder...
OK, got it sorted now. I had to add a line to save the spectra in my wrapper function. I see the HDF5 file contains both the actual spectra and the fitted spectra.
Sorry for late reply.
Yes, there is a new config option called save_spectra
😉
Yes, but I also had to call the save_spectra function in my wrapper.
So, good to merge? 🙃
Yes, go ahead!
Merged!!! 🚀
Thanks Kris for the review and the suggestions 🙏
For an ongoing project, I need to save spectra to the output directory.
This motivated me to tackle the technical debt which was accumulating in the
Spectrum()
class.This is the first class I wrote 12 years ago (!) and at that time I decided that a quick and dirty –yet functional– solution was to make it inherit from ObsPy
Trace()
object. One of the limits of this approach was that there wasn't any easy option (except of usingpickle
) to save spectra to disk.So, I rewrote the
Spectrum()
class from scratch, while trying to keep the same interface of the previous one, with one difference:Spectrum().get_freq()
does not exist anymore: it has been replaced by the class attributeSpectrum().freq
. I chose indeed to store all the frequencies, and not only the starting frequency and the frequency delta. In this way, we have the same interface forSpectrum().freq
andSpectrum().freq_logspaced
.The main improvements are the following:
The new
Spectrum()
class, of course 😄A new
SpectrumStream()
class, similar to ObsPyStream()
, to storeSpectrum()
objects. (Note that ObsPyStream()
will no longer acceptSpectrum()
objects, since they do not inherit anymore fromTrace()
).Both
Spectrum()
andSpectrumStream()
have a.write()
method to save the spectra to disk inHDF5
orTEXT
format. Both formats use YAML for serializing the metadata inSpectrum().stats
.The new
HDF5
format is used for storing spectral residuals (replacing thepickle
format).A new config option
save_spectra
(default:False
) is introduced, to save spectra to the output dir.Finally, I added the function
read_spectra()
(similar to ObsPyread()
) which will read anHDF5
orTEXT
file and return aSpectrumStream()
with one or moreSpectrum()
objects.I thoroughly tested these new classes with my default examples and with many other events: the results are consistent, with some small (~1%) differences arising sometimes, probably because of the different way frequencies are stored.
However, before merging this PR, I'm calling @krisvanneste and @rcabdia to kindly check that the new code does not introduce any regression in your use cases, since you are both using the SourceSpec API.
Thanks! ☺️ Claudio