simpeg / aurora

software for processing natural source electromagnetic data
MIT License

Support for FC Level in MTH5 #278

Open kkappler opened 1 year ago

kkappler commented 1 year ago

As of July 2023, there is a branch of mth5 that supports archiving spectrogram data, i.e. the so-called Fourier coefficients (FCs), also known as Short-Time Fourier Transform (STFT) data.

Testing of these archiving and retrieval capabilities is underway, but for this to be practically useful, we must also be able to generate transfer functions (TFs) from the FCs.

The FC layer is currently not required by aurora, but it may be in the future. It is expected to provide the following advantages:

Here are some considerations in using the FC levels:

  1. How can we ensure that the stored FCs are consistent with what the user has requested in their processing config? Methods have been added to TF Kernel to check that the "FC recipe" from the processing config is reflected by the stored FCs.
  2. How can we ensure that stored FCs from various runs at various stations are compatible? In general this cannot be guaranteed. Using a standardized set of windowing parameters will help. If the stored FCs are not compatible with what the processing config requests, then new FCs must be generated.
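
The consistency check in item 1 can be pictured with a minimal sketch. This is not the TF Kernel implementation: plain dicts stand in for the processing config and for the stored FC metadata, and the key names in RECIPE_KEYS as well as both function names are hypothetical.

```python
# Hypothetical illustration: plain dicts stand in for the processing-config
# decimation level and for the metadata stored alongside the FCs.
# The key names below are assumptions, not the mt_metadata attribute names.
RECIPE_KEYS = ("window_type", "window_length", "window_overlap", "sample_rate")


def recipe_mismatches(requested, stored, keys=RECIPE_KEYS):
    """Return {key: (requested_value, stored_value)} for every key that differs."""
    mismatches = {}
    for key in keys:
        if requested.get(key) != stored.get(key):
            mismatches[key] = (requested.get(key), stored.get(key))
    return mismatches


def stored_fcs_are_usable(requested, stored, keys=RECIPE_KEYS):
    """Stored FCs are usable only if every recipe parameter matches."""
    return not recipe_mismatches(requested, stored, keys)


requested = {
    "window_type": "hamming",
    "window_length": 128,
    "window_overlap": 32,
    "sample_rate": 1.0,
}
stored_ok = dict(requested)
stored_bad = dict(requested, window_length=256)
```

With these inputs, stored_ok passes the check, while stored_bad reports a mismatch on window_length, which is the case where new FCs would have to be generated.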

In general, a lot can go wrong if assumptions are made about the stored FCs. These concerns never arose with direct processing, because one config dictated everything about the process moving forward.

To avoid these concerns, what if we keep the existing processing class to drive the TF generation, but enable checks for existing FCs and use these when available, or when a flag like use_existing_fc is set to True? This would have no impact on the existing TF computation; it would only change the source of the input data from "compute-on-the-fly" to the stored FCs.

To implement a strategy like this, the following will need to be taken into consideration:

The key change in logic is that the loop that creates the FCs will first check whether they already exist. If they do, no operations are needed on the TimeSeries: the loop simply bypasses the calculation and loads the stored FCs in place.

Consider the cases where the user wants to:
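
A minimal sketch of that loop, under loud assumptions: get_fcs, fc_store, load_time_series and compute_fcs are hypothetical names, a plain dict stands in for the FC archive in the mth5 file, and each decimation level is reduced to a single window length (decimation itself is omitted). It is not the process_mth5 code.

```python
def get_fcs(run_id, window_lengths, load_time_series, compute_fcs, fc_store,
            use_existing_fc=True):
    """Return {decimation_level: FCs}, reusing stored FCs when available.

    fc_store is a dict standing in for the FC archive (hypothetical).
    """
    fcs = {}
    time_series = None  # loaded lazily, only if some level must be computed
    for i_dec_level, window_length in enumerate(window_lengths):
        # The window length is part of the key, so a changed recipe misses
        # the store and triggers a fresh computation.
        key = (run_id, i_dec_level, window_length)
        if use_existing_fc and key in fc_store:
            fcs[i_dec_level] = fc_store[key]  # bypass the calculation
            continue
        if time_series is None:
            time_series = load_time_series()
        fcs[i_dec_level] = compute_fcs(time_series, window_length)
        fc_store[key] = fcs[i_dec_level]
    return fcs


# Usage with toy stand-ins.
load_calls = []


def load_time_series():
    load_calls.append(1)  # record every access to the time series
    return list(range(16))


def toy_compute_fcs(time_series, window_length):
    """Placeholder for an STFT: one sum per non-overlapping window."""
    return [sum(time_series[i:i + window_length])
            for i in range(0, len(time_series) - window_length + 1, window_length)]


fc_store = {}
first = get_fcs("001", [8, 4], load_time_series, toy_compute_fcs, fc_store)
second = get_fcs("001", [8, 4], load_time_series, toy_compute_fcs, fc_store)
```

The second call never touches the time series, because every level is found in the store. Passing use_existing_fc=False would force recomputation and leave the rest of the flow unchanged.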

kkappler commented 1 year ago

Technical note: Inside process_mth5, when adding FCs, we need to call fc_decimation_level = fc_group.add_decimation_level(f"{i_dec_level}"). However, the fc_decimation_level returned by this call is initialized with default values, as prescribed by the FC standards in mt_metadata. It therefore does not contain the specific STFT parameters from the processing config, which is the recipe that was used to make the stft_obj.

A preferable way to initialize is: fc_decimation_level = fc_group.add_decimation_level(f"{i_dec_level}", decimation_level_metadata=dec_level_config)

That way, fc_decimation_level will get its attributes from the config that was used to make the STFTs. This does not work out of the box, however, because dec_level_config and fc_decimation_level are different data structures. It seems there is little to lose (except time) and much to gain by making dec_level_config and fc_decimation_level as close to one another as possible.

If we set the kwarg decimation_level_metadata=dec_level_config, an AttributeError is encountered. This occurs around line 296 of mth5.groups.base, at the statement return_obj.metadata = group_metadata, where group_metadata is the dec_level_config.

A solution would be a reformatter that populates an FCDecimation metadata object with the info from the dec_level_config.

One way to do this would be to initialize a dummy layer to get the skeleton, and then fill it in. The function could be called fc_decimation_level_from_processing_config_decimation_level().
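
A minimal sketch of such a reformatter follows. The two dataclasses are hypothetical stand-ins, not the real mt_metadata or aurora classes, and their attribute names are assumptions chosen for illustration. Only the function name comes from the note above.

```python
from dataclasses import dataclass, fields


@dataclass
class DecimationLevelConfig:
    """Hypothetical stand-in for the processing-config decimation level."""
    decimation_factor: float = 1.0
    window_type: str = "boxcar"
    num_samples_window: int = 128
    overlap: int = 32
    estimator: str = "RME"  # a TF-estimation setting with no FC counterpart


@dataclass
class FCDecimation:
    """Hypothetical stand-in for the FC decimation-level metadata."""
    decimation_factor: float = 1.0
    window_type: str = "hamming"
    num_samples_window: int = 256
    overlap: int = 64
    id: str = "0"


def fc_decimation_level_from_processing_config_decimation_level(
        dec_level_config, level_id):
    """Build FC metadata whose STFT parameters mirror the processing config."""
    fc_decimation = FCDecimation()  # dummy object with defaults: the skeleton
    fc_decimation.id = level_id
    source_names = {f.name for f in fields(dec_level_config)}
    # Copy only the attributes the two structures share; config-only
    # attributes (here: estimator) are deliberately left out.
    for f in fields(fc_decimation):
        if f.name in source_names:
            setattr(fc_decimation, f.name, getattr(dec_level_config, f.name))
    return fc_decimation


dec_level_config = DecimationLevelConfig(
    decimation_factor=4.0, window_type="hamming",
    num_samples_window=64, overlap=16)
fc_decimation = fc_decimation_level_from_processing_config_decimation_level(
    dec_level_config, "1")
```

Copying only the shared attributes keeps the FC metadata valid even where the two structures diverge. The closer dec_level_config and the FC decimation metadata become, the less this function has to skip.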