In doing this there are several other tasks to address. Now that the TF class is behaving itself we should:
It made sense to address issue #134 at this point, so that is done as well.
@kujaku11: I made good progress on this but I will need a quick tour of mt_metadata/transfer_functions/io, in particular the file mt_metadata/mt_metadata/transfer_functions/io/zfiles/zmm.py.
We may need to blend in some code from the write_emtf_z_file method in aurora/transfer_function/transfer_function_collection.py.
Currently .zss and .zrr are not supported in mt_metadata.
Executing:

```python
tf_cls.write_tf_file("testzss.zss", file_type="zss")
```

or

```python
tf_cls.write_tf_file("testzss.zss", file_type="zmm")
```

yields:

```
KeyError: "Writer for the file type zss does not currently exist. Current writers ['edi', 'zmm', 'j', 'emtfxml', 'avg']"
```
Also, I noticed that when I tested, I got a complaint that the mth5 file was closed. I'm not sure why access to the mth5 should be needed for a populated TF to write to file; it is not needed for .xml output.
To test, I dropped a breakpoint in process_mth5.py, at the end of the process_mth5 method, right before tf_cls is returned.
I'll create an issue about the plugin keys in mt_metadata.
Revisiting this again. Current status: all files ({.zss, .zrr, .zmm}) are capable of being written; however, the files do not contain data. Attached is an example zrr file (saved with a .txt extension for uploadability): synthetic_test1.txt
I created a Python script from aurora/tutorials/synthetic_data_processing.ipynb and added a line to save the TF as .zrr. I put it in aurora/tests/io/synthetic_data_processing.py.
Following the logic, at line 214 of mt_metadata/transfer_functions/io/readwrite.py, the object file_writer is the following function: <function mt_metadata.transfer_functions.io.zfiles.zmm.write_zmm(tf_object, fn=None)>.
Stepping into that function lands us (as expected) in mt_metadata/transfer_functions/io/zfiles/zmm.py, in the write_zmm function, at line 1011.
A ZMM object is created from the TF dataset, and some conventions regarding channel ordering and nomenclature are handled in lines 1019-1031. Then, in lines 1032-1055, some wrangling of run information is done. It is not clear to me what is happening there; maybe add a comment about the purpose of that code block?
Finally, on line 1059, we have zmm_obj.write(fn).
Here are some things that should be sorted out:
ZMM has an attr self.decimation_dict = {}. A KeyError exception is handled when decimation_dict is referenced. In order to fix this, one would need to know the intended structure of decimation_dict. The attr appears to be populated in the method _read_period_block, around line 604:
```python
self.decimation_dict[f"{period:.10g}"] = {
    "level": level,
    "bands": bands,
    "npts": npts,
    "df": sr,
}
```
So it is keyed by a string representation of the period, and has info about that period's decimation level, band, number of points, and sample rate. When I inspect self.decimation_dict in the debugger, I get an empty {}. So this is not being populated correctly.
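If the writer is expected to fill this in, populating it might look something like the sketch below; the toy values stand in for the real per-level processing info, and none of these variable names come from the actual code:

```python
# Hypothetical sketch: build decimation_dict with the same structure
# that _read_period_block uses, keyed by a "%.10g" string of the period.
decimation_dict = {}
periods_per_level = [[0.1, 0.2], [0.8, 1.6]]  # assumed periods per decimation level
band_edges = [(1, 8), (1, 8)]                 # assumed frequency band indices
n_samples = [4096, 1024]                      # assumed number of data points
sample_rate = [1.0, 0.25]                     # assumed decimated sample rates

for level, periods in enumerate(periods_per_level):
    for period in periods:
        decimation_dict[f"{period:.10g}"] = {
            "level": level,
            "bands": band_edges[level],
            "npts": n_samples[level],
            "df": sample_rate[level],
        }
```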
The header produced is:

```
TRANSFER FUNCTIONS IN MEASUREMENT COORDINATES
***** WITH FULL ERROR COVARIANCE ****
test1
coordinate 0.000 0.000 declination 0.00
number of channels 5 number of frequencies 25
orientations and tilts of each channel
```
But the expected header is:

```
IMPEDANCE IN MEASUREMENT COORDINATES
** WITH FULL ERROR COVARINCE**
Robust Remote Reference
station :test2
coordinate 0.0 0.0 declination 0.0
number of channels 5 number of frequencies 25
orientations and tilts of each channel
 1  0.00  0.00 tes Hx
 2 90.00  0.00 tes Hy
 3  0.00  0.00 tes Ex
 4 90.00  0.00 tes Ey
 5  0.00  0.00 tes Hz
```
The attributes self.output_channels and self.input_channels, however, both yield empty lists, so they are not being set correctly. The input and output channels are taken from self.channels_recorded, which is also empty. This should not be the case.
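For reference, the usual MT convention for splitting channels_recorded into input and output channels is sketched below (this is the standard convention, not necessarily the exact ZMM logic):

```python
# Standard MT convention: horizontal magnetics are inputs,
# electrics plus vertical magnetic are outputs.
channels_recorded = ["ex", "ey", "hx", "hy", "hz"]
input_channels = [ch for ch in channels_recorded if ch in ("hx", "hy")]
output_channels = [ch for ch in channels_recorded if ch in ("ex", "ey", "hz")]
has_tipper = "hz" in channels_recorded  # tipper rows only if hz was recorded
print(input_channels, output_channels, has_tipper)
# ['hx', 'hy'] ['ex', 'ey', 'hz'] True
```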
The problem may not lie in zmm.py, since tf_object.station_metadata.channels_recorded is also an empty list.
The ZMM init code (around lines 1015-1017) is:

```python
zmm_obj = ZMM()
zmm_obj.dataset = tf_object.dataset
zmm_obj.station_metadata = tf_object.station_metadata
```
And at that point, we are supposed to be able to tell whether or not there is a tipper (as in the channel-split sketch above), i.e. channels_recorded should be defined.
So to debug, we can try getting tf_object.station_metadata.channels_recorded set to its proper values and see if that helps. One cannot hack around such a test, however, by setting

```python
tf_object.station_metadata.channels_recorded = ["ex", "ey", "hx", "hy", "hz"]
```

because tf_object.station_metadata is an object of type mt_metadata.transfer_functions.tf.station.Station, which does not support explicit assignment. This is fine, and probably for the best, because it will force me to make sure that the TF object has the proper assignments first.
So, ...
After all that, it looks like this issue is blocked by the following: mt_metadata.transfer_functions.tf.station.Station does not have the correct channels_recorded information.
This means that the parent data structure, which is a mt_metadata.transfer_functions.core.TF, was not properly assigned its values (it is depraved!), and so we should look there before continuing on the details of the ZMM class.
The assignment of the mt_metadata.transfer_functions.core.TF object is done in aurora, specifically in the function export_tf in aurora/pipelines/process_mth5.py.
Chasing this down a little further: the export_tf() function takes a kwarg called station_metadata_dict. In the specific case I am debugging (synthetic test data), station_metadata_dict looks like this:
```python
{'station': OrderedDict([('acquired_by.name', None),
                         ('channels_recorded', ['ex', 'ey', 'hx', 'hy', 'hz']),
                         ('data_type', 'BBMT'),
                         ('geographic_name', None),
                         ('hdf5_reference', ''),
                         ('id', 'test1'),
                         ('location.declination.model', 'WMM'),
                         ('location.declination.value', 0.0),
                         ('location.elevation', 0.0),
                         ('location.latitude', 0.0),
                         ('location.longitude', 0.0),
                         ('mth5_type', 'Station'),
                         ('orientation.method', None),
                         ('orientation.reference_frame', 'geographic'),
                         ('provenance.creation_time', '1980-01-01T00:00:00+00:00'),
                         ('provenance.software.author', None),
                         ('provenance.software.name', None),
                         ('provenance.software.version', None),
                         ('provenance.submitter.email', None),
                         ('provenance.submitter.organization', None),
                         ('release_license', 'CC0-1.0'),
                         ('run_list', ['001']),
                         ('time_period.end', '1980-01-01T00:00:00+00:00'),
                         ('time_period.start', '1980-01-01T00:00:00+00:00')])}
```
Note that channels_recorded is a non-empty list, having the expected values.
So the issue is that the channel names, which are available, are not getting assigned, which points squarely at line 181:

```python
tf_cls.station_metadata.from_dict(station_metadata_dict)
```
That takes us into the assignment loop around line 645 of mt_metadata/base/metadata.py, in the class Base(); we are into the core! N.B. this loop successfully assigns all the string variables, such as ('id', 'test1'), so perhaps the issue is that the value in the key-value pair ('channels_recorded', ['ex', 'ey', 'hx', 'hy', 'hz']) is a list? No, that is not it ...
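A hedged sketch of the kind of assignment loop described above (not the actual mt_metadata Base code), with minimal stand-in objects, shows that a plain setattr loop handles list values fine, which points the finger at the property on Station rather than at the loop:

```python
# Stand-in objects for the demo; the real classes live in mt_metadata.
class Location:
    pass

class Station:
    def __init__(self):
        self.location = Location()

def from_dict(obj, meta_dict):
    """Assign attrs from a flattened dict; dotted keys recurse into sub-objects."""
    for key, value in meta_dict.items():
        if "." in key:
            parent, _, rest = key.partition(".")
            from_dict(getattr(obj, parent), {rest: value})
        else:
            setattr(obj, key, value)  # a list value passes through setattr too

station = Station()
from_dict(
    station,
    {"id": "test1", "channels_recorded": ["ex", "ey"], "location.latitude": 0.0},
)
print(station.id, station.channels_recorded, station.location.latitude)
# test1 ['ex', 'ey'] 0.0
```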
Debugging further from here would benefit from some help from @kujaku11. Essentially, I land in mt_metadata/transfer_functions/tf/station.py at the setter:

```python
@channels_recorded.setter
def channels_recorded(self, value):
    if not isinstance(value, (list, tuple)):
        raise ValueError("Input channels must be a list")
    self._recorded_channels = value
```

I cannot seem to assign any value at all to self._recorded_channels, and the getter method above this for channels_recorded does not seem to care what gets set here anyhow; it loops over runs and rebuilds the _recorded_channels variable at each call.
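To illustrate why assignment appears to have no effect, here is a minimal, runnable sketch of a derived getter that rebuilds its value from runs on every access (assumed behavior based on the reading above, not the actual Station code):

```python
class StationSketch:
    """Stand-in for mt_metadata's Station, showing the getter/setter mismatch."""

    def __init__(self):
        self.runs = []  # no runs attached in this demo
        self._recorded_channels = []

    @property
    def channels_recorded(self):
        # Rebuilt from runs on every access; _recorded_channels is ignored.
        channels = []
        for run in self.runs:
            channels.extend(run)  # each run assumed to be a list of channel names
        return sorted(set(channels))

    @channels_recorded.setter
    def channels_recorded(self, value):
        if not isinstance(value, (list, tuple)):
            raise ValueError("Input channels must be a list")
        self._recorded_channels = value  # stored, but never read back

station = StationSketch()
station.channels_recorded = ["ex", "ey", "hx", "hy", "hz"]
print(station.channels_recorded)  # [] because station.runs is empty
```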
Two questions then; one concerns super().__setattr__(name, value), which seems to have no effect.

After a debugging session on 26 May, the zmm writer seems to be working on the fcs branch of mt_metadata. Now working on the fc branch of mth5.

The test is in aurora/tests/io/test_issue_139.py.
Before closing:
Information about decimation levels is lost in aurora's _merge_decimation_levels method in transfer_function_collection.py.
The process_mth5 flow operates over each decimation level iteratively, packing a variant of Egbert's TTFZ transfer function class into a dictionary, keyed by decimation level:

```
tf_dict[0] --> TTFZ with decimation level 0 info
tf_dict[1] --> TTFZ with decimation level 1 info
...
tf_dict[N] --> TTFZ with decimation level N info
```
The tf_dict is passed to TransferFunctionCollection, a convenience class in aurora with methods for merging the dict and exporting.
I tried to pack the merged xarray with decimation level info, but the final data product is sparsely populated because most frequencies exist at only one decimation level (see the runnable illustration after the pasted code below). I don't think we want this sparse, bloated xarray floating around.
For posterity, the code that I tried to use to make the multi-indexed data structure in transfer_function_collection's _merge_decimation_levels method is pasted below:
```python
tmp_df = [self.tf_dict[i].tf.to_dataframe("transfer_function") for i in range(n_dec)]
tmp_df = [
    pd.concat([tmp_df[i]], keys=[f"{i}"], names=["decimation_level"])
    for i in range(n_dec)
]
# To recover the xarray from tmp_df[i], do this:
# tmp_df[0].to_xarray().transfer_function.drop("decimation_level")
tmp_df = pd.concat(tmp_df)
# But if we cast this back to an xarray it is very sparse,
# having dimension (4, 3, 2, 25) in this case.
```
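A small runnable illustration of the sparsity problem (toy data, not the actual transfer function arrays): concatenating per-level dataframes under a new decimation_level key and converting back to xarray fills all the non-overlapping periods with NaN.

```python
import pandas as pd

# One dataframe per decimation level; the period sets do not overlap.
level_0 = pd.DataFrame({"period": [0.1, 0.2], "tf": [1.0, 2.0]}).set_index("period")
level_1 = pd.DataFrame({"period": [1.0, 2.0], "tf": [3.0, 4.0]}).set_index("period")

merged = pd.concat(
    [
        pd.concat([df], keys=[str(i)], names=["decimation_level"])
        for i, df in enumerate([level_0, level_1])
    ]
)
sparse = merged.to_xarray()  # dims (decimation_level: 2, period: 4)
print(sparse.tf.values)      # half the entries are NaN
```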
The solution I went with was instead to pass `decimation_dict` to the TF object. This makes test_issue_139.py output a correctly formatted z-file from the TF, with the right metadata in each block.
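A hedged sketch of how the writer can then stamp each period block with its processing metadata; the dict structure follows what was shown earlier, but the attachment point on the TF and the print format below are illustrative, not the actual zmm.py code:

```python
# With decimation_dict passed along to the TF, each period block in the
# z-file can be filled with its level/bands/npts/df fields.
decimation_dict = {
    "0.1": {"level": 0, "bands": (1, 8), "npts": 4096, "df": 1.0},
}

period = 0.1
info = decimation_dict[f"{period:.10g}"]
print(
    f"period : {period:^12.5f} decimation level {info['level']:^4d} "
    f"freq. band from {info['bands'][0]:>4d} to {info['bands'][1]:>4d}"
)
print(f"number of data point {info['npts']:>6d} sampling freq. {info['df']:>8.3f} Hz")
```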
The fourier_coefficients branch is no longer calling write_emtf_z_file; instead it uses mt_metadata's TF class for writing.
The leading-zero scientific notation was added as a PR to mt_metadata (branch fortran_format).
Use the method in mt_metadata.tf instead.
- The code in mt_metadata was taken from transfer_function_collection, so the outputs should agree.
- The code shouldn't be in two places.
- The ability to write EMTF z-files has application even if the TF wasn't generated by the aurora pipeline.

Therefore the place to remove the code is in transfer_function_collection, and not in mt_metadata.
There are only 3 references to this method: