simpeg / aurora

software for processing natural source electromagnetic data
MIT License

Coordinate Mismatches (Yellowstone) #228

Open kkappler opened 1 year ago

kkappler commented 1 year ago

When merging data from the local and remote stations, the underlying coordinates seem to differ slightly in the case of stations WYYS3 and MTC20. The error occurred specifically when merging the reference station data into the local station data:


merged_xr = X.merge(Y, join="exact")
merged_xr = merged_xr.merge(RR, join="exact")


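Workaround:

merged_xr = X.merge(Y, join="exact")
try:
    merged_xr = merged_xr.merge(RR, join="exact")
except ValueError:
    # Time axes do not match exactly: keep the local coordinates and
    # copy the remote channel values in by position.
    print("Coordinate alignment mismatch")
    merged_xr = merged_xr.merge(RR, join="left")
    for ch in list(RR.keys()):
        merged_xr[ch].values = RR[ch].values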

I could not detect the difference in the coordinates, but something must have been different. The error being thrown was:

ValueError: cannot align objects with join='exact' where index/labels/sizes are not equal along these coordinates (dimensions): 'observation' ('observation',), 'frequency' ('observation',), 'time' ('observation',)
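
A minimal sketch of how this failure and the workaround behave, assuming a plain `time` dimension rather than the `observation` multi-index in the real data. The dataset contents, channel names (`ex`, `rx`) and timestamps are made up for illustration and are not taken from WYYS3 or MTC20:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-ins for the local (X) and remote (RR) datasets.
time = np.array(
    ["2020-01-01T00:00:00", "2020-01-01T00:00:01", "2020-01-01T00:00:02"],
    dtype="datetime64[ns]",
)
X = xr.Dataset({"ex": ("time", np.array([1.0, 2.0, 3.0]))}, coords={"time": time})
# Same samples, but every remote timestamp is late by one nanosecond.
RR = xr.Dataset(
    {"rx": ("time", np.array([10.0, 20.0, 30.0]))},
    coords={"time": time + np.timedelta64(1, "ns")},
)

exact_failed = False
try:
    merged = X.merge(RR, join="exact")
except ValueError:
    exact_failed = True
    # join="left" keeps the local time axis; the remote channel is
    # reindexed onto it and comes back as NaN because no labels match.
    merged = X.merge(RR, join="left")
    # Overwrite the NaNs with the remote samples, matched by position.
    for ch in list(RR.keys()):
        merged[ch].values = RR[ch].values
```

Note that the positional copy is only meaningful when both datasets have the same number of samples in the same order, which is the implicit assumption behind the workaround.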
kkappler commented 1 year ago

This may actually have been due to using Single Station Processing when a Remote Reference station is specified in the TFKernelDataset.

To test this hypothesis, I modified the synthetic process_synthetic_rr12() so that the RME engine was called rather than RME_RR, but the error did not reproduce.

Ditto for

However, when I run the Yellowstone dataset (with RR station, with RME_RR engine) the error does not reproduce.

This is pretty weird.

Deeper inspection shows that while :


returns True


returns False

In fact, the timestamps differ by a nanosecond:

>>> RR.time[0].values

But then why would I not get the same error in RR mode??
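
A one-nanosecond offset like the one described above is easy to overlook, because the two time axes look identical at coarser precision. The following sketch, using made-up timestamps and only NumPy, shows one way to measure the offset directly:

```python
import numpy as np

# Hypothetical local and remote time axes; the remote is 1 ns late.
local = np.array(
    ["2020-01-01T00:00:00", "2020-01-01T00:00:01"], dtype="datetime64[ns]"
)
remote = local + np.timedelta64(1, "ns")

# Truncated to whole seconds, the two axes are indistinguishable ...
same_at_seconds = np.array_equal(
    local.astype("datetime64[s]"), remote.astype("datetime64[s]")
)
# ... but a full-precision comparison fails.
same_at_ns = np.array_equal(local, remote)

# Largest offset between corresponding samples, as integer nanoseconds.
offsets = (remote - local).astype("timedelta64[ns]").astype(np.int64)
max_offset_ns = int(np.abs(offsets).max())
```

Inspecting `max_offset_ns` rather than the formatted timestamps avoids relying on how many digits a given display happens to show.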

The confusion was coming from running in two different environments, and the answer seems to be:

The error occurs in py3.8 with xarray=='2022.6.0' but NOT in py3.7 with xarray=='0.16.2'

Interestingly, the h5 file is the same in both versions, which makes me wonder whether the new version of xarray is somehow "stricter" and won't tolerate a ns offset, whereas the older version would allow such a mismatch.

I am going to run the processing in py37 via debugger and check if there is a ns mismatch or not.

Aaaaand, the error is now showing up in py37. The only place where I do not get the error is in a Jupyter notebook that has been alive for a while, and when I restart the notebook kernel the error is there.

So, I cannot reproduce the successful runs (done on Friday 14 October) but I can reproduce the errors.

Alas, I will add the workaround from the start of this ticket, BUT this ticket is not closed.

What we need to do is handle non-simultaneity in general. This can be managed in TFKernel's validation methods, such as those mentioned in #103
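
One possible shape for such a validation step is sketched below. This is not aurora's TFKernel API: the function name `snap_remote_times` and the default tolerance of one microsecond are assumptions made purely for illustration. The idea is to accept a remote time axis that agrees with the local one to within a tolerance and to fail loudly otherwise:

```python
import numpy as np


def snap_remote_times(local_time, remote_time, tolerance=np.timedelta64(1, "us")):
    """Return a copy of local_time if remote_time agrees within tolerance.

    Hypothetical helper: raises ValueError when the axes differ in length
    or when any pair of timestamps is further apart than the tolerance.
    """
    local_time = np.asarray(local_time, dtype="datetime64[ns]")
    remote_time = np.asarray(remote_time, dtype="datetime64[ns]")
    if local_time.shape != remote_time.shape:
        raise ValueError("time axes have different lengths")
    offsets = np.abs(remote_time - local_time)
    if offsets.size and offsets.max() > tolerance:
        raise ValueError(f"time axes differ by up to {offsets.max()}")
    return local_time.copy()


# Usage with made-up timestamps: a 1 ns offset is within tolerance.
local = np.array(
    ["2020-01-01T00:00:00", "2020-01-01T00:00:01"], dtype="datetime64[ns]"
)
remote = local + np.timedelta64(1, "ns")
snapped = snap_remote_times(local, remote)
```

The returned array could then replace the remote dataset's time coordinate before an exact join. A suitable tolerance would presumably be tied to the sample rate, which this sketch does not attempt.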