EIDA / userfeedback

This repository is meant to collect feedback from EIDA users by means of its Issue Tracker

AlpArray stations XML not matching the data #51

Closed PetrColinSky closed 4 years ago

PetrColinSky commented 4 years ago

Hi everybody, thanks for solving issue #49 about the missing XML. I got the files and they look good. I am working on the AA data and I have found many cases where the XML entries do not match the data. Sometimes it is not clear whether the problem is in the metadata files or in the data files. I have checked 32 days of data from 2016-08-19 to 2019-07-06. For each station I investigated what the issue is, so I do not attach the whole table of 32 days by 40 stations here; I can provide it on request. When I report an "end date", it means that data is available for later times but the XML does not list the epoch. It is not clear whether it is just a wrong end date itself or whether a whole epoch is missing. Sometimes the channels are labeled differently in the data and in the metadata. Sometimes these two issues are combined: the same channel labeling appears both in the data and in the metadata, but for different epochs. Here is the list of the problems:

A035A | XML end date 2018-06-13, but data available for later times
A104B | missing HHE channel in the xml
A118A | end date 2016-11-08, but data later
A140A | channel mismatch: xml ZNE, data Z23
A144A | channel mismatch: xml ZNE, data Z23; Z23 is also in xml, but with early end date of 2017-08-16
A145A | channel mismatch: xml ZNE, data Z23
A151A | end date 2018-03-23 but data later
A156A | end date 2016-12-14
A161A | end date 2017-05-16
A164A | xml: Z12, data ZNE; xml end date 2016-09-16
A165A | end date 2017-05-15
A167A | end date 2017-05-15
A168A | end date 2017-05-30
A170A | end date 2017-05-29
A173A | end date 2018-10-24
A175A | end date 2017-05-29
A177A | end date 2017-09-20
A179A | end date 2017-05-16
A180A | end date 2018-11-15
A181A | end date 2017-07-20
A182A | end date 2017-05-17
A183A | end date 2017-05-17
A186A | end date 2016-11-03
A192A | end date 2017-03-24
A194A | end date 2017-02-08
A201A | xml: Z12, data ZNE; xml end date 2019-01-09
A204A | xml: Z12, data ZNE; xml end date 2017-12-07
A205A | end date 2018-11-28
A210A | end date 2017-12-06
A214A | end date 2017-02-17
A215A | end date 2016-01-22
A216A | end date 2017-12-19
A217A | end date 2016-12-06
FR.CORF | channels HHZ/N/E have end date 2017-11-09, but appear in the data later
FR.OGCB | a gap between two epochs 2018-01-23T09:00:00 and 2018-01-23T13:00:00, but data is continuous, even just electronic noise and spikes, looks like there is no seismic record for the whole day
IV.MAGO | channels HHZ/N/E have end date 2017-08-03, but appear in the data later
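The "end date" check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the script used for the report: the function name `uncovered_days` is hypothetical, and the epoch start date in the example is an assumed placeholder (only the 2018-06-13 end date for A035A comes from the list).

```python
from datetime import datetime


def uncovered_days(epochs, data_days):
    """Return the data days that fall outside every metadata epoch.

    epochs: list of (start, end) ISO strings; end may be None for an open epoch.
    data_days: list of ISO date strings for which data files exist.
    """
    missing = []
    for day in data_days:
        t = datetime.fromisoformat(day)
        covered = any(
            datetime.fromisoformat(start) <= t
            and (end is None or t <= datetime.fromisoformat(end))
            for start, end in epochs
        )
        if not covered:
            missing.append(day)
    return missing


# A035A: end date taken from the list above, start date is a placeholder.
a035a_epochs = [("2016-01-01", "2018-06-13")]
late_days = uncovered_days(a035a_epochs, ["2018-05-01", "2018-09-28"])
print(late_days)
```

A day reported by this check only says that no epoch covers it; it cannot tell whether the end date is wrong or a whole epoch is missing.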

The XML files were downloaded in the last month. The data were downloaded over the last 2 years. I am aware that some of the channel mismatches could be because I have an old copy of the data. If the data files were updated on EIDA in the meantime, the issue may no longer exist. However, it was difficult to check, as I am currently (meaning in the last week) not able to download any fresh data (e.g. from the RESIF node), which is another issue.

These issues were found when checking the data for 32 selected earthquakes only, so it is in no way a complete view of the problem. However, as many of the issues affect months and years of missing epochs in the XML, correcting these would significantly improve the quality over the AlpArray time period from 2016 to 2019. As I plan to add more earthquakes to my study later, more issues may pop up for other stations.

I have also found similar problems with stations A088B, PRA and VNDS, but I am already in contact with the operators, so I do not list them here. I will, however, report the outcome of these.

Thanks Best regards Petr Kolínský, UniWien

javiquinte commented 4 years ago

Thanks for the feedback @PetrColinSky ! I'm contacting the people from RESIF and LMU, as I think these stations belong to them.

PetrColinSky commented 4 years ago

Thanks, @javiquinte, for taking care of this. Meanwhile, we solved A088B with Luděk V (IG CAS). The issue concerned the latest data which are not in EIDA yet. On EIDA, everything is alright for A088B. Yes, the other stations are from LMU and RESIF. Only MAGO is by INGV, I think.

jschaeff commented 4 years ago

I've forwarded your message to the appropriate persons in charge of the metadata of FR and Z3 in RESIF.

ChristopheMaron commented 4 years ago

Dear Petr,

About the FR.CORF and FR.OGCB troubles you mentioned ...

The gap in the OGCB metadata corresponds to changes of the sensor (STS-2 to Trillium-120PH) and data acquisition system (Taurus to Centaur). The end and begin dates have been given by the operator. I mean you can verify that by reading the OGCB metadata. The data after the mentioned end of FR.CORF are not valid: flat signal, data not correctly dated ... Thank you. Best regards, Christophe (in charge of FR broad band metadata)

PetrColinSky commented 4 years ago

Thanks, @ChristopheMaron. OK, I simply deleted the data files for times which are not included in the metadata. I will later post another issue I probably have with some of the FR stations, but I need to investigate it a bit more. Thanks for now. Best regards petr

javiquinte commented 4 years ago

Hi @PetrColinSky I'll remind LMU and INGV in a couple of days if we still miss an answer. We are far from being in normal working days... 😷

massimo1962 commented 4 years ago

hi everybody, I don't understand what I have to do with this issue, sorry guys.

petrrr commented 4 years ago

Dear all, I have checked the situation for IV.MAGO and this one is a bit tricky.

For this station the sampling rate has changed (100 to 250), but at first this change was not managed correctly, because a change in sampling rate also requires a change in channel naming. This one will require some data curation. We'll try to get on this ASAP, but it will not be immediate.

Bottom line:

The wrong data is presently not available, but there was probably a time window (before the fix) in which the wrongly labeled data could have been accessed.

Sorry for this inconvenience!

PetrColinSky commented 4 years ago

@petrrr Thanks! The most important information for me is what is right. OK, in the case of MAGO the metadata is correct, nice. I will then delete the data which do not fit what is in the metadata (similarly as I did for FR.OGCB and FR.CORF). Just out of curiosity: in the metadata, there are HHZ/N/E channels with 100 sps and then (since 2017-08-04) CHZ/N/E channels with 250 sps. All the data I have, including the ones which do not match these (October 2017 - January 2018), not only still have the channels labeled as HHZ/N/E, but the sampling is always 125 sps (which does not match either of the two epochs in the metadata). And, btw, the data looks visually very nice. I did not get any data since the end of January 2018. ObsPy does not complain when the sampling rate is different in data/metadata (I anyway deconvolve the transfer function from the data resampled to 10 sps).

@javiquinte Sure, no push, I am also working at home in kind of stand-by mode. Most of the AA temporary stations are the French ones, and LMU. For INGV, there was only the IV.MAGO, which is solved now, thanks. And as already discussed with Christian W. from Kiel, it looks like in case of the AA stations, the problem will be the opposite to FR and IV. Looks like some epochs are missing in the metadata of AA temporary. cheers and thanks again petr

petrrr commented 4 years ago

> @petrrr Thanks! The most important information for me is what is right. OK, in the case of MAGO the metadata is correct, nice. I will then delete the data which do not fit what is in the metadata (similarly as I did for FR.OGCB and FR.CORF).

Okay. We will try to make the missing data available ASAP.

> Just out of curiosity: in the metadata, there are HHZ/N/E channels with 100 sps and then (since 2017-08-04) CHZ/N/E channels with 250 sps. All the data I have, including the ones which do not match these (October 2017 - January 2018), not only still have the channels labeled as HHZ/N/E, but the sampling is always 125 sps (which does not match either of the two epochs in the metadata). And, btw, the data looks visually very nice. I did not get any data since the end of January 2018. ObsPy does not complain when the sampling rate is different in data/metadata (I anyway deconvolve the transfer function from the data resampled to 10 sps).

Thanks for the hint, we'll check this better.

aschloem commented 4 years ago

Hey, I'm working on the AlpArray metadata problems... It takes a bit of time. Sorry.

aschloem commented 4 years ago

Hey,

A035A | XML end date 2018-06-13, but data available for later times => fixed

A104B | missing HHE channel in the xml => fixed, I added HHE

A118A | end date 2016-11-08, but data later => less data for A118A, since 2016-11-09 it changed to A118B

A14?A HH2, HH3 and HHN and HHE => There were problems converting it into the SDS-format. There should not be HH2 and HH3. I fixed it in our archive and the corresponding metadata.

You can download the new data for A140A, A144A and A145A (new HHN and HHE) and the corresponding new metadata from our server erde.geophysik.uni-muenchen.de.

Thank you for the hint. Please inform me, if there are further problems.

Regards Antje

javiquinte commented 4 years ago

Hi @PetrColinSky Please let me know when you consider that this has been solved. Or give us a list of the remaining issues, so that we can track how far we are from closing this. Thanks!

PetrColinSky commented 4 years ago

@aschloem Thank you very much. I deleted all A118A files where data is available for A118B. Indeed, these were exactly the same files with the same data, just stored twice in my archive with both "A" and "B" labels. The metadata for all (other) 5 stations look good now. I also deleted all data for A140A, A144A and A145A and downloaded everything again. A140A looks nice. I only have an issue with A144A and A145A, where for these days: 2017-01-22, 2017-04-03, 2017-07-17, 2017-07-20, 2017-09-08, 2017-09-19, 2017-10-10, 2017-11-12, I only see the Z component in the data, no horizontals. Before, when it was still labeled wrong (Z23), I also had the horizontals, so I assume they were recorded. It affects both stations in the same time period, probably continuously, but I only checked the days of "my" earthquakes. I downloaded these using RoutingClient(...).get_waveforms(...) in ObsPy. Am I doing something wrong? For other days (before and after those affected), I have all 3 components. I downloaded everything the same way using the same script.
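A check for days with missing horizontals, like the one described here, could look roughly like the sketch below. It assumes the downloaded traces have already been reduced to (station, day, channel) tuples, for instance from `tr.stats.station`, `tr.stats.starttime` and `tr.stats.channel` of an ObsPy stream; the function name `find_incomplete_days` and the sample records are hypothetical.

```python
from collections import defaultdict


def find_incomplete_days(records, expected=3):
    """Group channel codes by (station, day) and report incomplete groups.

    records: iterable of (station, day, channel_code) tuples.
    Returns {(station, day): sorted orientation codes} for every group
    with fewer than `expected` distinct orientation codes.
    """
    seen = defaultdict(set)
    for station, day, channel in records:
        # The last character of a SEED channel code is the orientation
        # (Z, N, E, 1, 2 or 3).
        seen[(station, day)].add(channel[-1])
    return {
        key: sorted(codes)
        for key, codes in seen.items()
        if len(codes) < expected
    }


# Illustrative records only, not the actual archive contents.
records = [
    ("A144A", "2017-01-22", "HHZ"),
    ("A144A", "2017-01-23", "HHZ"),
    ("A144A", "2017-01-23", "HHN"),
    ("A144A", "2017-01-23", "HHE"),
]
incomplete = find_incomplete_days(records)
print(incomplete)
```

Because only the orientation character is counted, the check works the same way for ZNE, Z12 and Z23 labeling.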

@javiquinte We are still missing a reply about the French AlpArray temporary Z3 stations, which are these 27 stations: A151A A156A A161A A164A A165A A167A A168A A170A A173A A175A A177A A179A A180A A181A A182A A183A A186A A192A A194A A201A A204A A205A A210A A214A A215A A216A A217A. The others are solved now (MAGO can more or less also be considered solved, see @petrrr's answer above). Meanwhile, just to spread the information: it looks like SL.VNDS also had some issue with the metadata; I checked this with Mladen Z. There are also some data where only digitizer noise was recorded. The same goes for CZ.PRA, where for some time in 2016 and 2017 there is a wrong network code in the metadata and an inconsistent location code in data/metadata. I hope Vladimir P. will also update this in EIDA. Cheers Petr

javiquinte commented 4 years ago

OK. Thanks for the summary @PetrColinSky Let's wait for the answer from @jschaeff

Nice weekend and stay healthy!

petrrr commented 4 years ago

This one is just to let you know that the metadata and data for IV.MAGO should have been fixed (along with a few other stations with similar issues). I therefore removed the INGV tag, and hope that's okay (@javiquinte).

Thanks for letting us know! And please do not hesitate to report issues back to us.

#Network | Station | location | Channel | Latitude | Longitude | Elevation | Depth | Azimuth | Dip | SensorDescription | Scale | ScaleFreq | ScaleUnits | SampleRate | StartTime | EndTime
IV|MAGO||CHE|43.273245|10.657926|280|0|90|0|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|250|2018-04-11T07:20:21|
IV|MAGO||CHN|43.273245|10.657926|280|0|0|0|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|250|2018-04-11T07:20:21|
IV|MAGO||CHZ|43.273245|10.657926|280|0|0|-90|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|250|2018-04-11T07:20:21|
IV|MAGO||HHE|43.273245|10.657926|280|0|90|0|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|125|2017-01-11T07:20:21|2018-04-10T23:59:00
IV|MAGO||HHN|43.273245|10.657926|280|0|0|0|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|125|2017-01-11T07:20:21|2018-04-10T23:59:00
IV|MAGO||HHZ|43.273245|10.657926|280|0|0|-90|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|125|2017-01-11T07:20:21|2018-04-10T23:59:00
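Since ObsPy reportedly does not complain about a sampling rate mismatch, a listing like the one above can be used for a manual check. The sketch below parses pipe-separated channel rows and looks up the sample rate the metadata expects for a channel at a given time. The helper names `parse_channel_rows` and `expected_sample_rate` are hypothetical, and only two of the rows above are repeated to keep the example short.

```python
from datetime import datetime

# Two rows copied from the listing above (HHZ and CHZ only).
FDSN_TEXT = """\
#Network | Station | location | Channel | Latitude | Longitude | Elevation | Depth | Azimuth | Dip | SensorDescription | Scale | ScaleFreq | ScaleUnits | SampleRate | StartTime | EndTime
IV|MAGO||CHZ|43.273245|10.657926|280|0|0|-90|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|250|2018-04-11T07:20:21|
IV|MAGO||HHZ|43.273245|10.657926|280|0|0|-90|NANOMETRICS TRILLIUM-40S|1179650000|0.2|M/S|125|2017-01-11T07:20:21|2018-04-10T23:59:00
"""


def parse_channel_rows(text):
    """Parse pipe-separated channel rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header
        fields = [field.strip() for field in line.split("|")]
        rows.append({
            "channel": fields[3],
            "sample_rate": float(fields[14]),
            "start": datetime.fromisoformat(fields[15]),
            # An empty EndTime means the epoch is still open.
            "end": datetime.fromisoformat(fields[16]) if fields[16] else None,
        })
    return rows


def expected_sample_rate(rows, channel, when):
    """Return the metadata sample rate for `channel` at time `when`, or None."""
    t = datetime.fromisoformat(when)
    for row in rows:
        in_epoch = row["start"] <= t and (row["end"] is None or t <= row["end"])
        if row["channel"] == channel and in_epoch:
            return row["sample_rate"]
    return None


rows = parse_channel_rows(FDSN_TEXT)
print(expected_sample_rate(rows, "HHZ", "2017-10-15"))
```

A return value of `None` means that no epoch lists that channel for that time, which is the same situation as a missing epoch.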
javiquinte commented 4 years ago

Hi @petrrr I just try to keep the original labels to know who took care of solving things after the issue is closed. So, I included again the three data centres related to this issue, but it's clear that for the time being only RESIF needs to provide a solution to the remaining problems.

> This one just to let you know that the metadata and data for IV.MAGO, should have been fixed (along with few other stations with similar issues). I therefore removed the INGV tag, and hope that's okay (@javiquinte).

PetrColinSky commented 4 years ago

@petrrr Thanks! I will replace all my data files as well as xml to work with the latest version for IV.MAGO. I also use some more IV stations. Shall I replace them as well? Which ones? @javiquinte Yes, RESIF is pending, plus also that issue with missing horizontal components of A144A and A145A (LMU), see my reply to @aschloem above. cheers petr

jschaeff commented 4 years ago

Sorry for the delay. David Wolyniec (our data/metadata expert) is going to comment here soon.

ChristopheMaron commented 4 years ago

Dear all,

I already answered for the FR metadata (CORF and OGCB stations). It was on the 18th of March, and PetrColinSky replied that it was OK.

Best regards,

petrrr commented 4 years ago

> @petrrr Thanks! I will replace all my data files as well as xml to work with the latest version for IV.MAGO. I also use some more IV stations. Shall I replace them as well? Which ones?

We did analogous corrections to the following stations:

PetrColinSky commented 4 years ago

Thanks, @petrrr. Very valuable information, because all these three are in my list of stations as well.

PetrColinSky commented 4 years ago

After the discussion I had with Anne P. last week, I have a "solution" to my problem with the metadata, which, however, is not a solution to what is wrong in EIDA. If one downloads the xml directly from the RESIF portal, e.g. from http://ws.resif.fr/fdsnws/station/1/query?network=Z3&station=A151A&channel=???&level=response then the xml is perfectly fine and complete, including all epochs of sensor/instrumentation changes. So, the issue is somewhere on EIDA. I downloaded xmls for all 27 AA temporary stations reported above and it works fine. I was also finally able to "refresh" my data files downloaded in 2017, which now really have updated channel names (before ZNE, now Z12), as I had assumed before. So, it all matches the metadata (stations A201A and A204A, see above). I only have an issue with station A164A, which is the same problem as I reported above for stations A144A and A145A: I only get the vertical component. My old copy of the A164A data has all 3 components, the new data has only the vertical. I am still downloading the data from EIDA. Maybe it's the same problem. However, I cannot access RESIF directly for the data, since it's password-protected. Btw, I also downloaded the metadata for the permanent FR stations directly from the RESIF portal, and it works nicely as well. I report this in the appropriate issue #52. The technical problem was, however, different. While for the temporary stations there were complete epochs missing in the xml, for the permanent stations only the NormalizationFactor entries were wrong.
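For repeating the direct download for all 27 stations, the query URL quoted above can be assembled programmatically. This is a minimal sketch using only the Python standard library; the function name `station_query_url` is hypothetical, and the parameters are exactly those visible in the URL above.

```python
from urllib.parse import urlencode


def station_query_url(base, network, station, channel="???", level="response"):
    """Build an fdsnws-station query URL for one station."""
    params = {
        "network": network,
        "station": station,
        "channel": channel,
        "level": level,
    }
    # safe="?" keeps the single-character wildcards readable instead of
    # percent-encoding them as %3F.
    return f"{base}/fdsnws/station/1/query?{urlencode(params, safe='?')}"


url = station_query_url("http://ws.resif.fr", "Z3", "A151A")
print(url)
```

Looping over the station codes with the same base and network then gives one URL per station.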

Thanks to @petrrr , I also refreshed my data+metadata files for the four IV stations mentioned above, and it works alright. cheers Petr

javiquinte commented 4 years ago

Great @PetrColinSky ! This is really helpful information. Could you please check the two sets of metadata I'm attaching here? I got one from the URL pointing to RESIF (the same you posted) and the other one from Obspy. The normal behaviour would be that both give the same results.

metadataA151A.zip

PetrColinSky commented 4 years ago

Dear @javiquinte, the two files you attached are the same (apart from the headers, which differ, but that is unimportant). I checked visually that the entries are the same, and I also tested using them for removing the response: they produce exactly the same results. Moreover, I also checked the results obtained with the xml I got from RESIF a few days ago, and again everything is the same. I tested it for a date (2018-09-28) where my old xml crashed (missing the whole epoch). In addition, I downloaded the xml again from EIDA today, 1 April 2020, using the same ObsPy script as when I started the issue here 18 days ago, and I got a correct file with both epochs in it. My file from EIDA today exactly matches your file from EIDA yesterday; only the "Created" time is different. Meaning, now it works, while before (3 weeks ago) it did not. Have you changed something? Thanks! best regards Petr
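A comparison that ignores the header and looks only at channel epochs can be sketched with the standard library XML parser. The documents below are toy StationXML fragments built for the example, not real EIDA or RESIF output, and all dates in them are placeholders apart from the 2018-03-23 end date reported for A151A; `channel_epochs` and `toy_stationxml` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

NS = {"s": "http://www.fdsn.org/xml/station/1"}


def channel_epochs(xml_text):
    """Return the set of channel epochs, ignoring headers such as Created."""
    root = ET.fromstring(xml_text)
    epochs = set()
    for net in root.findall("s:Network", NS):
        for sta in net.findall("s:Station", NS):
            for cha in sta.findall("s:Channel", NS):
                epochs.add((
                    net.get("code"), sta.get("code"),
                    cha.get("locationCode", ""), cha.get("code"),
                    cha.get("startDate"), cha.get("endDate"),
                ))
    return epochs


def toy_stationxml(created, epochs):
    """Build a minimal (not schema-complete) StationXML string for testing."""
    channels = "".join(
        f'<Channel code="HHZ" locationCode="00" startDate="{start}" endDate="{end}"/>'
        for start, end in epochs
    )
    return (
        f'<FDSNStationXML xmlns="{NS["s"]}" schemaVersion="1.0">'
        f"<Created>{created}</Created>"
        f'<Network code="Z3"><Station code="A151A">{channels}</Station></Network>'
        "</FDSNStationXML>"
    )


first = ("2016-01-01T00:00:00", "2018-03-23T00:00:00")
second = ("2018-03-23T00:00:00", "2019-12-31T00:00:00")
old_file = toy_stationxml("2020-03-14T00:00:00", [first])
file_a = toy_stationxml("2020-03-31T00:00:00", [first, second])
file_b = toy_stationxml("2020-04-01T00:00:00", [first, second])

print(channel_epochs(file_a) == channel_epochs(file_b))
print(channel_epochs(file_a) - channel_epochs(old_file))
```

The set difference lists the epochs that one file has and the other lacks, which is how a missing epoch would show up.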

javiquinte commented 4 years ago

Hi @PetrColinSky On our side, I (we) haven't changed anything. I found it weird that a direct link to RESIF works and Obspy or another client doesn't. That's why I tried both ways of retrieving the metadata myself. Actually, smart clients going through the Routing Service must send the same request to RESIF as the one you posted. If there was a change, it should have been in RESIF. Any hints @jschaeff ? In this case, it was a test with Z3.A151A so it should be easy to check if something changed.

petrrr commented 4 years ago

@PetrColinSky in comment https://github.com/EIDA/userfeedback/issues/51#issuecomment-606492750 you are writing:

> [...] I am downloading the data still from EIDA. [...]

What exactly do you mean by that?

It might help to understand better what exactly is (was) going on. Just in case there is some hidden problem.

petrrr commented 4 years ago

> Thanks, @petrrr. Very valuable information, because all these three are in my list of stations as well.

@PetrColinSky we are working on providing such information in a more structured way. Be patient with us, this will take us some more effort. But you are welcome to contact us in case of doubts.

PetrColinSky commented 4 years ago

Hi @petrrr, sorry for being unclear. By "download from EIDA" I mean using the ObsPy 1.1.1 routing client, see also my comment somewhere above. By "download directly" I mean entering a URL like the one shown above, which points directly to a node (RESIF, or other).

Ad four IV stations: I updated all my data and metadata files for FROS, MAGO, MCIV and TRIF. I have both HH and CH channels, and everything is working OK.

Ad A164A: The data is not yet updated according to Anne P., so this is why I did not get the horizontals with proper labeling. cheers petr

javiquinte commented 4 years ago

From Antje Schlömmer (LMU)

Hey Petr, Hey Javier, I renamed the channels HH2 and HH3 (A144A, A145A and A145A) to HHN and HHE (2017), but: According to the "Standards for seismic stations and data management" (http://www.alparray.ethz.ch/en/seismic_network/backbone/standards-for-seismic-stations-and-data-management/), some of our (German) institutions named some channels of their stations "HH2" and "HH3" in the past (first installation without a gyrocompass). But this procedure was inconsistent: some epochs had HH2/HH3 with a given azimuth of the horizontal components (sometimes > 5°) and some epochs had HHN/HHE with the same azimuth (> 5°) (petr, you mentioned that). I don't know what happened in the past, but I think it makes no sense to have HH2 and HHN with the same azimuth?! Other institutions named their channels HHN and HHE at the beginning of the installation, and I added the azimuths to the metadata after checking the real orientation with our gyrocompass in 2018. Sometimes it is greater than 5°. I removed HH2 and HH3 from our archive, but now we have HHN and HHE with azimuths greater than 5° (e.g. A144A, A145A, ...). So everybody should be aware of it and rotate the stations. What do you think?

javiquinte commented 4 years ago

From @PetrColinSky

Dear Antje, thanks for the email. Several remarks on this:
- The most important thing (at least for me) is that the data and metadata contain consistent information. This was the main reason why I started issue #51.
- I agree that having epochs HH2 and HHN with the same azimuth does not make much sense. Technically, it does not matter, as long as the labels are the same in the data as well. However, from the point of view of the standards, it should clearly be just one of the two options, depending on the misorientation (> or < 5 deg).
- My comment about the "combined" issue was that I see both HH(ZNE) and HH(Z23) labels in the data, and also both HH(ZNE) and HH(Z23) labels in the metadata, but the epochs do not match, meaning that for times where the data are labeled ZNE, the metadata say Z23, or the opposite. It is probably true, as you say, that in addition to this, the azimuth is the same.
- After the discussion with Anne Paul, I understood what they did in France: they FIRST labeled everything ZNE, thinking that the orientation was correct (< 5 deg). Later, they found out that the orientation was off by more than 5 deg, and they retroactively re-labeled the already recorded data to Z23. The problem I had was that I had downloaded the data at the moment when they were still labeled as ZNE. Later, I downloaded the refreshed metadata, where only Z23 labels were used. The solution to this was to AGAIN download the fresh version of the data.
- I thought that in Germany the problem was the same. From what you say now, it looks a bit different, because in Germany some institutions named the channels Z23 from the very beginning, not knowing the orientation but probably suspecting that it might be off by more than 5 deg.
- I ALWAYS rotate the records anyway, regardless of the channel names. Simply, if there is a non-zero entry in the azimuth property, I use it. So, I don't care about how the channels are named. However, I am only a user. I cannot really comment on how to label it properly. At UniWien, we tried to stick to the standards: if it's more than 5 deg off, it's Z23. If it's less, it's ZNE. But the azimuth is always entered, even if it's less than 5. It was done by Florian Fuchs, not me.
- Yes, there is a danger that if one sees HH(ZNE), one does not rotate it, thinking that the misorientation is less than 5 deg. However, frankly, there are many stations (some Italian) where the entry in the azimuth property is simply zero, even though the sensors are probably (and for some I know it for sure) misoriented. For me, the discussion about the naming convention is a secondary problem; the primary one is to have the correct information (misorientation).
- Right now, I again tried to download the data for A144A and A145A, for which I only got the Z component on 19.03.2020, see my comment below your email. For A145A, everything looks good now. For A144A, I got some good files, but still, for the four days 2017-01-22, 2017-04-03, 2017-07-17 and 2017-07-20, I am only getting the Z component and no horizontals. Do you have any idea what's wrong?
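The "always rotate, regardless of channel names" approach can be illustrated with plain Python. This sketch assumes two orthogonal horizontal components, where the first has the azimuth given in the metadata (degrees clockwise from north) and the second points 90 degrees further clockwise; the function names are hypothetical, and ObsPy users would normally rely on the library's own rotation routines instead.

```python
import math


def misorientation(azimuth_deg):
    """Deviation of the first horizontal from true north, wrapped to [-180, 180)."""
    return (azimuth_deg + 180.0) % 360.0 - 180.0


def rotate_to_ne(comp1, comp2, azimuth1_deg):
    """Rotate two orthogonal horizontals to true north and east.

    comp1 is the component with azimuth `azimuth1_deg`, comp2 the one
    with azimuth `azimuth1_deg + 90` (assumption of this sketch).
    """
    a = math.radians(azimuth1_deg)
    north = [c1 * math.cos(a) - c2 * math.sin(a) for c1, c2 in zip(comp1, comp2)]
    east = [c1 * math.sin(a) + c2 * math.cos(a) for c1, c2 in zip(comp1, comp2)]
    return north, east


# Synthetic test: pure northward motion recorded by a sensor rotated by 10 deg.
az = 10.0
comp1 = [math.cos(math.radians(az))]   # projection onto the first component
comp2 = [-math.sin(math.radians(az))]  # projection onto the second component
north, east = rotate_to_ne(comp1, comp2, az)
print(north, east)

# Under the 5 degree convention discussed above, this sensor would count as misoriented.
print(abs(misorientation(az)) > 5.0)
```

With an azimuth of zero the rotation leaves the records unchanged, so applying it unconditionally is harmless for correctly oriented sensors.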

jschaeff commented 4 years ago

@javiquinte sorry, I can't catch up on this issue. Is there something I can do?

javiquinte commented 4 years ago

@jschaeff Everything works now and we are just wondering whether you changed something in RESIF in the last days. The test was with Z3.A151A.

PetrColinSky commented 4 years ago

Hi, @javiquinte and @aschloem, today I tried again to download the A144A data for the four days mentioned above, and I got all 3 components of the data. It showed the same behaviour as A145A before: at some point, only the vertical was available, later all 3C were available. So, it's still a bit of a mystery, but I now have all I need for A144A and A145A. So, for LMU, this issue can be closed. thanks and cheers petr

javiquinte commented 4 years ago

Hi @PetrColinSky Is there anything else missing? LMU, RESIF, and INGV seem to work now. Am I right?

PetrColinSky commented 4 years ago

Yes, you are right, @javiquinte. Everything I have asked for in this issue works now, even though some mystery is left about how some of the xml files were suddenly repaired by themselves... :-) Thanks petr