Closed June-Skeeter closed 1 month ago
@znesic
I'll look into it tomorrow (I am done for today).
@June-Skeeter: The problem was that the 2024/BB/Flux/clean_tv trace was corrupted. That happened earlier this year when the data from one of the EP summary files somehow got messed up. This in turn caused the database trace TimeVector to be bad. The TimeVector then kept corrupting clean_tv. For a while I kept replacing clean_tv with a good one from the Met folder, but it kept getting overwritten by bad data. It took me a bit of time to trace it back to a bad TimeVector.
It's fixed now and it should all work. I am now working on fixing the db_struct2database function (the main database conversion routine) so that it does not touch TimeVector and clean_tv at all after the database for that year has first been created. Those files should change only if an "incremental" database is used (forceFullDB = 0). I use that option only when dealing with "sparse" databases (like the ones for the manual chamber measurements).
So, it wasn't read_db, it was just a case of shit-in-shit-out. :-(
I was playing around with comparing some outputs from automated EddyPro runs to the existing flux data. I stumbled into a bug that seems to be in read_db or one of the functions it calls.
Anyway, I've created a branch that contains a script to recreate the problem here, along with this plot showing PAR and H traces for a few days of data in 2023 and 2024 (original is from the Flux folder, reprocessed is from the epAutoRun_TestRun folder):
What is perhaps more confusing/concerning:
It could be some combination of issues with duplicated values in the time vector and/or with timezones and the leap year? If you look at the raw time vector, you'll see there are some duplicated values:
So to summarize: 1) There is something wrong with the 2024/BB/Flux data. This can be fixed using the recalculated BB data, but it is still worth thinking about what's wrong. 2) The read_db function seems to propagate timestamp issues from one year to another? It might be worth trying to put some guard rails in here to check for these sorts of issues. If nothing else, to at least warn the user that the timestamps aren't at the expected intervals.
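To make the guard-rail idea concrete, here is a rough sketch of the kind of check I have in mind. It is written in Python purely for illustration and is not part of read_db or db_struct2database. The function names, the assumption that the time vector is stored as fractional days, and the 30-minute expected step are all hypothetical and would need adjusting to the real data.

```python
import warnings

import numpy as np


def check_time_vector(tv, expected_step_days=1.0 / 48.0, tol_days=1e-6):
    """Report suspicious steps in a time vector (hypothetical helper).

    Assumes `tv` holds timestamps as fractional days and that samples
    should be `expected_step_days` apart (30 minutes by default).
    Each returned index i refers to the sample tv[i] that ends a bad step.
    """
    tv = np.asarray(tv, dtype=float)
    steps = np.diff(tv)
    return {
        # Same timestamp repeated (step of zero, within tolerance).
        "duplicates": np.flatnonzero(np.abs(steps) <= tol_days) + 1,
        # Time going backwards.
        "backwards": np.flatnonzero(steps < -tol_days) + 1,
        # Any step that is not the expected interval; this includes
        # duplicates and backwards steps as well as gaps.
        "irregular": np.flatnonzero(
            np.abs(steps - expected_step_days) > tol_days
        ) + 1,
    }


def warn_if_bad_time_vector(tv, **kwargs):
    """Warn the user, without raising, if the time vector looks wrong."""
    report = check_time_vector(tv, **kwargs)
    for kind, idx in report.items():
        if idx.size:
            warnings.warn(
                f"time vector has {idx.size} {kind} step(s); "
                f"first at index {idx[0]}"
            )
    return report
```

Calling `warn_if_bad_time_vector(tv)` right after the time vector is loaded would flag duplicated or unevenly spaced timestamps before they get mixed into data from another year. It only warns and does not try to repair anything, so the original trace is left alone.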