The aim is to have, in a level_4 folder, one merged record for each site, combining historical, v2 and v3 stations as well as moved stations (e.g. THU_U replaced by THU_U2). The implementation is ongoing in https://github.com/GEUS-Glaciology-and-Climate/pypromice/blob/join_l4/src/pypromice/process/join_l4.py with some updates in other files (https://github.com/GEUS-Glaciology-and-Climate/pypromice/compare/main...join_l4).
It uses a list of the latest stations (as keys), each with its old stations in reverse chronological order: https://github.com/GEUS-Glaciology-and-Climate/pypromice/blob/97eaedb6a1d89f6ab62ce20a30287c4ae7eb1393/src/pypromice/process/join_l4.py#L12-L35

At the moment join_l4 is called on the same list of stations as join_l3, i.e. the sites for which new transmissions, new raw files or new flags have recently been added: https://github.com/GEUS-Glaciology-and-Climate/aws-operational-processing/blob/b0d52ecf9427b204460f21f110ef0e049d0c49c4/l3_processor.sh#L173-L185
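
To make the structure of that mapping concrete, here is a minimal sketch of how a station could be classified depending on where it appears in it. The entries below are an illustrative subset based on the examples in this issue, not the actual dictionary from join_l4.py, and `classify_station` is a hypothetical helper, not a pypromice function.

```python
# Illustrative subset only; the real mapping is defined in join_l4.py.
# Keys are the latest stations, values are older stations, newest first.
OLD_NAME = {
    "CEN2": ["CEN1"],
    "THU_U2": ["THU_U"],
}


def classify_station(station_id, old_name=OLD_NAME):
    """Return 'skip', 'copy' or 'merge' for a station (hypothetical helper)."""
    # Every station that has been superseded by a newer one.
    superseded = {old for olds in old_name.values() for old in olds}
    if station_id in superseded:
        return "skip"   # its data is appended to a newer station's record
    if station_id not in old_name:
        return "copy"   # no historical data to append
    return "merge"      # latest station with older data to append


for stid in ["CEN1", "CEN2", "SOME_STATION"]:
    print(stid, classify_station(stid))
```

`SOME_STATION` stands for any station that does not appear in the mapping at all.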
If a station is listed in `old_name.values()` (names in brackets in `old_name`), then it is not processed by `join_l4` (because it is appended to another AWS's data). If a station is not in `old_name.keys()`, then there is no historical data that needs to be appended and it is copied as-is to the `level_4` folder.

For the historical GC-Net stations, the aliases for variables are defined in an external file, `src/pypromice/process/variable_aliases_GC-Net.csv`, which is also defined as package data.

The merging is done by time slices: https://github.com/GEUS-Glaciology-and-Climate/pypromice/blob/97eaedb6a1d89f6ab62ce20a30287c4ae7eb1393/src/pypromice/process/join_l4.py#L229-L232 where `ds1` is the current AWS data and `ds2` is the historical AWS data being appended before the start of ds1. Gap-filling during the overlapping period is currently not implemented.

The resulting files have the same format and the same variables as the level_3 files.
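
As a rough illustration of the time-slice idea, the sketch below keeps only the part of the historical record that precedes the first timestamp of the current record, then appends the current record unchanged. Plain Python lists of `(timestamp, value)` pairs stand in for the actual datasets, and `join_by_time_slice` is a hypothetical name; the linked lines remain the reference for what join_l4 really does.

```python
from datetime import datetime


def join_by_time_slice(ds1, ds2):
    """Prepend to ds1 the records of ds2 that precede the start of ds1.

    ds1: current AWS data, list of (timestamp, value) sorted by time.
    ds2: historical AWS data, same layout.
    """
    if not ds1:
        return list(ds2)
    start = ds1[0][0]
    # Records of ds2 at or after the start of ds1 are dropped:
    # no gap-filling is attempted in the overlapping period.
    historical = [rec for rec in ds2 if rec[0] < start]
    return historical + list(ds1)


ds2 = [
    (datetime(2000, 1, 1), 1.0),
    (datetime(2000, 1, 2), 2.0),
    (datetime(2000, 1, 3), 3.0),   # overlaps with ds1
]
ds1 = [
    (datetime(2000, 1, 3), 30.0),
    (datetime(2000, 1, 4), 40.0),
]
merged = join_by_time_slice(ds1, ds2)
```

In the overlap (here 3 January) the value from `ds1` is kept even where `ds2` also has data, which is what the lack of gap-filling implies.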
Instead of `stid` there are now `site_id` and `list_station_id` attributes, defined as: https://github.com/GEUS-Glaciology-and-Climate/pypromice/blob/97eaedb6a1d89f6ab62ce20a30287c4ae7eb1393/src/pypromice/process/join_l4.py#L271-L278 meaning that we drop the `v3` and the `2` in `CEN2` (and potentially other stations).

Right now, because of the parallel call to `join_l4`, `join_l4` cannot know that it needs to re-append a given site (e.g. CEN) if the older station data (e.g. CEN1) is updated but not the latest station (e.g. CEN2).
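
To make this limitation concrete: if only CEN1 appears in the list of updated stations, it is skipped (it is an old station) and CEN2 is never called, so the merged CEN record stays stale. One possible way around it, sketched here purely as an assumption and not as something the current code does, would be to expand the list of updated stations before calling join_l4, replacing each old station by the latest station it belongs to. The mapping is an illustrative subset and `stations_to_join` is a hypothetical helper.

```python
# Illustrative subset only; the real mapping is defined in join_l4.py.
OLD_NAME = {
    "CEN2": ["CEN1"],
    "THU_U2": ["THU_U"],
}


def stations_to_join(updated, old_name=OLD_NAME):
    """Map each updated station to the latest station of its site.

    Old stations are replaced by the latest station they are appended to,
    duplicates are removed and the input order is preserved.
    """
    # Reverse lookup: old station -> latest station.
    latest_of = {old: new for new, olds in old_name.items() for old in olds}
    result = []
    for stid in updated:
        target = latest_of.get(stid, stid)
        if target not in result:
            result.append(target)
    return result


print(stations_to_join(["CEN1"]))  # CEN1 was updated, so CEN2 is re-joined
```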