Open mdsumner opened 6 years ago
Just a link to early experiment so I don't forget https://github.com/mdsumner/rargo
New source:
ftp.ifremer.fr/argo/ifremer/argo/dac
argo_handler
"ftp.ifremer.fr/argo/ifremer/argo/argos_merge-profile_index.txt.gz" - ar_greylist.txt are marked as "problems"
ping @KimBaldry @raymondben
@mdsumner run in progress now: files are in /rdsi/PUBLIC/raad/data/www.usgodae.org/ftp/outgoing/argo/dac/ Note that I'm retrieving from the US GDAC rather than ifremer, which seems faster (and as far as I can tell the data content is the same). I'm also not yet doing anything with the greylist.
Thanks Ben! I wouldn't worry about the greylist. At the moment the list is highly subjective because the program's QC procedures are still being developed.
Here's a quick look at the 15241 starting points for the files, I don't quite get the reference date yet but will figure it out:
I put out a new function to find the files on our system (but I can't update the server just yet):
https://github.com/AustralianAntarcticDivision/raadfiles/blob/master/R/raad-argo-files.R
I see ~3Gb of profile files.
@KimBaldry tidync now has support for these Argo files, we should look at the details but this kind of workflow should be available now on the server
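# Read selected variables from one Argo profile NetCDF file into a tibble.
# Arguments in `...` (e.g. select_var) are passed on to tidync::hyper_tibble().
read_prof <- function(x, grid = "", ...) {
tidync::tidync(x) %>% tidync::hyper_tibble(...)
}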
library(raadfiles)
#> global option 'raadfiles.data.roots' set:
#> '/rdsi/PRIVATE/raad/data
#> /rdsi/PRIVATE/raad/data_local
#> /rdsi/PRIVATE/raad/data_staging
#> /rdsi/PRIVATE/raad/data_deprecated
#> /rdsi/PUBLIC/raad/data'
#> Uploading raad file cache as at 2019-02-28 12:00:05 (943674 files listed)
f <- argo_files()
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
bgc <- read_prof(f$fullname[5000], select_var = c("PRES", "TEMP", "PSAL", "DOXY", "CHLA", "BBP700", "NITRATE"))
bgc
#> # A tibble: 570 x 9
#> PRES TEMP PSAL DOXY CHLA BBP700 NITRATE N_PROF N_LEVELS
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
#> 1 4.30 13.1 34.0 NA NA NA NA 1 1
#> 2 6 13.1 34 NA NA NA NA 2 1
#> 3 7.90 13.1 34 NA NA NA NA 1 2
#> 4 10 13.1 34 NA NA NA NA 2 2
#> 5 12.1 13.1 34 NA NA NA NA 1 3
#> 6 14 13.1 34 NA NA NA NA 2 3
#> 7 16 13.1 34 NA NA NA NA 1 4
#> 8 18 13.1 34 NA NA NA NA 2 4
#> 9 20 13.1 34 NA NA NA NA 1 5
#> 10 22 13.1 34 NA NA NA NA 2 5
#> # … with 560 more rows
Created on 2019-02-28 by the reprex package (v0.2.1)
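The printed rows interleave two profiles (N_PROF 1 and 2), so it may help to separate them before plotting. The following is only a sketch: `bgc_head` is a hypothetical stand-in built by hand from the first six rows printed above, not the real `bgc` object, and splitting on N_PROF is one possible approach rather than an established workflow.

```r
# Hypothetical stand-in for the first rows of the `bgc` tibble printed above
bgc_head <- data.frame(
  PRES     = c(4.30, 6, 7.90, 10, 12.1, 14),
  TEMP     = 13.1,
  PSAL     = 34,
  N_PROF   = c(1L, 2L, 1L, 2L, 1L, 2L),
  N_LEVELS = c(1L, 1L, 2L, 2L, 3L, 3L)
)

# One data frame per profile, each ordered by level index
by_prof <- split(bgc_head, bgc_head$N_PROF)
by_prof <- lapply(by_prof, function(d) d[order(d$N_LEVELS), ])

# Pressure levels of the first profile only
by_prof[["1"]]$PRES
```

The same `split()` call should work on the full tibble returned by `read_prof()`, assuming it keeps the N_PROF and N_LEVELS columns shown in the output.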
(Notes provided to me)
Argo data come from the French global data assembly centre (GDAC) at ftp://ftp.ifremer.fr/ifremer/argo. The second GDAC is located at ftp://www.usgodae.org/pub/outgoing/argo/ (US). I haven’t fully investigated how the two differ yet.
In the main argo/ directory
dac/ - profiles organised by regional data assembly centres
geo/ - profiles organised by geographical ocean basins
Index files: argo_bio-profile_index has ONLY BGC info, argo_merge-profile_index is the index for merged profiles
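As a rough sketch of how one of these index files could be used to pick out profiles, the example below parses a small in-memory excerpt. Everything in `index_text` is hypothetical (file paths, dates, positions, parameters), and the column names are an assumption about the index layout that should be checked against the header of the real file.

```r
# Hypothetical excerpt: real index files are gzipped and far larger, and
# the column names here are assumed, not copied from an actual index.
index_text <- "# hypothetical excerpt of a bio-profile index
file,date,latitude,longitude,ocean,profiler_type,institution,parameters,date_update
csiro/0000001/profiles/BR0000001_001.nc,20190101120000,-60.5,140.2,I,846,CS,PRES DOXY CHLA,20190102000000
csiro/0000001/profiles/BR0000001_002.nc,20190111120000,-60.7,140.9,I,846,CS,PRES DOXY,20190112000000
"

# Lines starting with '#' are treated as comments and skipped;
# date columns are kept as text so the 14-digit stamps are not read as numbers
idx <- read.csv(
  text = index_text,
  comment.char = "#",
  stringsAsFactors = FALSE,
  colClasses = c(date = "character", date_update = "character")
)
idx$date <- as.POSIXct(idx$date, format = "%Y%m%d%H%M%S", tz = "UTC")

# Keep only the profiles whose space-separated parameter list includes CHLA
has_chla <- vapply(
  strsplit(idx$parameters, " ", fixed = TRUE),
  function(p) "CHLA" %in% p,
  logical(1)
)
idx$file[has_chla]
```

For a real gzipped index, `read.csv(gzfile(path), comment.char = "#")` should work in the same way, where `path` is wherever the index has been downloaded locally.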
In the dac/FLOATID directories
traj.nc files, which are trajectory files
meta.nc files, which record metadata
prof.nc file, which contains all profiles for that float
Mprof.nc files, which are merged profiles with both BGC and CTD measurements
tech.nc file, which is a technical file
Calibration info is recorded in either the technical or the meta files
In the dac/FLOATID/profiles/ directory
Files beginning with
R are in real-time mode and contain only CTD information (and maybe O2?)
D are in delayed mode and contain only CTD information (and maybe O2?)
BR/BD contain only biogeochemical data (no CTD) in real-time/delayed mode, respectively
MR/MD are merged profiles that contain both biogeochemical and CTD data in one file, in real-time/delayed mode, respectively
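The prefix rules above can be sketched as a small classifier. `classify_profile_file()` is a hypothetical helper, not part of raadfiles, and the example file names are made up. It assumes the prefix is immediately followed by a numeric float ID.

```r
# Hypothetical helper: label profile files by their name prefix.
# Assumes names look like <prefix><numeric float id>_<cycle>.nc
classify_profile_file <- function(path) {
  fname <- basename(path)
  # Group 1: optional B (BGC only) or M (merged); group 2: R or D (mode)
  parts <- regmatches(fname, regexec("^([BM]?)([RD])[0-9]", fname))
  content_lookup <- c(core = "CTD", B = "BGC", M = "merged BGC + CTD")
  mode_lookup <- c(R = "real-time", D = "delayed")
  data.frame(
    file = fname,
    content = vapply(parts, function(p) {
      if (length(p) == 0) return(NA_character_)  # name does not fit the pattern
      key <- if (nzchar(p[2])) p[2] else "core"
      unname(content_lookup[key])
    }, character(1)),
    mode = vapply(parts, function(p) {
      if (length(p) == 0) NA_character_ else unname(mode_lookup[p[3]])
    }, character(1)),
    stringsAsFactors = FALSE
  )
}

# Made-up file names for illustration
res <- classify_profile_file(c(
  "profiles/R1900001_001.nc",
  "profiles/BD1900001_002.nc",
  "profiles/MR1900001_003.nc",
  "1900001_prof.nc"
))
res
```

Files that do not match the pattern, such as the float-level prof.nc file, come back with NA in both columns rather than being guessed at.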
The merged profiles have different forms for different institutions.
The regional data assembly centres (RDACs) also have their own products, but these are not recorded in the GDAC. I am almost certain that all RDACs follow the same QC processes; however, CSIRO profiles stop recording BGC data at very shallow depths. I haven’t yet looked into why this is so, or whether they can still follow the same calibration procedures.