AustralianAntarcticDivision / blueant

Environmental data for Antarctic and Southern Ocean science
https://australianantarcticdivision.github.io/blueant/

ARGO notes #13

Open mdsumner opened 6 years ago

mdsumner commented 6 years ago

(Notes provided to me)

ARGO data are available from ftp://ftp.ifremer.fr/ifremer/argo (the French GDAC). The second global data assembly centre (GDAC) is at ftp://www.usgodae.org/pub/outgoing/argo/ (US). I haven’t fully investigated how the two differ yet.

In the main argo/ directory

Index files: argo_bio-profile_index has ONLY BGC info; argo_merge-profile_index is the index for merged profiles.
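A minimal sketch of how one of these index files could be read in base R. It assumes the index is a comma-separated table preceded by `#` comment lines, with a `file` column holding a path of the form DAC/FLOATID/profiles/name.nc relative to dac/. The rows below are made up for illustration (they are not real floats), and the column layout should be checked against a real index file before relying on it.

```r
# Hypothetical excerpt in the assumed layout of a GDAC profile index file:
# '#' comment lines, then a CSV header row, then one row per profile.
index_text <- c(
  "# Hypothetical index excerpt, for illustration only",
  "file,date,latitude,longitude,ocean,profiler_type,institution,date_update",
  "csiro/1901234/profiles/R1901234_001.nc,20180101120000,-55.100,140.200,I,846,CS,20180102080000",
  "aoml/5904567/profiles/D5904567_010.nc,20170615060000,-62.500,75.300,I,846,AO,20180301090000"
)

# Keep the timestamp columns as character so the 14-digit values are not
# converted to floating point numbers.
idx <- read.csv(text = index_text, comment.char = "#", stringsAsFactors = FALSE,
                colClasses = c(date = "character", date_update = "character"))

# Split the relative path to recover the DAC and float identifier.
parts <- strsplit(idx$file, "/", fixed = TRUE)
idx$dac <- vapply(parts, `[`, character(1), 1)
idx$float <- vapply(parts, `[`, character(1), 2)

# Parse the profile timestamp (assumed YYYYMMDDHHMMSS, UTC).
idx$date <- as.POSIXct(idx$date, format = "%Y%m%d%H%M%S", tz = "UTC")

# Example subset: profiles south of 60S.
south <- idx[idx$latitude < -60, ]
```

Subsetting the index first means only the profile files of interest need to be opened afterwards.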

In the dac/FLOATID directories

Calibration info is recorded in either the technical or the meta files

In the dac/FLOATID/profiles/ directory

Files beginning with

The merged profiles have different forms for different institutions.

The regional data acquisition centres (RDAC) also have their own products, but these are not recorded in the GDAC. I am almost certain that all RDACs follow the same QC processes; however, the BGC data in CSIRO profiles stop at very shallow depths. I haven’t yet looked into why this is, or whether those profiles can still follow the same calibration procedures.

mdsumner commented 6 years ago

Just a link to early experiment so I don't forget https://github.com/mdsumner/rargo

mdsumner commented 5 years ago

New source:

ftp.ifremer.fr/argo/ifremer/argo/dac

ping @KimBaldry @raymondben

raymondben commented 5 years ago

@mdsumner run in progress now. Files are in: /rdsi/PUBLIC/raad/data/www.usgodae.org/ftp/outgoing/argo/dac/

Note that I'm retrieving from the US global DAC rather than ifremer, as it seems faster (and as far as I can tell the data content is the same). Also, I'm not yet doing anything with the greylist.

KimBaldry commented 5 years ago

Thanks Ben! I wouldn't worry about the greylist. At the moment the list is highly subjective as program QC procedures are being developed

mdsumner commented 5 years ago

Here's a quick look at the 15241 starting points for the files, I don't quite get the reference date yet but will figure it out:

(image: quick-look plot of the 15241 file starting points)
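On the reference date: if this refers to the JULD time variable, my understanding of the Argo convention is that JULD counts days since 1950-01-01 00:00:00 UTC (the files carry this in REFERENCE_DATE_TIME). A minimal sketch under that assumption follows; `juld_to_datetime` is a hypothetical helper name, not something in raadfiles or tidync, and the origin should be checked against the REFERENCE_DATE_TIME value in the files themselves.

```r
# Assumed Argo convention: JULD is decimal days since 1950-01-01 00:00:00 UTC.
argo_origin <- ISOdatetime(1950, 1, 1, 0, 0, 0, tz = "UTC")

# Hypothetical helper: convert JULD (days) to POSIXct by adding seconds.
juld_to_datetime <- function(juld) {
  argo_origin + juld * 86400
}

# Zero days is the origin itself; 1.5 days is noon on the following day.
juld_to_datetime(c(0, 1.5))
```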

I put out a new function to find the files on our system (but I can't update the server just yet):

https://github.com/AustralianAntarcticDivision/raadfiles/blob/master/R/raad-argo-files.R

I see ~3 GB of profile files.

mdsumner commented 5 years ago

@KimBaldry tidync now has support for these Argo files. We should look at the details, but this kind of workflow should now be available on the server:

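# Read one Argo profile NetCDF file into a tibble via tidync.
# `...` is passed on to hyper_tibble() (e.g. select_var); `grid` is not used yet.
# The pipe (%>%) is available once dplyr is attached below.
read_prof <- function(x, grid = "", ...) {
  tidync::tidync(x) %>% tidync::hyper_tibble(...)
}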

library(raadfiles)
#> global option 'raadfiles.data.roots' set:
#> '/rdsi/PRIVATE/raad/data
#>  /rdsi/PRIVATE/raad/data_local
#>  /rdsi/PRIVATE/raad/data_staging
#>  /rdsi/PRIVATE/raad/data_deprecated
#>  /rdsi/PUBLIC/raad/data'
#> Uploading raad file cache as at 2019-02-28 12:00:05 (943674 files listed)
f <- argo_files()

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
bgc <- read_prof(f$fullname[5000], select_var = c("PRES", "TEMP", "PSAL", "DOXY", "CHLA", "BBP700", "NITRATE"))
bgc
#> # A tibble: 570 x 9
#>     PRES  TEMP  PSAL  DOXY  CHLA BBP700 NITRATE N_PROF N_LEVELS
#>    <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>  <int>    <int>
#>  1  4.30  13.1  34.0    NA    NA     NA      NA      1        1
#>  2  6     13.1  34      NA    NA     NA      NA      2        1
#>  3  7.90  13.1  34      NA    NA     NA      NA      1        2
#>  4 10     13.1  34      NA    NA     NA      NA      2        2
#>  5 12.1   13.1  34      NA    NA     NA      NA      1        3
#>  6 14     13.1  34      NA    NA     NA      NA      2        3
#>  7 16     13.1  34      NA    NA     NA      NA      1        4
#>  8 18     13.1  34      NA    NA     NA      NA      2        4
#>  9 20     13.1  34      NA    NA     NA      NA      1        5
#> 10 22     13.1  34      NA    NA     NA      NA      2        5
#> # … with 560 more rows

Created on 2019-02-28 by the reprex package (v0.2.1)
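The first rows above are all NA for the BGC variables. As a possible follow-up step, here is a small sketch that keeps only the levels with at least one BGC measurement. `bgc_demo` is a made-up stand-in for the tibble returned by read_prof() (the values are illustrative, not taken from any float), and `bgc_vars` is a hypothetical choice of columns.

```r
# Hypothetical stand-in for the output of read_prof(); values are made up.
bgc_demo <- data.frame(
  PRES = c(4.3, 6, 7.9, 10),
  TEMP = c(13.1, 13.1, 13.1, 13.1),
  DOXY = c(NA, 250.1, NA, 249.8),
  CHLA = c(NA, NA, NA, 0.12)
)

# Columns treated as BGC variables in this sketch.
bgc_vars <- c("DOXY", "CHLA")

# TRUE for rows where at least one BGC variable is not missing.
has_bgc <- rowSums(!is.na(bgc_demo[bgc_vars])) > 0

# Keep only those rows.
bgc_only <- bgc_demo[has_bgc, ]
```

The same row filter should work on a tibble, since it only relies on column subsetting and logical row indexing.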