ArgoCanada / argoFloats

Tools for analyzing collections of oceanographic Argo floats
https://argocanada.github.io/argoFloats/index.html

End of FTP access to Argo data through US GDAC #599

Open richardsc opened 11 months ago

richardsc commented 11 months ago

From the argo-dm mailing list (see below). I don't think this affects us, but we might need to update the server locations to remove any FTP options that are no longer provided.

Hello,

FTP will be disallowed on the US GDAC / ARGO in the near future. We don’t know an exact date. We don’t expect to be given an exact date either. It could be as early as this month.

At that point ftp’s to ftp.usgodae.org (for puts) and usgodae.org (for gets) will cease to work.

I am in communication separately with the folks that provide the data. This email is intended for the folks that download the data via ftp.

Folks that have been using ftp to download data from the US GDAC / ARGO should transition to using wget or a similar tool that works over https. Downloading with wget over https is currently supported.

It is uncertain whether the US GDAC / ARGO will support SFTP downloads. The safe bet is to transition to wget over https.

Thanks,

dankelley commented 11 months ago

$ git grep -in usgodae

yields as below.

defaults.R:31:#' `argoDefaultServer()` defaults to `c("ifremer-https","usgodae")`.
defaults.R:99:    argoOptionValue("argoFloats.server", "R_ARGOFLOATS_SERVER", c("ifremer-https", "usgodae"))
extdata.R:68:#' \code{ftp://usgodae.org/pub/outgoing/argo/dac/aoml/5903586/profiles/SD5903586_001.nc}
get.R:257:#' `ftp://usgodae.org/pub/outgoing/argo`
get.R:328:#' `"ftp://usgodae.org/pub/outgoing/argo"`.
get.R:330:#' to with nicknames `"ifremer-https"`, `"ifremer"`and  `"usgodae"`.
get.R:427:    # ftp://usgodae.org/pub/outgoing/argo/dac/aoml/1900710/1900710_prof.nc
get.R:433:        "usgodae"="ftp://usgodae.org/pub/outgoing/argo")
get.R:438:        stop("server must be NULL, \"ifremer-https\", \"usgodae\", \"ifremer\", or a vector of urls, but it is ",
get.R:789:#' a different pattern on the USGODAE an Ifremer servers, and
get.R:896:        # I *thought* the USGODAE and IFREMER servers were once set up differently, with only usgodae having "dac" in the path
get.R:898:        # way, so the Ifremer case was rewritten to match the usgodae case.

so these are the things I'll need to think about.

dankelley commented 11 months ago

By the way, I do not plan to ask CRAN for an update to the package. That's because the package automatically works down through its list of known servers if a connection cannot be made.
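
As a rough illustration of that fallback idea (not the package's actual code), a minimal sketch might look like the following. The names `tryServers` and `fakeDownload` are hypothetical, and `https://example.org/argo` is a placeholder URL rather than a real Argo server.

```r
# Hypothetical helper: try each server in turn and return the first success.
# `download` is any function(url) that returns a non-NULL value or signals an error.
tryServers <- function(servers, download) {
    for (server in servers) {
        # Convert an error into NULL so that the loop moves on to the next server.
        result <- tryCatch(download(server), error = function(e) NULL)
        if (!is.null(result)) {
            return(list(server = server, result = result))
        }
    }
    stop("could not connect to any of: ", paste(servers, collapse = ", "))
}

# Stand-in for a real download: the usgodae FTP URL fails, anything else succeeds.
fakeDownload <- function(url) {
    if (grepl("^ftp://usgodae.org", url)) stop("Timeout was reached")
    paste("index from", url)
}

# The second entry is a placeholder, not a real server address.
servers <- c("ftp://usgodae.org/pub/outgoing/argo", "https://example.org/argo")
got <- tryServers(servers, fakeDownload)
print(got$server)
```

With a scheme like this, losing one server only matters if every other entry in the list also fails.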

dankelley commented 11 months ago

Um, I can't even get in to https://usgodae.org/argo/argo.html at the moment. (And that is a link referred to in lots of places, including UCSD pages, etc.)

Possibly my machine is a bit haywire, though. I've been doing Dal-admin things and that often seems to screw up my web connections. But grading is important at this time of year.

The original email would have been more helpful if it had stated a mapping from an old address for a particular data item (ftp.SOMETHING_NEW) to a new one (SOMETHING_ELSE).

I have never found the US server to be reliable, to be honest. I guess their whole governmental system is so obsessed with border and abortion issues that other things fade into the background.

dankelley commented 11 months ago

When I use getIndex(server="USGODAE", debug=3) I see that the download is from

ftp://usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz

so the question I need to solve is where to get the file? Do I just put https: instead of ftp:?
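
A minimal sketch of that substitution, assuming the path on the server stays the same and only the scheme changes (which this sketch cannot confirm). The helper name `swapScheme` is hypothetical.

```r
# Hypothetical helper: replace a leading "ftp://" with another scheme.
# URLs that do not start with "ftp://" are returned unchanged.
swapScheme <- function(url, scheme = "https") {
    sub("^ftp://", paste0(scheme, "://"), url)
}

ftpUrl <- "ftp://usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz"
httpsUrl <- swapScheme(ftpUrl)
print(httpsUrl)

# To test the result (needs network access and the curl package):
# curl::curl_download(httpsUrl, "1.gz")
```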

dankelley commented 11 months ago

Hm, these two fail:

> curl::curl_download("http://usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz", "1.gz")
Error in curl::curl_download("http://usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz",  : 
  Timeout was reached: [] Failed to connect to usgodae.org port 80 after 10004 ms: Timeout was reached
> curl::curl_download("https://usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz", "1.gz")
Error in curl::curl_download("https://usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz",  : 
  Timeout was reached: [] Failed to connect to usgodae.org port 443 after 10002 ms: Timeout was reached

dankelley commented 11 months ago

These also fail:

> curl::curl_download("https://www.usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz", "1.gz")
Error in curl::curl_download("https://www.usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz",  : 
  Timeout was reached: [] Failed to connect to www.usgodae.org port 443 after 10002 ms: Timeout was reached
> curl::curl_download("http://www.usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz", "1.gz")
Error in curl::curl_download("http://www.usgodae.org/pub/outgoing/argo/ar_index_global_prof.txt.gz",  : 
  Timeout was reached: [] Failed to connect to www.usgodae.org port 80 after 10002 ms: Timeout was reached

dankelley commented 11 months ago

@richardsc any idea of the URL we are supposed to use now? I have tried the addresses in my notes (see the previous 2 comments) and the one given at http://www.met.reading.ac.uk/~marc/eResearch/argo/processData/download/ but I cannot connect to anything I try for the usgodae server.

Basically, forgetting about downloading, every web search I do for usgodae gives me a link that I cannot access. Not sure if it's my machine, though, because (as noted earlier) things get messed up when I have to do work logged in to the Dal exchange system.

dankelley commented 11 months ago

Maybe JH or other BIO staff can be entrained into this search for usgodae.

richardsc commented 11 months ago

Yeah, something seems wrong.

https://usgodae.org/argo/argo.html

Works for me, but when I click the http link there:

[image: screenshot of the link]

I get:

[image: screenshot of the result]

This is not an emergency for argoFloats, as I think we can still access the IFREMER https server, and in fact the default is to use it over the USGODAE server anyway:

If the argoFloats.server option has not been set in R, and ‘R_ARGOFLOATS_SERVER’ has not been set in the OS, then ‘argoDefaultServer()’ defaults to c("ifremer-https","usgodae")
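
To make that lookup order concrete, here is a minimal sketch of how such a default could be resolved: R option first, then the environment variable, then the built-in default. The function `lookupServer` is a hypothetical stand-in, not the package's own code. Only the option name, the environment variable name and the default vector are taken from the documentation quoted above.

```r
# Hypothetical stand-in for the lookup order described above.
lookupServer <- function(optionName = "argoFloats.server",
                         envName = "R_ARGOFLOATS_SERVER",
                         default = c("ifremer-https", "usgodae")) {
    # 1. An R option, if set, takes priority.
    fromOption <- getOption(optionName)
    if (!is.null(fromOption)) return(fromOption)
    # 2. Otherwise use the OS environment variable, if it is non-empty.
    fromEnv <- Sys.getenv(envName, unset = "")
    if (nzchar(fromEnv)) return(fromEnv)
    # 3. Otherwise fall back to the built-in default.
    default
}

# Pin the session to the Ifremer https server, then restore the previous setting.
old <- options(argoFloats.server = "ifremer-https")
pinned <- lookupServer()
options(old)
print(pinned)
```

Under this ordering, a user who wants to avoid the USGODAE server entirely could set the option once at the top of a script.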