jpjones76 / SeisIO.jl

Julia language support for geophysical time series data
http://seisio.readthedocs.org

Add Northern California Earthquake Data Center (NCEDC) for getdata(). #14

Closed: kura-okubo closed this 5 years ago

kura-okubo commented 5 years ago

Hello Josh,

I implemented src/Web/NCEDC.jl to download data from the Northern California Earthquake Data Center (NCEDC). Here is an example:

S = get_data("NCEDC","BP.RMNB..BP1",s="2002-01-02T00:00:00",t="2002-01-02T01:00:00", src="NCEDC", w=true)

For NCEDC requests, src="NCEDC" must be specified so that the proper NCEDC URL is assigned. Note that POST requests to NCEDC need the following headers:

headers = ["Host" => "service.ncedc.org", "User-Agent" => "curl/7.60.0", "Accept" => "*/*"]

I ran the SeisIO tests and they passed, so this should not conflict with existing functions.
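The POST request described above can be sketched roughly as follows. This is only an illustration, not the code in src/Web/NCEDC.jl: the helper name fdsn_post_body is hypothetical, the request-line layout follows the FDSN dataselect POST convention (one "NET STA LOC CHA START END" line per channel, with "--" for an empty location code), and the query URL in the commented-out call is an assumption.

```julia
# Hypothetical helper, not part of SeisIO: build an FDSN dataselect POST body.
# Each request line is "NET STA LOC CHA START END"; an empty location code
# is written as "--".
function fdsn_post_body(channel::AbstractString, s::AbstractString, t::AbstractString)
    net, sta, loc, cha = split(channel, '.')   # split keeps the empty location field
    loc = isempty(loc) ? "--" : loc
    return string(net, " ", sta, " ", loc, " ", cha, " ", s, " ", t, "\n")
end

# Headers quoted from the pull request description.
headers = ["Host" => "service.ncedc.org",
           "User-Agent" => "curl/7.60.0",
           "Accept" => "*/*"]

body = fdsn_post_body("BP.RMNB..BP1", "2002-01-02T00:00:00", "2002-01-02T01:00:00")

# Sending the request needs the HTTP.jl package; the URL below is an assumption.
# using HTTP
# r = HTTP.post("https://service.ncedc.org/fdsnws/dataselect/1/query", headers, body)
```

The network call is left commented out so the sketch runs without HTTP.jl installed and without contacting the server.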

Best, Kura

jpjones76 commented 5 years ago

Thank you! I really appreciate that you're all trying to help with this.

I didn't realize that the NCEDC servers aren't true FDSN. They're missing a number of specifications in Table 2 of https://www.fdsn.org/webservices/FDSN-WS-Specifications-1.1.pdf , all of which are mandatory.

Can you confirm that you've tried a manual NCEDC request with options like format (e.g., through their website), and can you confirm that requests with those options fail?

If so, there may be a shorter workaround than duplicating FDSN.jl.
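For reference, a manual request of the kind asked about above could be assembled like this. It is a minimal sketch: dataselect_url is a hypothetical helper, the base URL is an assumption, and the parameter names (net, sta, cha, starttime, endtime, format) are the ones defined for fdsnws-dataselect in the FDSN specification.

```julia
# Hypothetical helper: assemble an fdsnws-dataselect GET URL from keyword options.
# Keyword order is preserved, so the query string mirrors the call.
function dataselect_url(base::AbstractString; kwargs...)
    query = join(("$k=$v" for (k, v) in kwargs), "&")
    return string(base, "?", query)
end

# The base URL is an assumption; the channel and time window come from the example request.
url = dataselect_url("https://service.ncedc.org/fdsnws/dataselect/1/query";
                     net="BP", sta="RMNB", cha="BP1",
                     starttime="2002-01-02T00:00:00",
                     endtime="2002-01-02T01:00:00",
                     format="miniseed")
```

Pasting the resulting URL into a browser, with and without the format option, would show whether the server accepts or rejects that option.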

Miscellaneous questions:

  1. SeisIO's current file naming convention for FDSN requests is identical to that of IRIS. Who uses year_month_JulianDay?

    • j ("Julian day") in that code is the day of the year. Today is Julian day 147 of 2019.
    • Why not split file names on "_" (underscore), parse the year and Julian date, and rename the file based on j2md(year, jday)?
  2. NCEDC is already in seis_www. Why add it twice?

  3. Are there tests to accompany this pull request, or am I expected to write those myself?
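The renaming suggested in question 1 might look something like the sketch below. It uses the Dates standard library as a stand-in for SeisIO's j2md, and the file name layout (year_JulianDay_rest) and both function names are hypothetical, chosen only to illustrate the split-and-parse idea.

```julia
using Dates

# Stand-in for j2md: convert (year, day of year) to (month, day of month).
function jday_to_md(year::Integer, jday::Integer)
    d = Date(year) + Day(jday - 1)   # Date(year) is January 1 of that year
    return month(d), day(d)
end

# Hypothetical layout "YYYY_JJJ_rest": split on "_", parse the year and
# Julian day, and rebuild the name as "YYYY_MM_DD_rest".
function rename_by_date(name::AbstractString)
    parts = split(name, '_'; limit=3)
    year = parse(Int, parts[1])
    jday = parse(Int, parts[2])
    m, d = jday_to_md(year, jday)
    return string(year, "_", lpad(m, 2, '0'), "_", lpad(d, 2, '0'), "_", parts[3])
end

newname = rename_by_date("2019_147_BP.RMNB..BP1.mseed")
```

With limit=3, any underscores in the remainder of the file name are left untouched.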

jpjones76 commented 5 years ago

With sincere apologies, I can't accept this pull request. Please read CONTRIBUTE.md before making future pull requests.

In particular:

Finally, but perhaps most importantly: as far as I can tell, the NCEDC server isn't true FDSN and it barely works.

In addition to the fact that they're missing numerous specifications in Table 2 of https://www.fdsn.org/webservices/FDSN-WS-Specifications-1.1.pdf , 100% of my data requests (and most of my station XML requests) time out. This includes requests submitted in various browsers with their web form; requests from two computers on two networks (Xfinity and Sprint); requests using Julia's HTTP package and SeisIO get_data; and requests made in two operating systems (Windows 10 and Ubuntu 18.04.2 LTS).

The result is nearly always a gateway timeout error:

ERROR: HTTP.ExceptionRequest.StatusError(504, HTTP.Messages.Response:
"""
HTTP/1.1 504 Gateway Time-out

Does their server run on dial-up? Are their computers like the ones in Captain Marvel? I don't understand. In any case, I can't justify adding this much code for an unreliable server that doesn't comply with FDSN standards.