NOAA-OWP / wres

Code and scripts for the Water Resources Evaluation Service

As a user, I would like WRES (ideally COWRES too) to be able to read NWM data from google cloud storage #75

Open epag opened 3 weeks ago

epag commented 3 weeks ago

Author Name: Jesse (Jesse)
Original Redmine Issue: 103302, https://vlab.noaa.gov/redmine/issues/103302
Original Date: 2022-04-06


Given potential data flow issues and a desire to compare NWM datasets to either each other or to observations
When I ask WRES to look at google cloud storage
Then I expect it to read it successfully (because it already does so from D-Store, NOMADS, Amazon S3, and the filesystem)
So that I can do the comparisons I want to do

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Jesse (Jesse)
Original Date: 2022-04-06T14:29:30Z


I tried it on prod like this:

<?xml version="1.0" encoding="UTF-8"?>
<project name="Diff of d-store vs google cloud bucket (expected to fail due to missing auth to google)">
    <inputs>
        <left label="dstore">
            <type>simulations</type>
            <source interface="nwm_analysis_assim_extend_no_da_channel_rt_conus">https://nwcal-dstore.[host]/nwm/2.1/</source>
            <variable>streamflow</variable>
        </left>
        <right label="s3">
            <type>simulations</type>
            <source interface="nwm_analysis_assim_extend_no_da_channel_rt_conus">https://storage.cloud.google.com/national-water-model/</source>
            <variable>streamflow</variable>
        </right>
    </inputs>

    <pair>
        <unit>m3/s</unit>
        <feature left="18384141" />
        <leadHours minimum="-29" maximum="29" />
        <dates earliest="2022-03-29T00:00:00Z" latest="2022-04-04T00:00:00Z" />
        <issuedDates earliest="2022-03-31T00:00:00Z" latest="2022-04-02T00:00:00Z" />

        <validDatesPoolingWindow>
            <period>1</period>
            <unit>hours</unit>
        </validDatesPoolingWindow>
    </pair>

    <metrics>
        <metric><name>sample size</name></metric>
        <metric><name>mean absolute error</name></metric>
    </metrics>
    <outputs>
        <destination type="csv2" />
        <destination type="pairs" />
    </outputs>

</project>

The stack trace that resulted:

2022-04-06T14:21:58.509+0000 ERROR Main Operation 'execute' completed unsuccessfully
wres.pipeline.InternalWresException: Could not complete project execution
    at wres.pipeline.Evaluator.evaluate(Evaluator.java:323)
    at wres.pipeline.Evaluator.evaluate(Evaluator.java:182)
    at wres.MainFunctions.execute(MainFunctions.java:134)
    at wres.MainFunctions.call(MainFunctions.java:96)
    at wres.Main.main(Main.java:113)
Caused by: wres.pipeline.WresProcessingException: Encountered an error while processing evaluation 'ELppMCn4ZK3qB28-AXzR2de7v60': 
    at wres.pipeline.ProcessorHelper.processEvaluation(ProcessorHelper.java:274)
    at wres.pipeline.Evaluator.evaluate(Evaluator.java:298)
    ... 4 common frames omitted
Caused by: wres.pipeline.WresProcessingException: Project failed to complete with the following error: 
    at wres.pipeline.ProcessorHelper.processProjectConfig(ProcessorHelper.java:526)
    at wres.pipeline.ProcessorHelper.processEvaluation(ProcessorHelper.java:212)
    ... 5 common frames omitted
Caused by: wres.io.reading.IngestException: An ingest task could not be completed.
    at wres.io.Operations.doIngestWork(Operations.java:426)
    at wres.io.Operations.ingest(Operations.java:338)
    at wres.pipeline.ProcessorHelper.processProjectConfig(ProcessorHelper.java:381)
    ... 6 common frames omitted
Caused by: java.util.concurrent.CompletionException: wres.io.reading.PreIngestException: Failed to open netCDF resource.
    at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
    at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: wres.io.reading.PreIngestException: Failed to open netCDF resource.
    at wres.io.reading.nwm.NWMTimeSeries.<init>(NWMTimeSeries.java:209)
    at wres.io.reading.nwm.NWMReader.ingest(NWMReader.java:476)
    at wres.io.reading.nwm.NWMReader.call(NWMReader.java:358)
    at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
    ... 3 common frames omitted
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Server does not support byte Ranges
    at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at wres.io.reading.nwm.NWMTimeSeries.<init>(NWMTimeSeries.java:187)
    ... 6 common frames omitted
Caused by: java.io.IOException: Server does not support byte Ranges
    at ucar.unidata.io.http.HTTPRandomAccessFile.<init>(HTTPRandomAccessFile.java:97)
    at ucar.unidata.io.http.HTTPRandomAccessFile.<init>(HTTPRandomAccessFile.java:51)
    at ucar.nc2.NetcdfFile.getRaf(NetcdfFile.java:510)
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:395)
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:360)
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:344)
    at ucar.nc2.NetcdfFile.open(NetcdfFile.java:330)
    at wres.io.reading.nwm.NWMTimeSeries$NWMResourceOpener.call(NWMTimeSeries.java:1312)
    at wres.io.reading.nwm.NWMTimeSeries$NWMResourceOpener.call(NWMTimeSeries.java:1299)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    ... 3 common frames omitted

I would be surprised if the actual object storage didn't support ranges. Looking at it with curl, the error appears to be a roundabout way of saying "I need authentication" (note the redirect to accounts.google.com):

$ curl -v --head https://storage.cloud.google.com/national-water-model/nwm.20220401/analysis_assim_extend_no_da/nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc
* STATE: INIT => CONNECT handle 0x80008f018; line 1789 (connection #-5000)
* Added connection 0. The cache now contains 1 members
* STATE: CONNECT => RESOLVING handle 0x80008f018; line 1835 (connection #0)
* family0 == v4, family1 == v6
*   Trying 142.251.45.14:443...
* STATE: RESOLVING => CONNECTING handle 0x80008f018; line 1917 (connection #0)
* Connected to storage.cloud.google.com (142.251.45.14) port 443 (#0)
* STATE: CONNECTING => PROTOCONNECT handle 0x80008f018; line 1980 (connection #0)
* ALPN, offering h2
* ALPN, offering http/1.1
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
*  CApath: none
* Didn't find Session ID in cache for host HTTPS://storage.cloud.google.com:443
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* STATE: PROTOCONNECT => PROTOCONNECTING handle 0x80008f018; line 2000 (connection #0)
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.googlecode.com
*  start date: Mar 17 11:13:49 2022 GMT
*  expire date: Jun  9 11:13:48 2022 GMT
*  subjectAltName: host "storage.cloud.google.com" matched cert's "*.cloud.google.com"
*  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
*  SSL certificate verify ok.
* STATE: PROTOCONNECTING => DO handle 0x80008f018; line 2019 (connection #0)
* Using HTTP2, server supports multiplexing
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x80008f018)
> HEAD /national-water-model/nwm.20220401/analysis_assim_extend_no_da/nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc HTTP/2
> Host: storage.cloud.google.com
> user-agent: curl/7.80.0
> accept: */*
>
* STATE: DO => DID handle 0x80008f018; line 2099 (connection #0)
* multi changed, check CONNECT_PEND queue!
* STATE: DID => PERFORMING handle 0x80008f018; line 2218 (connection #0)
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Didn't find Session ID in cache for host HTTPS://storage.cloud.google.com:443
* Added Session ID to cache for HTTPS://storage.cloud.google.com:443 [server]
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* Found Session ID in cache for host HTTPS://storage.cloud.google.com:443
* old SSL session ID is stale, removing
* Added Session ID to cache for HTTPS://storage.cloud.google.com:443 [server]
* HTTP/2 found, allow multiplexing
< HTTP/2 302
HTTP/2 302
< content-type: application/binary
content-type: application/binary
< location: https://accounts.google.com/ServiceLogin?service=cds&passive=1209600&continue=https://storage.cloud.google.com/national-water-model/nwm.20220401/analysis_assim_extend_no_da/nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc&followup=https://storage.cloud.google.com/national-water-model/nwm.20220401/analysis_assim_extend_no_da/nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc
location: https://accounts.google.com/ServiceLogin?service=cds&passive=1209600&continue=https://storage.cloud.google.com/national-water-model/nwm.20220401/analysis_assim_extend_no_da/nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc&followup=https://storage.cloud.google.com/national-water-model/nwm.20220401/analysis_assim_extend_no_da/nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc
< content-length: 0
content-length: 0
< report-to: {"group":"ATmXEA_u_QYNPUEYJqM0rPQf_UXcdXOzYIbdkLtbY_ev79gHwWch9Ut5","max_age":2592000,"endpoints":[{"url":"https://csp.withgoogle.com/csp/report-to/encsid_ATmXEA_u_QYNPUEYJqM0rPQf_UXcdXOzYIbdkLtbY_ev79gHwWch9Ut5"}]}
report-to: {"group":"ATmXEA_u_QYNPUEYJqM0rPQf_UXcdXOzYIbdkLtbY_ev79gHwWch9Ut5","max_age":2592000,"endpoints":[{"url":"https://csp.withgoogle.com/csp/report-to/encsid_ATmXEA_u_QYNPUEYJqM0rPQf_UXcdXOzYIbdkLtbY_ev79gHwWch9Ut5"}]}
< cross-origin-opener-policy-report-only: same-origin; report-to="ATmXEA_u_QYNPUEYJqM0rPQf_UXcdXOzYIbdkLtbY_ev79gHwWch9Ut5"
cross-origin-opener-policy-report-only: same-origin; report-to="ATmXEA_u_QYNPUEYJqM0rPQf_UXcdXOzYIbdkLtbY_ev79gHwWch9Ut5"
< date: Wed, 06 Apr 2022 14:24:32 GMT
date: Wed, 06 Apr 2022 14:24:32 GMT
< server: ESF
server: ESF
< x-xss-protection: 0
x-xss-protection: 0
< x-frame-options: SAMEORIGIN
x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
x-content-type-options: nosniff
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

<
* STATE: PERFORMING => DONE handle 0x80008f018; line 2417 (connection #0)
* multi_done
* Connection #0 to host storage.cloud.google.com left intact
* Expire cleared (transfer 0x80008f018)

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Jesse (Jesse)
Original Date: 2022-04-06T15:07:03Z


See https://cloud.google.com/storage/docs/authentication#libauth and following.

Due to the complexity of managing and refreshing access tokens and the security risk when dealing directly with cryptographic applications, we strongly encourage you to use a verified client library.

Verified client libraries? Those appear to be talking more about read/write operations such as "create a bucket" or "move these objects from bucket A to bucket B."

https://cloud.google.com/storage/docs/downloading-objects#downloading-an-object

It looks like we would need to add an HTTP header, which might be tricky considering that the netcdf-java library is in charge of sending the requests. Then again, it currently uses apache httpclient, so we can probably get the header in there. We wouldn't want the credentials to show up in project declarations, however, so we would either want a COWRES service account that is used by default when accessing the gcloud, or we would want to let callers specify a location from which to read the authorization token. The latter option unfortunately exposes the authorization to other users.
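
To make that concrete, here is a minimal sketch of the header injection, assuming a token obtained through Application Default Credentials with google-auth-library-java and an Apache HttpClient 4.x instance that the reading code could be handed. None of this wiring exists in WRES today; the class name and scope are illustrative, and a real implementation would refresh the token per request rather than capturing it once:

import java.io.IOException;
import java.util.List;

import org.apache.http.HttpRequestInterceptor;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

import com.google.auth.oauth2.GoogleCredentials;

public class GcsAuthSketch
{
    public static CloseableHttpClient authenticatedClient() throws IOException
    {
        // Application Default Credentials, e.g. a COWRES service account key
        // referenced by GOOGLE_APPLICATION_CREDENTIALS, so that nothing
        // credential-like appears in project declarations.
        GoogleCredentials credentials = GoogleCredentials.getApplicationDefault()
                .createScoped( List.of( "https://www.googleapis.com/auth/devstorage.read_only" ) );
        credentials.refreshIfExpired();
        String token = credentials.getAccessToken().getTokenValue();

        // Attach the bearer token to every outgoing request. A production
        // version would refresh inside the interceptor, since tokens expire.
        return HttpClients.custom()
                .addInterceptorFirst( (HttpRequestInterceptor) ( request, context ) ->
                        request.addHeader( "Authorization", "Bearer " + token ) )
                .build();
    }
}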

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Jesse (Jesse)
Original Date: 2022-04-06T15:12:56Z


Added checklist.

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Jesse (Jesse)
Original Date: 2022-04-06T15:16:14Z


Last comment for today: Range requests are supported, per https://cloud.google.com/storage/docs/downloading-objects#downloading-an-object-portion
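
A quick way to verify this outside of WRES is to issue a ranged GET with the JDK's built-in HTTP client and check for a 206 Partial Content response. The probe below is just a sketch: it assumes the documented storage.googleapis.com endpoint rather than the browser-oriented storage.cloud.google.com host that redirected above, and if that endpoint also demands authentication, an Authorization header like the one sketched earlier would be needed:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RangeProbe
{
    public static void main( String[] args ) throws Exception
    {
        // Object path copied from the curl test above; only the host differs.
        String url = "https://storage.googleapis.com/national-water-model/"
                     + "nwm.20220401/analysis_assim_extend_no_da/"
                     + "nwm.t16z.analysis_assim_extend_no_da.channel_rt.tm01.conus.nc";

        HttpRequest request = HttpRequest.newBuilder( URI.create( url ) )
                                         .header( "Range", "bytes=0-1023" ) // first 1 KiB only
                                         .GET()
                                         .build();

        HttpResponse<byte[]> response = HttpClient.newHttpClient()
                                                  .send( request, HttpResponse.BodyHandlers.ofByteArray() );

        // 206 Partial Content means the Range header was honoured;
        // 200 with the full object means it was ignored.
        System.out.println( response.statusCode() + ": " + response.body().length + " bytes" );
    }
}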

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Hank (Hank)
Original Date: 2023-03-29T17:18:37Z


James:

I'm asking in this ticket because it came up at the top of a search for "S3 bucket"...

Can you confirm? Does the WRES currently have the ability to read from Amazon S3?

If I find a better ticket in which to ask this question, I'll move it. Thanks,

Hank

epag commented 3 weeks ago

Original Redmine Comment
Author Name: James (James)
Original Date: 2023-03-29T17:23:08Z


No. All S3 utilities were removed in commit:wres|90477b619b0d4b4ab4a925eb37bb956b26883911.

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Hank (Hank)
Original Date: 2023-03-29T17:27:24Z


Got it. Thanks. Weird that that commit is 3 years old, yet this ticket was written only a year ago, stating in the Description:

Then I expect it to read it successfully (because it already does so from D-Store, NOMADS, Amazon S3, and the filesystem)

Whatever. Chris asked about Amazon S3, so the MaaS/NextGen folks may be looking into using datasets from Amazon.

Hank

epag commented 3 weeks ago

Original Redmine Comment
Author Name: James (James)
Original Date: 2023-03-29T17:41:22Z


Ah, I think I misunderstood your question, but now I see the title of this ticket. If you're asking about the S3-hosted NWM data specifically, then I think that should be handled by the NWM reading code and/or the WebClient, as indicated in the commit message. In that regard, S3 is like d-store: the same code is used for all hosts. However, we're talking about unauthenticated hosts. The NWM data always has a very specific API/directory structure. You would need to look through #65216 for details and perhaps try one or two evaluations in there, but I think the S3-hosted NWM data should still be supported. Of course, it may be broken, as no one uses it and we don't have unit or system tests for that.
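
For anyone who wants to try it, the declaration would presumably look like the google one earlier in this ticket with only the source URL swapped, along the lines of this untested fragment (the noaa-nwm-pds bucket name on the AWS Open Data program is an assumption here; #65216 would have the URLs that were actually exercised):

        <right label="s3">
            <type>simulations</type>
            <source interface="nwm_analysis_assim_extend_no_da_channel_rt_conus">https://noaa-nwm-pds.s3.amazonaws.com/</source>
            <variable>streamflow</variable>
        </right>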

epag commented 3 weeks ago

Original Redmine Comment
Author Name: Hank (Hank)
Original Date: 2023-03-29T18:12:09Z


Got it. Thanks,

Hank