bopen / c3s-eqc-toolbox-template

CADS Toolbox template application
Apache License 2.0

Authorize CERRA for WP5 #64

Closed sandrocalmanti closed 1 year ago

sandrocalmanti commented 1 year ago

Describe the solution you'd like

Hi @malmans2

I'm trying to work with this new dataset

# Define request
collection_id = "reanalysis-cerra-single-levels"
request = {
    "product_type": "reanalysis",
    "format": "grib",
    "variable": "snow_depth",
    "level_type": "surface_or_atmosphere",
    "time": ["00:00"],
}
requests = download.update_request_date(request, start=start, stop=stop)

and I get this error:

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cdsapi/api.py:464, in Client._api(self, url, request, method)
    459             e.append(
    460                 "To access this resource, you first need to accept the terms"
    461                 "of '%s' at %s" % (t["title"], t["url"])
    462             )
    463         error = ". ".join(e)
--> 464     raise Exception(error)
    465 else:
    466     raise

I have this dataset authorized in my personal API client. How do I use my personal credentials with the download package? Or can we have CERRA authorized for WP5?

Cheers

S.

malmans2 commented 1 year ago

Hi @sandrocalmanti ,

What's the full traceback? If the problem is authorisation, you should use your own CDS credentials. I don't know who owns the shared one (probably @vincenzodetoma).

For example, put your cdsapirc in your folder on the VM, and add this at the top of your notebook:

import os
os.environ["CDSAPI_RC"] = os.path.expanduser("~/calmanti_sandro/.cdsapirc")

Do you know how to get your cdsapirc? See: https://cds.climate.copernicus.eu/api-how-to

To authorize your account, log in and accept the terms of use here: https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-cerra-single-levels?tab=form
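Before rerunning the notebook, it can help to check that the file `CDSAPI_RC` points to exists and contains both fields. This is a minimal sketch, assuming the usual two-line `url:` / `key:` layout of a `.cdsapirc`; the helper names `read_cdsapirc` and `check_cdsapirc` are made up for illustration and are not part of `cdsapi`:

```python
import os


def read_cdsapirc(path):
    """Parse 'name: value' lines from a .cdsapirc-style file into a dict."""
    config = {}
    with open(os.path.expanduser(path)) as f:
        for line in f:
            if ":" in line:
                # Split on the first colon only, since URLs and keys contain colons.
                name, value = line.strip().split(":", 1)
                config[name.strip()] = value.strip()
    return config


def check_cdsapirc(path):
    """Return the names of the required fields that are missing or empty."""
    config = read_cdsapirc(path)
    return [name for name in ("url", "key") if not config.get(name)]
```

Calling `check_cdsapirc(os.environ["CDSAPI_RC"])` returns an empty list when both fields are present. It only checks the file layout, not whether the terms of use have been accepted for a given dataset.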

sandrocalmanti commented 1 year ago

Thank you @malmans2

Below is the traceback I get after adding your lines at the top of the notebook, copying my local .cdsapirc to my home on the VM, and accepting the terms of use. I will check with @vincenzodetoma as well.

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cdsapi/api.py:442, in Client._api(self, url, request, method)
    441 try:
--> 442     result.raise_for_status()
    443     reply = result.json()

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/requests/models.py:1021, in Response.raise_for_status(self)
   1020 if http_error_msg:
-> 1021     raise HTTPError(http_error_msg, response=self)

HTTPError: 400 Client Error:  for url: https://cds.climate.copernicus.eu/api/v2/resources/%7B'product_type':%20'reanalysis',%20'format':%20'grib',%20'variable':%20'snow_depth',%20'level_type':%20'surface_or_atmosphere',%20'time':%20%5B'00:00'%5D,%20'year':%20%5B1981,%201982,%201983,%201984,%201985,%201986,%201987,%201988,%201989,%201990,%201991,%201992,%201993,%201994,%201995,%201996,%201997,%201998,%201999,%202000,%202001,%202002,%202003,%202004,%202005,%202006,%202007,%202008,%202009,%202010,%202011,%202012,%202013,%202014,%202015,%202016,%202017,%202018,%202019%5D,%20'month':%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010,%2011,%2012%5D,%20'day':%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010,%2011,%2012,%2013,%2014,%2015,%2016,%2017,%2018,%2019,%2020,%2021,%2022,%2023,%2024,%2025,%2026,%2027,%2028,%2029,%2030,%2031%5D%7D

During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 ds = download.download_and_transform(
      2     *requests,
      3     transform_func=regionalise_and_dayofyear_reindex,
      4     transform_func_kwargs={
      5         "lon_slice": lon_slice,
      6         "lat_slice": lat_slice,
      7         "years_start": years_start,
      8         "years_stop": years_stop,
      9     },
     10 )

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:545, in download_and_transform(collection_id, requests, chunks, split_all, transform_func, transform_func_kwargs, transform_chunks, n_jobs, invalidate_cache, cached_open_mfdataset_kwargs, **open_mfdataset_kwargs)
    540             cacholote.delete(
    541                 func.func, *func.args, request_list=[request], **func.keywords
    542             )
    543         with cacholote.config.set(return_cache_entry=True):
    544             sources.append(
--> 545                 func(request_list=[request]).result["args"][0]["href"]
    546             )
    547     ds = xr.open_mfdataset(sources, **cached_open_mfdataset_kwargs)
    548 else:
    549     # Cache final dataset transformed

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cacholote/cache.py:86, in cacheable.<locals>.wrapper(*args, **kwargs)
     83             warnings.warn(str(ex), UserWarning)
     84             clean._delete_cache_entry(session, cache_entry)
---> 86 result = func(*args, **kwargs)
     87 cache_entry = database.CacheEntry(
     88     key=hexdigest,
     89     expiration=settings.expiration,
     90     tag=settings.tag,
     91 )
     92 try:

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:405, in _download_and_transform_requests(collection_id, request_list, transform_func, transform_func_kwargs, **open_mfdataset_kwargs)
    394 @cacholote.cacheable
    395 def _download_and_transform_requests(
    396     collection_id: str,
   (...)
    403     # However, there is not a consistent behavior across backends.
    404     # For example, GRIB silently ignore open_mfdataset_kwargs
--> 405     sources = get_sources(collection_id, request_list)
    406     try:
    407         engine = open_mfdataset_kwargs.get(
    408             "engine",
    409             {xr.backends.plugins.guess_engine(source) for source in sources},
    410         )

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:305, in get_sources(collection_id, request_list, exclude)
    302 source: set[str] = set()
    304 for request in request_list if len(request_list) == 1 else tqdm.tqdm(request_list):
--> 305     data = _cached_retrieve(collection_id, request)
    306     if content := getattr(data, "_content", None):
    307         source.update(map(str, content))

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:294, in _cached_retrieve(collection_id, request)
    292 def _cached_retrieve(collection_id: str, request: dict[str, Any]) -> emohawk.Data:
    293     with cacholote.config.set(use_cache=True, return_cache_entry=False):
--> 294         return cads_toolbox.catalogue.retrieve(collection_id, request).data

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cads_toolbox/catalogue.py:79, in Remote.data(self)
     75 @property
     76 def data(self) -> emohawk.Data:
     77     """Object representing the requested data."""
     78     return emohawk.open(
---> 79         self.download(),
     80         exclude=["*.png", "*.json"],  # TODO: implement dataset-specific kwargs
     81     )

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cads_toolbox/catalogue.py:68, in Remote.download(self, target)
     66 if config.USE_CACHE:
     67     with cacholote.config.set(io_delete_original=True):
---> 68         obj = cacholote.cacheable(_download)(self.collection_id, self.request)
     69     if target:
     70         obj.fs.get(obj.path, str(target))

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cacholote/cache.py:86, in cacheable.<locals>.wrapper(*args, **kwargs)
     83             warnings.warn(str(ex), UserWarning)
     84             clean._delete_cache_entry(session, cache_entry)
---> 86 result = func(*args, **kwargs)
     87 cache_entry = database.CacheEntry(
     88     key=hexdigest,
     89     expiration=settings.expiration,
     90     tag=settings.tag,
     91 )
     92 try:

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cads_toolbox/catalogue.py:36, in _download(collection_id, request, target)
     28 def _download(
     29     collection_id: str,
     30     request: Dict[str, Any],
   (...)
     33     fsspec.spec.AbstractBufferedFile, fsspec.implementations.local.LocalFileOpener
     34 ]:
     35     client = cdsapi.Client()
---> 36     path = client.retrieve(collection_id, request).download(target)
     37     with fsspec.open(path, "rb") as f:
     38         return f

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cdsapi/api.py:364, in Client.retrieve(self, name, request, target)
    363 def retrieve(self, name, request, target=None):
--> 364     result = self._api("%s/resources/%s" % (self.url, name), request, "POST")
    365     if target is not None:
    366         result.download(target)

File /data/common/mambaforge/envs/wp5/lib/python3.10/site-packages/cdsapi/api.py:464, in Client._api(self, url, request, method)
    459             e.append(
    460                 "To access this resource, you first need to accept the terms"
    461                 "of '%s' at %s" % (t["title"], t["url"])
    462             )
    463         error = ". ".join(e)
--> 464     raise Exception(error)
    465 else:
    466     raise

Exception: <!doctype html><html lang="en"><head><title>HTTP Status 400 – Bad Request</title><style type="text/css">body {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 400 – Bad Request</h1><hr class="line" /><p><b>Type</b> Exception Report</p><p><b>Message</b> Invalid character found in the request target [&#47;broker&#47;api&#47;v2&#47;resources&#47;{&#39;product_type&#39;:%20&#39;reanalysis&#39;,%20&#39;format&#39;:%20&#39;grib&#39;,%20&#39;variable&#39;:%20&#39;snow_depth&#39;,%20&#39;level_type&#39;:%20&#39;surface_or_atmosphere&#39;,%20&#39;time&#39;:%20[&#39;00:00&#39;],%20&#39;year&#39;:%20[1981,%201982,%201983,%201984,%201985,%201986,%201987,%201988,%201989,%201990,%201991,%201992,%201993,%201994,%201995,%201996,%201997,%201998,%201999,%202000,%202001,%202002,%202003,%202004,%202005,%202006,%202007,%202008,%202009,%202010,%202011,%202012,%202013,%202014,%202015,%202016,%202017,%202018,%202019],%20&#39;month&#39;:%20[1,%202,%203,%204,%205,%206,%207,%208,%209,%2010,%2011,%2012],%20&#39;day&#39;:%20[1,%202,%203,%204,%205,%206,%207,%208,%209,%2010,%2011,%2012,%2013,%2014,%2015,%2016,%2017,%2018,%2019,%2020,%2021,%2022,%2023,%2024,%2025,%2026,%2027,%2028,%2029,%2030,%2031]}]. 
The valid characters are defined in RFC 7230 and RFC 3986</p><p><b>Description</b> The server cannot or will not process the request due to something that is perceived to be a client error (e.g., malformed request syntax, invalid request message framing, or deceptive request routing).</p><p><b>Exception</b></p><pre>java.lang.IllegalArgumentException: Invalid character found in the request target [&#47;broker&#47;api&#47;v2&#47;resources&#47;{&#39;product_type&#39;:%20&#39;reanalysis&#39;,%20&#39;format&#39;:%20&#39;grib&#39;,%20&#39;variable&#39;:%20&#39;snow_depth&#39;,%20&#39;level_type&#39;:%20&#39;surface_or_atmosphere&#39;,%20&#39;time&#39;:%20[&#39;00:00&#39;],%20&#39;year&#39;:%20[1981,%201982,%201983,%201984,%201985,%201986,%201987,%201988,%201989,%201990,%201991,%201992,%201993,%201994,%201995,%201996,%201997,%201998,%201999,%202000,%202001,%202002,%202003,%202004,%202005,%202006,%202007,%202008,%202009,%202010,%202011,%202012,%202013,%202014,%202015,%202016,%202017,%202018,%202019],%20&#39;month&#39;:%20[1,%202,%203,%204,%205,%206,%207,%208,%209,%2010,%2011,%2012],%20&#39;day&#39;:%20[1,%202,%203,%204,%205,%206,%207,%208,%209,%2010,%2011,%2012,%2013,%2014,%2015,%2016,%2017,%2018,%2019,%2020,%2021,%2022,%2023,%2024,%2025,%2026,%2027,%2028,%2029,%2030,%2031]}]. The valid characters are defined in RFC 7230 and RFC 3986
    org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:512)
    org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:503)
    org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
    org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:831)
    org.apache.tomcat.util.net.AprEndpoint$SocketWithOptionsProcessor.run(AprEndpoint.java:2045)
    java.base&#47;java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    java.base&#47;java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    java.base&#47;java.lang.Thread.run(Thread.java:829)
</pre><p><b>Note</b> The full stack trace of the root cause is available in the server logs.</p><hr class="line" /><h3>Apache Tomcat/8.5.65</h3></body></html>

malmans2 commented 1 year ago

@sandrocalmanti I think you have a bug in your code. Always try to share a minimal reproducible example; I don't think the relevant part of the code was pasted in the issue. The request is successfully queued if I do this:

import os

from c3s_eqc_automatic_quality_control import download

# Point cdsapi at the personal credentials file
os.environ["CDSAPI_RC"] = os.path.expanduser("~/calmanti_sandro/.cdsapirc")

# Select time period and chunks
start = "1981"
stop = "2020"  # None: present
chunks = {"year": 1, "time": 1}

# Define request
collection_id = "reanalysis-cerra-single-levels"
request = {
    "product_type": "reanalysis",
    "format": "grib",
    "variable": "snow_depth",
    "level_type": "surface_or_atmosphere",
    "time": ["00:00"],
}
requests = download.update_request_date(request, start=start, stop=stop)

ds = download.download_and_transform(collection_id, requests, chunks=chunks)

sandrocalmanti commented 1 year ago

You're right.

I was not passing the collection_id correctly. Thanks for checking. I guess you can close this.
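For anyone landing here with the same 400 error: the URL in the traceback has the request dictionary where the collection ID belongs, and the failing cell calls `download.download_and_transform(*requests, ...)`. One plausible reading is that star-unpacking bound a request dictionary to the first positional parameter, `collection_id`. The sketch below reproduces that binding with a stand-in function. `build_resource_url` and the base URL are hypothetical; only the `"%s/resources/%s"` pattern is taken from the `cdsapi` frame in the traceback.

```python
def build_resource_url(collection_id, requests=None):
    """Stand-in for the real call chain: format the URL as the cdsapi frame does."""
    base_url = "https://example.invalid/api/v2"  # hypothetical base URL
    return "%s/resources/%s" % (base_url, collection_id)


collection_id = "reanalysis-cerra-single-levels"
request = {"variable": "snow_depth", "time": ["00:00"]}
requests = [request]

# Star-unpacking: the first request dict is bound to `collection_id`.
bad_url = build_resource_url(*requests)

# Explicit arguments: the collection ID string ends up in the URL.
good_url = build_resource_url(collection_id, requests)

print(bad_url)
print(good_url)
```

Passing `collection_id` explicitly as the first argument, as in the working example above, keeps the dataset name in the URL path and the request dictionary in the request body.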