Closed iago-pssjd closed 1 year ago
I realize now that the output messages depend on previously run code (as I have tried several times to run these instructions). I cannot run it on my current computer, but I will try to produce cleaner outputs this afternoon.
cc @floriankrb (I see you are also involved in https://github.com/ecmwf-projects/mooc-machine-learning-weather-climate, where I come from)
I tried to trace cml.load_source('url', 'https://github.com/ecmwf/climetlab/raw/main/docs/examples/test.grib') both on my computer, where it does not work, and on Deepnote, where it works. I copy the first lines of each profile, where the behaviour already diverges, and separate the diverging blocks by two empty lines:
Input
import climetlab as cml
import cProfile
cProfile.run("cml.load_source('url', 'https://github.com/ecmwf/climetlab/raw/main/docs/examples/test.grib')", sort = 'cumulative')
Output
Deepnote (well)
15827 function calls (15819 primitive calls) in 0.271 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.271 0.271 {built-in method builtins.exec}
1 0.000 0.000 0.271 0.271 <string>:1(<module>)
1 0.000 0.000 0.271 0.271 __init__.py:155(load_source)
2 0.000 0.000 0.270 0.135 __init__.py:18(__call__)
2 0.000 0.000 0.269 0.135 caching.py:620(cache_file)
1 0.000 0.000 0.266 0.266 __init__.py:131(__call__)
1 0.000 0.000 0.266 0.266 url.py:107(__init__)
1 0.000 0.000 0.265 0.265 __init__.py:51(cache_file)
1 0.000 0.000 0.262 0.262 url.py:175(out_of_date)
1 0.000 0.000 0.262 0.262 http.py:141(out_of_date)
1 0.000 0.000 0.261 0.261 http.py:62(headers)
1 0.000 0.000 0.261 0.261 http.py:462(wrapped)
1 0.000 0.000 0.261 0.261 api.py:88(head)
1 0.000 0.000 0.261 0.261 api.py:14(request)
1 0.000 0.000 0.259 0.259 sessions.py:500(request)
2/1 0.000 0.000 0.258 0.258 sessions.py:671(send)
2 0.000 0.000 0.256 0.128 adapters.py:436(send)
2 0.000 0.000 0.255 0.127 connectionpool.py:522(urlopen)
2 0.000 0.000 0.254 0.127 connectionpool.py:361(_make_request)
2 0.000 0.000 0.194 0.097 client.py:1333(getresponse)
2 0.000 0.000 0.194 0.097 client.py:313(begin)
42 0.000 0.000 0.193 0.005 {method 'readline' of '_io.BufferedReader' objects}
2 0.000 0.000 0.193 0.097 client.py:280(_read_status)
4 0.000 0.000 0.193 0.048 socket.py:690(readinto)
4 0.000 0.000 0.193 0.048 ssl.py:1231(recv_into)
4 0.000 0.000 0.193 0.048 ssl.py:1091(read)
4 0.193 0.048 0.193 0.048 {method 'read' of '_ssl._SSLSocket' objects}
1 0.000 0.000 0.169 0.169 sessions.py:723(<listcomp>)
3/2 0.000 0.000 0.169 0.085 sessions.py:159(resolve_redirects)
2 0.000 0.000 0.059 0.030 connectionpool.py:1034(_validate_conn)
2 0.000 0.000 0.059 0.030 connection.py:356(connect)
2 0.000 0.000 0.033 0.017 connection.py:161(_new_conn)
2 0.000 0.000 0.033 0.017 connection.py:37(create_connection)
My computer (Debian 11) (bad)
CliMetLab cache: trying to free 14.9 GiB
CliMetLab cache: could not free 14.9 GiB
262886 function calls (257495 primitive calls) in 5.904 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
89/1 0.000 0.000 5.904 5.904 {built-in method builtins.exec}
1 0.000 0.000 5.904 5.904 <string>:1(<module>)
1 0.000 0.000 5.904 5.904 __init__.py:155(load_source)
2 0.000 0.000 5.682 2.841 __init__.py:18(__call__)
2 0.000 0.000 5.678 2.839 caching.py:620(cache_file)
4 0.000 0.000 5.100 1.275 caching.py:101(wrapped)
5 0.000 0.000 5.099 1.020 threading.py:280(wait)
4 0.000 0.000 5.099 1.275 caching.py:139(result)
107 5.099 0.048 5.099 0.048 {method 'acquire' of '_thread.lock' objects}
CliMetLab cache: trying to free 14.9 GiB
1 0.000 0.000 5.040 5.040 __init__.py:131(__call__)
1 0.000 0.000 4.986 4.986 url.py:107(__init__)
1 0.000 0.000 4.985 4.985 __init__.py:51(cache_file)
1 0.000 0.000 0.863 0.863 file.py:40(mutate)
1 0.000 0.000 0.863 0.863 file.py:70(_reader)
1 0.000 0.000 0.863 0.863 __init__.py:115(reader)
1 0.000 0.000 0.844 0.844 __init__.py:14(reader)
1 0.000 0.000 0.696 0.696 reader.py:21(__init__)
1 0.000 0.000 0.696 0.696 index.py:356(__init__)
1 0.000 0.000 0.695 0.695 caching.py:702(auxiliary_cache_file)
1 0.000 0.000 0.576 0.576 url.py:161(download)
1 0.000 0.000 0.576 0.576 base.py:109(download)
2 0.000 0.000 0.566 0.283 http.py:462(wrapped)
2 0.000 0.000 0.566 0.283 api.py:16(request)
Deleting entry {
"path": "/tmp/climetlab-iago/url-15280dbd4547333ede9ffec63d6959450329b9c003a148969685679b82657cba.grib",
"owner": "url",
"args": {
"url": "https://github.com/ecmwf/climetlab/raw/main/docs/examples/test.grib",
"parts": null
},
"creation_date": "2023-03-02 09:46:00.308167",
"flags": 0,
"owner_data": {
"connection": "keep-alive",
"content-length": "1052",
"cache-control": "max-age=300",
"content-security-policy": "default-src 'none'; style-src 'unsafe-inline'; sandbox",
"content-type": "application/octet-stream",
"etag": "W/\"2bd5b56b1c0727c2971a7d94f9c3f22c13a72f1d78388827fc1261b2a9530e42\"",
"strict-transport-security": "max-age=31536000",
"x-content-type-options": "nosniff",
"x-frame-options": "deny",
"x-xss-protection": "1; mode=block",
"x-github-request-id": "17FE:1218:1E5A1E4:2082EA5:640045C8",
"accept-ranges": "bytes",
"date": "Thu, 02 Mar 2023 08:46:01 GMT",
"via": "1.1 varnish",
"x-served-by": "cache-mad22078-MAD",
"x-cache": "HIT",
"x-cache-hits": "1",
"x-timer": "S1677746762.543576,VS0,VE1",
"vary": "Authorization,Accept-Encoding,Origin",
"access-control-allow-origin": "*",
"x-fastly-request-id": "47f21b8841a9a58e4a862c424aecee6504733313",
"expires": "Thu, 02 Mar 2023 08:51:01 GMT",
"source-age": "68"
},
"last_access": "2023-03-02 09:46:00.308167",
"type": "file",
"parent": null,
"replaced": null,
"extra": null,
"expires": null,
"accesses": 1,
"size": 1052
}
2 0.000 0.000 0.561 0.280 sessions.py:470(request)
4/2 0.000 0.000 0.556 0.278 sessions.py:626(send)
4 0.000 0.000 0.544 0.136 adapters.py:394(send)
4 0.000 0.000 0.538 0.134 connectionpool.py:522(urlopen)
4 0.000 0.000 0.534 0.133 connectionpool.py:361(_make_request)
4 0.000 0.000 0.389 0.097 connectionpool.py:1034(_validate_conn)
4 0.001 0.000 0.389 0.097 connection.py:356(connect)
1 0.000 0.000 0.294 0.294 http.py:249(estimate_size)
3 0.000 0.000 0.294 0.098 http.py:62(headers)
CliMetLab cache: deleting /tmp/climetlab-iago/url-15280dbd4547333ede9ffec63d6959450329b9c003a148969685679b82657cba.grib (1 KiB)
1 0.000 0.000 0.293 0.293 api.py:92(head)
CliMetLab cache: url {"url": "https://github.com/ecmwf/climetlab/raw/main/docs/examples/test.grib", "parts": null}
1 0.000 0.000 0.279 0.279 http.py:119(transfer)
1 0.000 0.000 0.273 0.273 http.py:286(make_stream)
1 0.000 0.000 0.273 0.273 http.py:212(issue_request)
1 0.000 0.000 0.273 0.273 api.py:64(get)
2 0.000 0.000 0.271 0.135 __init__.py:13(<module>)
4 0.000 0.000 0.217 0.054 ssl_.py:355(ssl_wrap_socket)
90/21 0.001 0.000 0.172 0.008 <frozen importlib._bootstrap>:1002(_find_and_load)
90/21 0.000 0.000 0.171 0.008 <frozen importlib._bootstrap>:967(_find_and_load_unlocked)
89/21 0.001 0.000 0.167 0.008 <frozen importlib._bootstrap>:659(_load_unlocked)
86/21 0.000 0.000 0.165 0.008 <frozen importlib._bootstrap_external>:784(exec_module)
99/21 0.000 0.000 0.159 0.008 <frozen importlib._bootstrap>:220(_call_with_frames_removed)
4 0.000 0.000 0.158 0.039 connection.py:161(_new_conn)
4 0.000 0.000 0.158 0.039 connection.py:37(create_connection)
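In the Debian output above, the CliMetLab log lines are interleaved with the profile table, which makes the two harder to compare. One possible way to keep them apart is to collect the profile into a string with the standard-library cProfile and pstats modules and print it after the call has finished. The following is only a sketch: profile_to_text and busy are hypothetical names used for illustration, and busy stands in for the cml.load_source call.

```python
import cProfile
import io
import pstats


def profile_to_text(func, *args, top=15, **kwargs):
    """Run func under cProfile and return (result, report_text).

    The report is written to an in-memory buffer, so anything the
    profiled code logs while running does not get mixed into the table.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Same ordering as sort='cumulative' in cProfile.run
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def busy(n):
    """Hypothetical stand-in for the call being traced."""
    return sum(i * i for i in range(n))


value, report = profile_to_text(busy, 1000)
print(report)
```

With CliMetLab installed, the same helper could presumably be called as profile_to_text(cml.load_source, 'url', 'https://github.com/ecmwf/climetlab/raw/main/docs/examples/test.grib').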
Update:
I solved the issue by increasing maximum-cache-disk-usage. But then, given its description:
# Disk usage threshold after which CliMetLab expires older cached entries (% of the full disk capacity). When CliMetLab cache disk usage goes above this limit, CliMetLab triggers its cache cleaning mechanism before downloading additional data.
is the issue that CliMetLab is not able to expire older cache entries (CliMetLab cache: could not free 14.9 GiB)?
Yes, it looks like there is some issue cleaning the cache. Perhaps you updated to a more recent version of climetlab, or updated some of the packages it depends on?
To solve this you can use $ climetlab cache and try finding and deleting the 14.9 GiB entry.
If nothing works, climetlab decache --all will clean the cache completely.
If even this fails, you could delete the cache folder directly: $ climetlab settings cache-directory will give you the cache directory (it seems to be /tmp/climetlab-iago in your case). Then manually delete the folder (with rm).
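Before deleting the folder by hand, it may help to see which files in it actually take up space. The sketch below uses only the Python standard library and does not touch CliMetLab's own cache database; largest_files is a hypothetical helper name, and /tmp/climetlab-iago is the cache directory mentioned in this thread.

```python
import os


def largest_files(directory, n=10):
    """Return up to n (size_in_bytes, path) pairs, largest first."""
    entries = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                entries.append((os.path.getsize(path), path))
            except OSError:
                # The file disappeared or is unreadable; skip it.
                continue
    entries.sort(reverse=True)
    return entries[:n]


# Cache directory taken from the thread; adjust to your own setting.
for size, path in largest_files("/tmp/climetlab-iago"):
    print(f"{size / 2**20:10.1f} MiB  {path}")
```

If the directory does not exist, the loop simply prints nothing.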
@floriankrb
Thanks for your answer. I did indeed try what you suggest, removing the cache completely before executing cml.load_source('url', 'https://github.com/ecmwf/climetlab/raw/main/docs/examples/test.grib'), and the output was the one I show above in my first comment (when my computer's disk usage was over the default maximum-cache-disk-usage = 90%).
To get it working I had to replace maximum-cache-disk-usage with a percentage higher than my current disk usage.
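To make the reasoning explicit: if the threshold is a percentage of the whole disk, then a disk that is already fuller than that percentage cannot be brought under it by emptying a small cache, which would explain why the message persists even after the cache is removed. The sketch below checks this with the standard library only. It assumes that the amount to free is the disk usage in excess of the threshold; that is a guess at the behaviour, not CliMetLab's actual code, and bytes_to_free and disk_report are hypothetical names.

```python
import shutil


def bytes_to_free(total, used, max_usage_percent):
    """Bytes above the allowed share of the disk (0 if under the limit).

    Assumption: the limit is max_usage_percent of the total disk size.
    """
    allowed = total * max_usage_percent / 100
    return max(0, used - allowed)


def disk_report(path, max_usage_percent=90):
    """Return (current usage in percent, bytes above the threshold)."""
    usage = shutil.disk_usage(path)
    percent = 100 * usage.used / usage.total
    excess = bytes_to_free(usage.total, usage.used, max_usage_percent)
    return percent, excess


# 90 is the default maximum-cache-disk-usage quoted in this thread.
percent, excess = disk_report("/", max_usage_percent=90)
print(f"disk usage: {percent:.1f}%, above threshold: {excess / 2**30:.1f} GiB")
```

If the second number is larger than the size of the cache itself, no amount of cache cleaning can bring the disk under the threshold, and the only options are to free space elsewhere or raise the setting.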
On the other hand, this issue arose while I was working through https://github.com/ecmwf-projects/mooc-machine-learning-weather-climate/blob/main/tier_2/data_handling/01-accessing-data.ipynb. Thankfully, notebooks 2 and 3 of the same series gave me a better understanding of these issues and helped me arrive at the solution.
Updated
When I try to execute the next instruction
on my computer (OS: Debian 11) I get the following output/error message:
And if I try
then I get
Further, using cml.load_source always produces the message CliMetLab cache: could not free 14.9 GiB
What might the issue be?
Thank you!