bopen / c3s-eqc-toolbox-template

CADS Toolbox template application
Apache License 2.0

Jupyter Notebook for the EOBS dataset - to be reviewed and optimised w/ eqc toolbox #32

Closed anolive closed 1 year ago

anolive commented 1 year ago

Dear Mattia and B-OPEN team,

As agreed previously, please find here attached the First Use Case/User Question for the E-OBS catalogue entry (this one: https://cds.climate.copernicus.eu/cdsapp#!/dataset/insitu-gridded-observations-europe?tab=overview).

As with the LULC, this first version uses the cdsapi and our go-to libraries in handling this type of dataset. In this case, CDO is also used (together with Xarray).

We will meanwhile work on the LULC update (issue #26).

Let us know what you think. Thank you in advance. Ana

C3S2_D520.5.3.2_Quality_Assessment_User_Questions_EOBS_UQ1_v1.zip

malmans2 commented 1 year ago

Hi @anolive,

Here is the template that reproduces the diagnostics in your notebook.

I have a doubt about the seasonal PDF. If I understand correctly, you select the months first and then compute the 15-day rolling average. In DJF, for example, isn't it a problem that February affects December (and December affects February) because of the gap between them?

Instead, I'm computing the rolling average first, then I'm selecting the days corresponding to the season of interest. I believe this explains why my PDFs are slightly different.
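To make the difference between the two orderings concrete, here is a minimal standard-library sketch. It is not the notebook's code (which works on xarray objects); the synthetic series, the trailing window, and all names are assumptions made for illustration.

```python
from datetime import date, timedelta

WINDOW = 15  # days, as in the discussion above


def rolling_mean(values, window):
    """Trailing rolling mean; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def is_djf(day):
    """True for days in December, January, or February."""
    return day.month in (12, 1, 2)


# Synthetic daily series (assumption): the value is simply the day of year.
start = date(2001, 1, 1)
days = [start + timedelta(days=i) for i in range(730)]
values = [float(day.timetuple().tm_yday) for day in days]

# Ordering A: select DJF days first, then roll over the selected samples.
djf_days = [day for day in days if is_djf(day)]
djf_values = [v for day, v in zip(days, values) if is_djf(day)]
select_then_roll = dict(zip(djf_days, rolling_mean(djf_values, WINDOW)))

# Ordering B: roll over the full series first, then select DJF days.
rolled = rolling_mean(values, WINDOW)
roll_then_select = {day: r for day, r in zip(days, rolled) if is_djf(day)}

probe = date(2001, 12, 5)
print("select then roll:", select_then_roll[probe])
print("roll then select:", roll_then_select[probe])
```

For 5 December 2001, ordering A averages 1-5 December together with 19-28 February, because those samples are adjacent once the other months are dropped. Ordering B averages 21 November to 5 December, which is what a 15-day window is meant to cover.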

Anyway, please check carefully that everything is correct and let me know.

Here is the executed notebook: for-ana.zip

ritavcunha commented 1 year ago

Dear Mattia and B-OPEN team,

Thank you for the notebook. The approach you used for the PDFs is great; in fact, the way you proposed is more correct!

However, I'm having a problem with the "Download and Cache data" section. When I run the corresponding cell (using the executed notebook), this error occurs:

"--------------------------------------------------------------------------- OperationalError Traceback (most recent call last) File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1964, in Connection._exec_single_context(self, dialect, context, statement, parameters) 1963 if not evt_handled: -> 1964 self.dialect.do_execute( 1965 cursor, str_statement, effective_parameters, context 1966 ) 1968 if self._has_events or self.engine._has_events:

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/default.py:748, in DefaultDialect.do_execute(self, cursor, statement, parameters, context) 747 def do_execute(self, cursor, statement, parameters, context=None): --> 748 cursor.execute(statement, parameters)

OperationalError: no such column: cache_entries.id

The above exception was the direct cause of the following exception:

OperationalError Traceback (most recent call last) Cell In[16], line 12 1 request = ( 2 "insitu-gridded-observations-europe", 3 { (...) 10 }, 11 ) ---> 12 ds = download.download_and_transform( 13 *request, 14 transform_func=regionalise_and_dayofyear_reindex, 15 transform_func_kwargs={"lon_slice": lon_slice, "lat_slice": lat_slice}, 16 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/c3s_eqc_automatic_quality_control/download.py:454, in download_and_transform(collection_id, requests, chunks, split_all, transform_func, transform_func_kwargs, transform_chunks, logger, open_mfdataset_kwargs) 452 sources = [] 453 for request in tqdm.tqdm(request_list): --> 454 ds = download_and_transform_requests( 455 collection_id, 456 [request], 457 transform_func, 458 transform_func_kwargs, 459 open_mfdataset_kwargs, 460 ) 461 sources.append(ds.encoding["source"]) 462 open_mfdataset_kwargs.pop("preprocess", None) # Already preprocessed and cached

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/cacholote/cache.py:93, in cacheable.<locals>.wrapper(*args, **kwargs) 90 filters.append(database.CacheEntry.expiration == settings.expiration) 92 with settings.sessionmaker() as session: ---> 93 for cache_entry in ( 94 session.query(database.CacheEntry) 95 .filter(*filters) 96 .order_by(database.CacheEntry.timestamp.desc()) 97 ): 98 try: 99 return _decode_and_update(session, cache_entry, settings)

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/orm/query.py:2841, in Query.__iter__(self) 2840 def __iter__(self) -> Iterator[_T]: -> 2841 result = self._iter() 2842 try: 2843 yield from result # type: ignore

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/orm/query.py:2855, in Query._iter(self) 2852 params = self._params 2854 statement = self._statement_20() -> 2855 result: Union[ScalarResult[_T], Result[_T]] = self.session.execute( 2856 statement, 2857 params, 2858 execution_options={"_sa_orm_load_options": self.load_options}, 2859 ) 2861 # legacy: automatically set scalars, unique 2862 if result._attributes.get("is_single_entity", False):

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/orm/session.py:2229, in Session.execute(self, statement, params, execution_options, bind_arguments, _parent_execute_state, _add_event) 2168 def execute( 2169 self, 2170 statement: Executable, (...) 2176 _add_event: Optional[Any] = None, 2177 ) -> Result[Any]: 2178 r"""Execute a SQL expression construct. 2179 2180 Returns a :class:`_engine.Result` object representing (...) 2227 2228 """ -> 2229 return self._execute_internal( 2230 statement, 2231 params, 2232 execution_options=execution_options, 2233 bind_arguments=bind_arguments, 2234 _parent_execute_state=_parent_execute_state, 2235 _add_event=_add_event, 2236 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/orm/session.py:2124, in Session._execute_internal(self, statement, params, execution_options, bind_arguments, _parent_execute_state, _add_event, _scalar_result) 2119 return conn.scalar( 2120 statement, params or {}, execution_options=execution_options 2121 ) 2123 if compile_state_cls: -> 2124 result: Result[Any] = compile_state_cls.orm_execute_statement( 2125 self, 2126 statement, 2127 params or {}, 2128 execution_options, 2129 bind_arguments, 2130 conn, 2131 ) 2132 else: 2133 result = conn.execute( 2134 statement, params or {}, execution_options=execution_options 2135 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/orm/context.py:253, in AbstractORMCompileState.orm_execute_statement(cls, session, statement, params, execution_options, bind_arguments, conn) 243 @classmethod 244 def orm_execute_statement( 245 cls, (...) 251 conn, 252 ) -> Result: --> 253 result = conn.execute( 254 statement, params or {}, execution_options=execution_options 255 ) 256 return cls.orm_setup_cursor_result( 257 session, 258 statement, (...) 262 result, 263 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1414, in Connection.execute(self, statement, parameters, execution_options) 1412 raise exc.ObjectNotExecutableError(statement) from err 1413 else: -> 1414 return meth( 1415 self, 1416 distilled_parameters, 1417 execution_options or NO_OPTIONS, 1418 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/sql/elements.py:486, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options) 484 if TYPE_CHECKING: 485 assert isinstance(self, Executable) --> 486 return connection._execute_clauseelement( 487 self, distilled_params, execution_options 488 ) 489 else: 490 raise exc.ObjectNotExecutableError(self)

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1638, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options) 1626 compiled_cache: Optional[CompiledCacheType] = execution_options.get( 1627 "compiled_cache", self.engine._compiled_cache 1628 ) 1630 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache( 1631 dialect=dialect, 1632 compiled_cache=compiled_cache, (...) 1636 linting=self.dialect.compiler_linting | compiler.WARN_LINTING, 1637 ) -> 1638 ret = self._execute_context( 1639 dialect, 1640 dialect.execution_ctx_cls._init_compiled, 1641 compiled_sql, 1642 distilled_parameters, 1643 execution_options, 1644 compiled_sql, 1645 distilled_parameters, 1646 elem, 1647 extracted_params, 1648 cache_hit=cache_hit, 1649 ) 1650 if has_events: 1651 self.dispatch.after_execute( 1652 self, 1653 elem, (...) 1657 ret, 1658 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1842, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw) 1837 return self._exec_insertmany_context( 1838 dialect, 1839 context, 1840 ) 1841 else: -> 1842 return self._exec_single_context( 1843 dialect, context, statement, parameters 1844 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1983, in Connection._exec_single_context(self, dialect, context, statement, parameters) 1980 result = context._setup_result_proxy() 1982 except BaseException as e: -> 1983 self._handle_dbapi_exception( 1984 e, str_statement, effective_parameters, cursor, context 1985 ) 1987 return result

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:2326, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec) 2324 elif should_wrap: 2325 assert sqlalchemy_exception is not None -> 2326 raise sqlalchemy_exception.with_traceback(exc_info[2]) from e 2327 else: 2328 assert exc_info[1] is not None

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/base.py:1964, in Connection._exec_single_context(self, dialect, context, statement, parameters) 1962 break 1963 if not evt_handled: -> 1964 self.dialect.do_execute( 1965 cursor, str_statement, effective_parameters, context 1966 ) 1968 if self._has_events or self.engine._has_events: 1969 self.dispatch.after_cursor_execute( 1970 self, 1971 cursor, (...) 1975 context.executemany, 1976 )

File ~/miniconda3/envs/c3s_bopen/lib/python3.10/site-packages/sqlalchemy/engine/default.py:748, in DefaultDialect.do_execute(self, cursor, statement, parameters, context) 747 def do_execute(self, cursor, statement, parameters, context=None): --> 748 cursor.execute(statement, parameters)

OperationalError: (sqlite3.OperationalError) no such column: cache_entries.id [SQL: SELECT cache_entries.id AS cache_entries_id, cache_entries."key" AS cache_entries_key, cache_entries.expiration AS cache_entries_expiration, cache_entries.result AS cache_entries_result, cache_entries.timestamp AS cache_entries_timestamp, cache_entries.counter AS cache_entries_counter, cache_entries.tag AS cache_entries_tag FROM cache_entries WHERE cache_entries."key" = ? AND cache_entries.expiration > ? ORDER BY cache_entries.timestamp DESC] [parameters: ('918bb53338eddcb845a15ed51121c1ae', '2023-03-30 06:06:09.470175')] (Background on this error at: https://sqlalche.me/e/20/e3q8)"

It seems the error is related to the requirements, but I checked and I am logged in to the Copernicus Data Store, so I don't know what the problem could be.

Could you please help me resolve this problem?

I really appreciate any help you can provide. Thank you in advance. Rita

malmans2 commented 1 year ago

Did you update the environment recently? Unfortunately, I think you need to clear the cache. We had to change a couple of things in the cache manager to optimise it, so there are now different entries in the cache database that are not backward compatible (the environment on the VM should be already in good shape).
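As an illustration of this kind of incompatibility, the following standard-library sketch reproduces the same SQLite message with an in-memory database. The table layout is a hypothetical older schema invented for the example (only the column names visible in the failing query are used); it is not taken from cacholote itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical "old" schema: same table name, but no "id" column.
conn.execute(
    'CREATE TABLE cache_entries ('
    '"key" TEXT, expiration TEXT, result TEXT, timestamp TEXT, counter INTEGER)'
)

# List the columns that actually exist in the table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(cache_entries)")]
print("columns:", columns)

# Newer code that expects an "id" column fails as soon as it queries it.
try:
    conn.execute("SELECT cache_entries.id FROM cache_entries")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

print("error:", error_message)
conn.close()
```

Running `PRAGMA table_info` against the real cache database is one way to check whether its columns match what the installed version queries.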

On your machine, I would do this.

  1. Follow the instructions here to update the environment. (Make sure c3s-eqc-toolbox-template is up to date). You need the make conda-env-update command.
  2. Then, from a terminal, run: python -c "import os; import cacholote; print(os.path.dirname(cacholote.config.get().cache_files_urlpath))"
  3. Finally, delete the directory printed by the command above.
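Steps 2 and 3 could also be combined in a small script, sketched below. It assumes the cache lives on the local filesystem; the helper names and the `dry_run` flag are made up for this example, and only `cacholote.config.get().cache_files_urlpath` comes from the command above.

```python
import os
import shutil


def clear_cache_dir(cache_files_urlpath, dry_run=True):
    """Delete the parent directory of a local cache_files_urlpath.

    Returns the directory path. With dry_run=True nothing is deleted,
    so the path can be inspected first.
    """
    cache_dir = os.path.dirname(cache_files_urlpath)
    if not dry_run and os.path.isdir(cache_dir):
        shutil.rmtree(cache_dir)
    return cache_dir


def clear_cacholote_cache(dry_run=True):
    """Apply clear_cache_dir to the path reported by cacholote."""
    import cacholote  # imported lazily: only needed for this helper

    return clear_cache_dir(cacholote.config.get().cache_files_urlpath, dry_run)
```

Calling `clear_cacholote_cache()` first prints nothing and deletes nothing; check the returned path, then call `clear_cacholote_cache(dry_run=False)` to remove it.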

ritavcunha commented 1 year ago

@malmans2 the notebook is now working! Thank you very much, your suggestion helped a lot!

malmans2 commented 1 year ago

Hi there, I'm closing this as it looks like you are able to run the template. Feel free to re-open or open more issues though!