VForWaTer / metacatalog

Modular metadata management platform for environmental data.
https://vforwater.github.io/metacatalog
GNU General Public License v3.0
4 stars 1 forks source link

Refactored Config management #305

Closed mmaelicke closed 10 months ago

mmaelicke commented 10 months ago

This PR is the starting point for moving metacatalog to pydantic. As a first step, it gets rid of the config file and replaces it with a pedantic settings object that can be configured from a .env file or environment variables.

There will be some adaptions, that I might have missed.

@AlexDo1 the export and standards extension still need to be configured as a default extension.

AlexDo1 commented 10 months ago

I just wanted to try out metacatalog with your changes. When I try to run metacatalog connection --save ... I get the a PydanticSchemaGenerationError:

Traceback (most recent call last):
  File "/home/alexd/miniconda3/bin/metacatalog", line 33, in <module>
    sys.exit(load_entry_point('metacatalog', 'console_scripts', 'metacatalog')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/bin/metacatalog", line 25, in importlib_load_entry_point
    return next(matches).load()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/alexd/Projekte/VFORWaTeR/Github/metacatalog/metacatalog/__init__.py", line 10, in <module>
    from metacatalog.config import config
  File "/home/alexd/Projekte/VFORWaTeR/Github/metacatalog/metacatalog/config.py", line 19, in <module>
    class Extension(BaseModel):
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 182, in __new__
    complete_model_class(
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 491, in complete_model_class
    schema = cls.__get_pydantic_core_schema__(cls, handler)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/main.py", line 578, in __get_pydantic_core_schema__
    return __handler(__source)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_schema_generation_shared.py", line 82, in __call__
    schema = self._handler(__source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 468, in generate_schema
    schema = self._generate_schema(obj)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 700, in _generate_schema
    schema = self._post_process_generated_schema(self._generate_schema_inner(obj))
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 722, in _generate_schema_inner
    return self._model_schema(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 541, in _model_schema
    {k: self._generate_md_field_schema(k, v, decorators) for k, v in fields.items()},
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 541, in <dictcomp>
    {k: self._generate_md_field_schema(k, v, decorators) for k, v in fields.items()},
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 886, in _generate_md_field_schema
    common_field = self._common_field_schema(name, field_info, decorators)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 951, in _common_field_schema
    schema = self._apply_annotations(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 1654, in _apply_annotations
    schema = get_inner_schema(source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_schema_generation_shared.py", line 82, in __call__
    schema = self._handler(__source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 1635, in inner_handler
    schema = self._generate_schema(obj)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 700, in _generate_schema
    schema = self._post_process_generated_schema(self._generate_schema_inner(obj))
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 727, in _generate_schema_inner
    return self.match_type(obj)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 814, in match_type
    return self._unknown_type_schema(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 366, in _unknown_type_schema
    raise PydanticSchemaGenerationError(
pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for FieldInfo(annotation=NoneType, required=False, default=<class 'metacatalog.ext.base.MetacatalogExtensionInterface'>, repr=False). Set `arbitrary_types_allowed=True` in the model_config to ignore this error or implement `__get_pydantic_core_schema__` on your type to fully support it.

If you got this error by calling handler(<some type>) within `__get_pydantic_core_schema__` then you likely need to call `handler.generate_schema(<some type>)` since we do not call `__get_pydantic_core_schema__` on `<some type>` otherwise to avoid infinite recursion.

For further information visit https://errors.pydantic.dev/2.5/u/schema-for-unknown-type
(base) alexd@alexd:~/Projekte/VFORWaTeR/Github/metacatalog$ metacatalog connection --save postgresql://postgres:postgres@localhost:5432/test_pydantic --name test_pydantic
Traceback (most recent call last):
  File "/home/alexd/miniconda3/bin/metacatalog", line 33, in <module>
    sys.exit(load_entry_point('metacatalog', 'console_scripts', 'metacatalog')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/bin/metacatalog", line 25, in importlib_load_entry_point
    return next(matches).load()
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1126, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/alexd/Projekte/VFORWaTeR/Github/metacatalog/metacatalog/__init__.py", line 10, in <module>
    from metacatalog.config import config
  File "/home/alexd/Projekte/VFORWaTeR/Github/metacatalog/metacatalog/config.py", line 19, in <module>
    class Extension(BaseModel):
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 182, in __new__
    complete_model_class(
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_model_construction.py", line 491, in complete_model_class
    schema = cls.__get_pydantic_core_schema__(cls, handler)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/main.py", line 578, in __get_pydantic_core_schema__
    return __handler(__source)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_schema_generation_shared.py", line 82, in __call__
    schema = self._handler(__source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 468, in generate_schema
    schema = self._generate_schema(obj)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 700, in _generate_schema
    schema = self._post_process_generated_schema(self._generate_schema_inner(obj))
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 722, in _generate_schema_inner
    return self._model_schema(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 541, in _model_schema
    {k: self._generate_md_field_schema(k, v, decorators) for k, v in fields.items()},
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 541, in <dictcomp>
    {k: self._generate_md_field_schema(k, v, decorators) for k, v in fields.items()},
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 886, in _generate_md_field_schema
    common_field = self._common_field_schema(name, field_info, decorators)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 951, in _common_field_schema
    schema = self._apply_annotations(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 1654, in _apply_annotations
    schema = get_inner_schema(source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_schema_generation_shared.py", line 82, in __call__
    schema = self._handler(__source_type)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 1635, in inner_handler
    schema = self._generate_schema(obj)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 700, in _generate_schema
    schema = self._post_process_generated_schema(self._generate_schema_inner(obj))
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 727, in _generate_schema_inner
    return self.match_type(obj)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 814, in match_type
    return self._unknown_type_schema(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alexd/miniconda3/lib/python3.11/site-packages/pydantic/_internal/_generate_schema.py", line 366, in _unknown_type_schema
    raise PydanticSchemaGenerationError(
pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for FieldInfo(annotation=NoneType, required=False, default=<class 'metacatalog.ext.base.MetacatalogExtensionInterface'>, repr=False). Set `arbitrary_types_allowed=True` in the model_config to ignore this error or implement `__get_pydantic_core_schema__` on your type to fully support it.

If you got this error by calling handler(<some type>) within `__get_pydantic_core_schema__` then you likely need to call `handler.generate_schema(<some type>)` since we do not call `__get_pydantic_core_schema__` on `<some type>` otherwise to avoid infinite recursion.

For further information visit https://errors.pydantic.dev/2.5/u/schema-for-unknown-type

I also get a PydanticSchemaGenerationError when I try to import the metacatalog api in Python:

PydanticSchemaGenerationError: Unable to generate pydantic-core schema for FieldInfo(annotation=NoneType, required=False, default=<class 'metacatalog.ext.base.MetacatalogExtensionInterface'>, repr=False). Set `arbitrary_types_allowed=True` in the model_config to ignore this error or implement `__get_pydantic_core_schema__` on your type to fully support it.

If you got this error by calling handler(<some type>) within `__get_pydantic_core_schema__` then you likely need to call `handler.generate_schema(<some type>)` since we do not call `__get_pydantic_core_schema__` on `<some type>` otherwise to avoid infinite recursion.

For further information visit https://errors.pydantic.dev/2.5/u/schema-for-unknown-type
mmaelicke commented 10 months ago

Yeah, sorry I am not yet finished. I suspect there will be way more errors... You were not supposed to review as long as the CI still fails.

I'll mention you here again as soon as the CI is working.

codecov[bot] commented 10 months ago

Codecov Report

Attention: 11 lines in your changes are missing coverage. Please review.

Comparison is base (d993a47) 64.38% compared to head (881f6a0) 65.02%. Report is 5 commits behind head on main.

Files Patch % Lines
metacatalog/config.py 83.92% 9 Missing :warning:
metacatalog/ext/standards_export/extension.py 66.66% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #305 +/- ## ========================================== + Coverage 64.38% 65.02% +0.64% ========================================== Files 75 75 Lines 3931 3880 -51 ========================================== - Hits 2531 2523 -8 + Misses 1400 1357 -43 ``` | [Flag](https://app.codecov.io/gh/VForWaTer/metacatalog/pull/305/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=VForWaTer) | Coverage Δ | | |---|---|---| | [e2e](https://app.codecov.io/gh/VForWaTer/metacatalog/pull/305/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=VForWaTer) | `65.02% <88.17%> (+0.64%)` | :arrow_up: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=VForWaTer#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

mmaelicke commented 10 months ago

Ok @AlexDo1 this works now.

I had to disable two of the standards_export CLI test, which doesn't matter as we will rebuild the CLI anyway. I also needed to comment out the logic to get 'extra' from the config file for XML export. Here, as the parameter just pointed to a file, I think simply adding a config value to the global config object is more straightforward anyway. The user can then directly change that or use an environment variable.

I will refactor some of this stuff after January, to not create a global config

config = Config()

in metacatalog.config to a function that will instantiate a new instance, to overwrite configs programmatically.

AlexDo1 commented 10 months ago

As you removed the connection command from the CLI, it is not possible to save connections with a name, right? (I liked that feature)
So for example in upload scripts or in other CLI commands, I would always have to use the full connection string, correct?

I am still a little bit confused whether it is possible to have multiple connections with different configurations or if there is just one .env file for one metacatalog installation and all connections.

We would also need to rework the docs, as most of the installation stuff is outdated with these changes.

mmaelicke commented 10 months ago

As you removed the connection command from the CLI, it is not possible to save connections with a name, right? (I liked that feature) So for example in upload scripts or in other CLI commands, I would always have to use the full connection string, correct?

yes, or an env file or set an environment variable. That makes managing connections in container way easier

I am still a little bit confused whether it is possible to have multiple connections with different configurations or if there is just one .env file for one metacatalog installation and all connections.

Right now, there is only one global config object to make things work in the first place. From here, I like to refactor the api to create their own configs, just falling back to the global if nothing is given. That way, there would be no need to pass a session around, while you could switch configs on a call to call basis... This is just not yet implemented

We would also need to rework the docs, as most of the installation stuff is outdated with these changes.

Yeah, but I would at first remove the CLI, add FastAPI, rework the Python API, fully integrate Pydantic, possibly move away from SQLAlchemy or update to 2.0 and then update the docs. otherwise we are only re-writing the docs

AlexDo1 commented 10 months ago

Okay makes sense and also a lot of (very good) changes.

I managed to initiate a new metacatalog instance with the CLI, everything worked when giving the full connection string.
So I guess we can merge.