harmonydata / harmony

The Harmony Python library: a research tool for psychologists to harmonise data and questionnaire items. Open source.
https://harmonydata.ac.uk
MIT License
7 stars 12 forks source link

Upgrade to Pydantic 2.8.2 #51

Closed olp-cs closed 2 weeks ago

olp-cs commented 1 month ago

Description

Fixes #50 — Upgrade Harmony to latest Pydantic 2.8.2 or later

Version upgrade: Pydantic 1.10.7 → 2.8.2

I updated the code according to the Pydantic V2 Migration Guide:

1. Changes to pydantic.BaseModel

2. Changes to config

3. Changes to Handling of Standard Types → Required, optional, and nullable fields

Most of the changes are in src/harmony/schemas/requests/text.py.

Type of change

Testing

My changes generate no new warnings.

Before the changes:

According to NumPy 1.26.4 release notes, "The Python versions supported by this release are 3.9-3.12" (should I fill this as a separate issue?).

After the changes:

Test Configuration

Checklist

woodthom2 commented 1 month ago

Thanks so much for doing this PR! I'm on holiday this week so didn't go through thoroughly yet but it looks great and looks like you considered everything. I'll run it locally next week and then merge the PR. Can you run the API on your local machine after your changes? Thomas

On sam. 20 juil. 2024 à 19:36 olp-cs @.***> wrote:

Description Fixes #50 https://github.com/harmonydata/harmony/issues/50 — Upgrade Harmony to latest Pydantic 2.8.2 or later

Version upgrade: Pydantic 1.10.7 → 2.8.2

I updated the code according to the Pydantic V2 Migration Guide https://docs.pydantic.dev/latest/migration/:

1. Changes to pydantic.BaseModel

  • The parse_obj method is deprecated; renamed to model_validate.
  • The dict method is deprecated; renamed to model_dump.

2. Changes to config

  • Support for class-based config is deprecated; using ConfigDict instead.
  • schema_extra renamed to json_schema_extra.

3. Changes to Handling of Standard Types → Required, optional, and nullable fields

  • Fields with a default value of None are now declared as Optional (to fix the error "Input should be a valid string").

Most of the changes are in src/harmony/schemas/requests/text.py. Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Requires a documentation revision(?)

Testing

My changes generate no new warnings.

Before the changes:

  • tox -e py310: 27 passed, 38 warnings
  • tox -e py39: 27 passed, 38 warnings
  • tox -e py38: ERROR: No matching distribution found for numpy==1.26.4
  • tox -e py37: ERROR: No matching distribution found for numpy==1.26.4

According to NumPy 1.26.4 release notes https://numpy.org/devdocs/release/1.26.4-notes.html, "The Python versions supported by this release are 3.9-3.12" (should I fill this as a separate issue?).

After the changes:

  • tox -e py310: 27 passed, 38 warnings
  • tox -e py39: 27 passed, 38 warnings

Test Configuration

  • Library version: tox==4.16.0
  • OS: macOS 11.7.10
  • Toolchain: PyCharm 2024.1.4

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

You can view, comment on, or merge this pull request online at:

https://github.com/harmonydata/harmony/pull/51 Commit Summary

File Changes

(12 files https://github.com/harmonydata/harmony/pull/51/files)

Patch Links:

— Reply to this email directly, view it on GitHub https://github.com/harmonydata/harmony/pull/51, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADUBTVOZJCZAZCW4ULALZKDZNKU3RAVCNFSM6AAAAABLGEQKOKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGQZDCMBRGA2DEMY . You are receiving this because you are subscribed to this thread.Message ID: @.***>

olp-cs commented 1 month ago

Thank you! I still have to set up the API repo. I'll try to run it locally this week.

woodthom2 commented 1 month ago

Thanks @olp-cs . I have tested your fixes in the Harmony library, and it works fine. What do we need to do in the API repo? Did you get a chance to look at that? https://github.com/harmonydata/harmonyapi

woodthom2 commented 1 month ago

I just tried running the API and I got this error:

Traceback (most recent call last):
  File "/tmp/harmonyolpapi/main.py", line 38, in <module>
    from harmony_api.core.settings import settings
  File "/tmp/harmonyolpapi/harmony_api/core/settings.py", line 31, in <module>
    from pydantic import BaseSettings
  File "/tmp/harmonyolp/.venv/lib/python3.11/site-packages/pydantic/__init__.py", line 395, in __getattr__
    return _getattr_migration(attr_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/harmonyolp/.venv/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
    raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.8/migration/#basesettings-has-moved-to-pydantic-settings for more details.

For further information visit https://errors.pydantic.dev/2.8/u/import-error

so we definitely need some upgrades within the API repo

olp-cs commented 1 month ago

Thank you! I can check which upgrades we need in the Harmony API repository. However, I encountered some problems setting it up. Some of the tests are failing for me even without any changes. I am wondering if I am doing something wrong. https://gist.github.com/olp-cs/562d80b0a6888d5e3fc4085803431bb0

woodthom2 commented 1 month ago

Thank you so much for doing this!

I think Selenium failing is a different issue. It has started to fail recently. That is a test of the front end around the browser. I have no idea why it is failing but I noticed a few months ago. I have raised a new issue in the API repo to fix the Selenium tests -> https://github.com/harmonydata/harmonyapi/issues/7

For the OpenAI tests to run, you would need an OpenAI key in an environment variable. So you can ignore that one too. I will test that. (I think it's not the standard openAI key but an OpenAI key for calling OpenAI via the Microsoft Azure platform.) Likewise the Google Vertex AI uses an API key also but it looks like you got that working?

The remote tests are testing the API at https://api.harmonydata.ac.uk/docs, and the staging tests are testing the staging deployment at harmonystagingtmp.azurewebsites.net/docs. So both of these are not testing your code.

Finally, another thing is that the package Rocketry which we used in the API for scheduling tasks (save cache to disk periodically) hasn't been updated lately by the maintainer https://github.com/Miksus/rocketry/issues/2101, and it doesn't work with the latest version of pydantic.

You could remove Rocketry to upgrade pydantic in the API, then we'll have to find another solution for scheduling tasks. @zaironjacobs has found this other package https://github.com/agronholm/apscheduler for running tasks periodically that has no issues with the latest pydantic.

zaironjacobs commented 1 month ago

@olp-cs I used rocketry for running tasks periodically in the api, you may encounter problems with pydantic while upgrading. You can remove the code that uses rocketry, I believe this is in scheduler.py and main.py.

After pydantic has been upgraded, I will use this package APScheudler to add the periodic tasks again.

zaironjacobs commented 1 month ago

@olp-cs The value for the env GOOGLE_APPLICATION_CREDENTIALS should be the JSON object. So the content of your file service-account-file.json.

zaironjacobs commented 3 weeks ago

@olp-cs There is one more change required in src/harmony/schemas/responses/text.py for the upgrade.

__root__: List[Instrument] should be changed to RootModel(List[Instrument]).

RootModel can be imported from pydantic.

olp-cs commented 3 weeks ago

Related PR in the API repo: https://github.com/harmonydata/harmonyapi/pull/8

@zaironjacobs Thank you! I have also noticed this issue with __root__ when running the API. I wrote the solution slightly differently--not sure what is the best way. But it doesn't seem to be covered by any of the existing tests in this repository.

Update: I found an explanation: both uses of RootModel are almost identical; the only difference is that we can set model_config when it is defined explicitly.