Closed olp-cs closed 3 months ago
Thanks so much for doing this PR! I'm on holiday this week so didn't go through thoroughly yet but it looks great and looks like you considered everything. I'll run it locally next week and then merge the PR. Can you run the API on your local machine after your changes? Thomas
On sam. 20 juil. 2024 à 19:36 olp-cs @.***> wrote:
Description Fixes #50 https://github.com/harmonydata/harmony/issues/50 — Upgrade Harmony to latest Pydantic 2.8.2 or later
Version upgrade: Pydantic 1.10.7 → 2.8.2
I updated the code according to the Pydantic V2 Migration Guide https://docs.pydantic.dev/latest/migration/:
1. Changes to pydantic.BaseModel
- The parse_obj method is deprecated; renamed to model_validate.
- The dict method is deprecated; renamed to model_dump.
2. Changes to config
- Support for class-based config is deprecated; using ConfigDict instead.
- schema_extra renamed to json_schema_extra.
3. Changes to Handling of Standard Types → Required, optional, and nullable fields
- Fields with a default value of None are now declared as Optional (to fix the error "Input should be a valid string").
Most of the changes are in src/harmony/schemas/requests/text.py. Type of change
- Bug fix (non-breaking change which fixes an issue)
- Requires a documentation revision(?)
Testing
My changes generate no new warnings.
Before the changes:
- tox -e py310: 27 passed, 38 warnings
- tox -e py39: 27 passed, 38 warnings
- tox -e py38: ERROR: No matching distribution found for numpy==1.26.4
- tox -e py37: ERROR: No matching distribution found for numpy==1.26.4
According to NumPy 1.26.4 release notes https://numpy.org/devdocs/release/1.26.4-notes.html, "The Python versions supported by this release are 3.9-3.12" (should I fill this as a separate issue?).
After the changes:
- tox -e py310: 27 passed, 38 warnings
- tox -e py39: 27 passed, 38 warnings
Test Configuration
- Library version: tox==4.16.0
- OS: macOS 11.7.10
- Toolchain: PyCharm 2024.1.4
Checklist
- My code follows the style guidelines of this project
- I have performed a self-review of my own code
- I have commented my code, particularly in hard-to-understand areas
- I have made corresponding changes to the documentation
- My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature works
- New and existing unit tests pass locally with my changes
- Any dependent changes have been merged and published in downstream modules
- I have checked my code and corrected any misspellings
You can view, comment on, or merge this pull request online at:
https://github.com/harmonydata/harmony/pull/51 Commit Summary
- 1399ad3 https://github.com/harmonydata/harmony/pull/51/commits/1399ad3ba349d32a976d0bd8b32750e7105a3386 Upgrade to Pydantic V2 (2.8.2)
- 330afd1 https://github.com/harmonydata/harmony/pull/51/commits/330afd1babc53d33c89fb1bb3251a65718a2d84e Reformat to avoid multi-line diffs
File Changes
(12 files https://github.com/harmonydata/harmony/pull/51/files)
- M pyproject.toml https://github.com/harmonydata/harmony/pull/51/files#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711 (2)
- M requirements.txt https://github.com/harmonydata/harmony/pull/51/files#diff-4d7c51b1efe9043e44439a949dfd92e5827321b34082903477fd04876edb7552 (2)
- M src/harmony/examples.py https://github.com/harmonydata/harmony/pull/51/files#diff-6f4275a1666f1455a3248b229dc8da9f41fd65829f385ebda4e6f77cf58b9975 (2)
- M src/harmony/matching/matcher.py https://github.com/harmonydata/harmony/pull/51/files#diff-3e9cfa3d12f34b9f29939718bab2203ffbb3f34e450bb7c82a87b18a7fcb7273 (2)
- M src/harmony/schemas/requests/text.py https://github.com/harmonydata/harmony/pull/51/files#diff-8f9bb7594795688f1a7646acf38acbb7813b01d28b91769d5186069300201eb6 (73)
- M src/harmony/util/instrument_helper.py https://github.com/harmonydata/harmony/pull/51/files#diff-fef76724325900be0b88a224383230a0e8230b34c45b807eb462fd2aa82a4440 (2)
- M tests/test_convert_excel_openpyxl.py https://github.com/harmonydata/harmony/pull/51/files#diff-ed977fd09851e0667c5df5c98557217a64f687dff275240c6e6ba11ba9910c26 (2)
- M tests/test_convert_excel_xlsxwriter.py https://github.com/harmonydata/harmony/pull/51/files#diff-f6dfcff36ae0b5f48dae5e9c584db9fbcac9ebd39c92215e8a2efeb8f5543427 (2)
- M tests/test_convert_pdf.py https://github.com/harmonydata/harmony/pull/51/files#diff-5eb71d7a12aa95e54175c27bc491c31af4ce83294aecdb2f19f70b53f1de9bbc (2)
- M tests/test_convert_text.py https://github.com/harmonydata/harmony/pull/51/files#diff-055025bc5c18f7f41044247eb34e0f035e943b5c1232140f47b8f963d814cf6d (2)
- M tests/test_match.py https://github.com/harmonydata/harmony/pull/51/files#diff-2b22ce50fb090a57a106007280d0785c6c040a132bff2638105523cf88cf69b3 (4)
- M tests/test_pdf_tables.py https://github.com/harmonydata/harmony/pull/51/files#diff-bc330a6d5c06a78bbfdfa573d81f19186fb351fbb4d602edcdd5daf11df588c6 (4)
Patch Links:
- https://github.com/harmonydata/harmony/pull/51.patch
- https://github.com/harmonydata/harmony/pull/51.diff
— Reply to this email directly, view it on GitHub https://github.com/harmonydata/harmony/pull/51, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADUBTVOZJCZAZCW4ULALZKDZNKU3RAVCNFSM6AAAAABLGEQKOKVHI2DSMVQWIX3LMV43ASLTON2WKOZSGQZDCMBRGA2DEMY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
Thank you! I still have to set up the API repo. I'll try to run it locally this week.
Thanks @olp-cs . I have tested your fixes in the Harmony library, and it works fine. What do we need to do in the API repo? Did you get a chance to look at that? https://github.com/harmonydata/harmonyapi
I just tried running the API and I got this error:
Traceback (most recent call last):
File "/tmp/harmonyolpapi/main.py", line 38, in <module>
from harmony_api.core.settings import settings
File "/tmp/harmonyolpapi/harmony_api/core/settings.py", line 31, in <module>
from pydantic import BaseSettings
File "/tmp/harmonyolp/.venv/lib/python3.11/site-packages/pydantic/__init__.py", line 395, in __getattr__
return _getattr_migration(attr_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/harmonyolp/.venv/lib/python3.11/site-packages/pydantic/_migration.py", line 296, in wrapper
raise PydanticImportError(
pydantic.errors.PydanticImportError: `BaseSettings` has been moved to the `pydantic-settings` package. See https://docs.pydantic.dev/2.8/migration/#basesettings-has-moved-to-pydantic-settings for more details.
For further information visit https://errors.pydantic.dev/2.8/u/import-error
so we definitely need some upgrades within the API repo
Thank you! I can check which upgrades we need in the Harmony API repository. However, I encountered some problems setting it up. Some of the tests are failing for me even without any changes. I am wondering if I am doing something wrong. https://gist.github.com/olp-cs/562d80b0a6888d5e3fc4085803431bb0
Thank you so much for doing this!
I think Selenium failing is a different issue. It has started to fail recently. That is a test of the front end around the browser. I have no idea why it is failing but I noticed a few months ago. I have raised a new issue in the API repo to fix the Selenium tests -> https://github.com/harmonydata/harmonyapi/issues/7
For the OpenAI tests to run, you would need an OpenAI key in an environment variable. So you can ignore that one too. I will test that. (I think it's not the standard openAI key but an OpenAI key for calling OpenAI via the Microsoft Azure platform.) Likewise the Google Vertex AI uses an API key also but it looks like you got that working?
The remote tests are testing the API at https://api.harmonydata.ac.uk/docs, and the staging tests are testing the staging deployment at harmonystagingtmp.azurewebsites.net/docs. So both of these are not testing your code.
Finally, another thing is that the package Rocketry which we used in the API for scheduling tasks (save cache to disk periodically) hasn't been updated lately by the maintainer https://github.com/Miksus/rocketry/issues/2101, and it doesn't work with the latest version of pydantic.
You could remove Rocketry to upgrade pydantic in the API, then we'll have to find another solution for scheduling tasks. @zaironjacobs has found this other package https://github.com/agronholm/apscheduler for running tasks periodically that has no issues with the latest pydantic.
@olp-cs I used rocketry
for running tasks periodically in the api, you may encounter problems with pydantic
while upgrading. You can remove the code that uses rocketry
, I believe this is in scheduler.py
and main.py
.
After pydantic
has been upgraded, I will use this package APScheudler to add the periodic tasks again.
@olp-cs The value for the env GOOGLE_APPLICATION_CREDENTIALS
should be the JSON object. So the content of your file service-account-file.json
.
@olp-cs There is one more change required in src/harmony/schemas/responses/text.py
for the upgrade.
__root__: List[Instrument]
should be changed to RootModel(List[Instrument])
.
RootModel
can be imported from pydantic
.
Related PR in the API repo: https://github.com/harmonydata/harmonyapi/pull/8
@zaironjacobs Thank you! I have also noticed this issue with __root__
when running the API. I wrote the solution slightly differently--not sure what is the best way. But it doesn't seem to be covered by any of the existing tests in this repository.
Update: I found an explanation: both uses of RootModel
are almost identical; the only difference is that we can set model_config
when it is defined explicitly.
Description
Fixes #50 — Upgrade Harmony to latest Pydantic 2.8.2 or later
Version upgrade: Pydantic 1.10.7 → 2.8.2
I updated the code according to the Pydantic V2 Migration Guide:
1. Changes to
pydantic.BaseModel
parse_obj
method is deprecated; renamed tomodel_validate
.dict
method is deprecated; renamed tomodel_dump
.2. Changes to config
config
is deprecated; usingConfigDict
instead.schema_extra
renamed tojson_schema_extra
.3. Changes to Handling of Standard Types → Required, optional, and nullable fields
None
are now declared asOptional
(to fix the error "Input should be a valid string").Most of the changes are in
src/harmony/schemas/requests/text.py
.Type of change
Testing
My changes generate no new warnings.
Before the changes:
tox -e py310
: 27 passed, 38 warningstox -e py39
: 27 passed, 38 warningstox -e py38
: ERROR: No matching distribution found for numpy==1.26.4tox -e py37
: ERROR: No matching distribution found for numpy==1.26.4According to NumPy 1.26.4 release notes, "The Python versions supported by this release are 3.9-3.12" (should I fill this as a separate issue?).
After the changes:
tox -e py310
: 27 passed, 38 warningstox -e py39
: 27 passed, 38 warningsTest Configuration
Checklist