Closed Criamos closed 2 months ago
This PR includes the following changes:
- `edu_sharing_client/`
- The `openapi-generator-cli`-generated client can be found within `edu_sharing_openapi/edu_sharing_client/`
- `es_connector.py`: API Client initialization for edu-sharing v9.x connections (`es_connector.py` used just a handful of method calls from the generated API client, therefore just a few method parameters needed to be changed to fit the new API model)
- `black`, `certifi`, `flake8`, `httpx`, `trafilatura`, `pytest`, `wheel`
- `ValidationError`s in older crawlers
- `dockerfile`
- `requirements.txt` in favor of using Poetry with `pyproject.toml` and `poetry.lock`
Attention: If you encounter `pydantic` `ValidationError`s while crawling

The Python generator of `openapi-generator-cli` uses `pydantic` for its data models and validation of API calls, which allows us to catch errors before they end up in the edu-sharing back-end.

When running crawlers you might encounter `pydantic`-related validation errors that you haven't seen before. These new error messages are super helpful when debugging and allow you to catch small oversights (like missing type-casts) before the data gets saved to the back-end.

Compared to the previous API client, the boot-up time of the crawlers is expected to be slower due to the additional validation steps at warm-up. You will notice a delay when starting an individual `scrapy.Spider` before the actual crawl process begins. (We hope that future `pydantic` versions will reduce the initial boot-up time with additional optimizations.)

Documentation
Since these changes required extensive research beforehand, the `oeh-search-etl` GitHub Wiki received two additional chapters to make future project maintenance a little easier to handle:
- `openapi-generator-cli` and Docker
- `poetry` commands to keep the dependencies of this project up to date
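
To illustrate the kind of `pydantic` validation error mentioned in the attention note above, here is a minimal sketch of a missing type-cast being caught before any API call is made. The `NodeProperties` model and the `build_properties` helper are hypothetical stand-ins and are not part of the generated `edu_sharing_client`; the sketch only assumes that `pydantic` is installed and that the generated models use strict field types such as `StrictStr`.

```python
from typing import List

from pydantic import BaseModel, StrictStr, ValidationError


class NodeProperties(BaseModel):
    """Hypothetical stand-in for a generated API model (not the real client)."""

    title: StrictStr
    keywords: List[StrictStr]


def build_properties(title, keywords):
    """Return a validated model, or None if pydantic rejects the input."""
    try:
        return NodeProperties(title=title, keywords=keywords)
    except ValidationError as err:
        # Each error entry names the offending field in its "loc" tuple.
        for problem in err.errors():
            print(problem["loc"], problem["msg"])
        return None


# A scraped value that was never cast to str is rejected by the strict field.
rejected = build_properties(title=2024, keywords=["physics"])

# With the missing type-cast added, the same data validates.
accepted = build_properties(title=str(2024), keywords=["physics"])
```

In a crawler, the equivalent fix is usually to cast the scraped value where the item is built, so that the model only ever receives the types it declares.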