Closed vemonet closed 1 year ago
The reasoner-validator was removed from the ARAX system a week or two ago for this very reason. We were unable to deliver a stable product with it included. It's a pity, because our viewer is choking on a lot of bad TRAPI, but without validation, no one is aware.
I don't know how to solve this problem.
Maybe one possibility is to have a reasoner-validator-lite that does not have all these dependencies.
Today we are getting a different error: AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
def validate_infores(self, context: str, edge_id: str, identifier: str):
    code_prefix: str = f"error.knowledge_graph.edge.sources.retrieval_source.{context}.infores"
    if not is_curie(identifier):
        self.report(
            code=f"{code_prefix}.not_curie",
            identifier=identifier,
            edge_id=edge_id
        )
    elif not identifier.startswith("infores:"):
        # not sure how absolute the need is for this to be an Infores. We'll be lenient for now?
        self.report(
            code=f"{code_prefix}.invalid",
            identifier=identifier,
            edge_id=edge_id
        )
>   elif not self.bmt.get_infores_details(identifier):
E   AttributeError: 'Toolkit' object has no attribute 'get_infores_details'

/opt/hostedtoolcache/Python/3.9.16/x64/lib/python3.9/site-packages/reasoner_validator/biolink/__init__.py:698: AttributeError
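One defensive workaround, sketched below, is to look the method up with `getattr` before calling it, so that a BMT release which has dropped `get_infores_details` degrades to skipping the infores-catalog check instead of crashing. The class and stub names here are hypothetical illustrations, not the project's actual fix:

```python
class InforesValidator:
    """Minimal sketch of a validator that tolerates BMT releases
    where Toolkit.get_infores_details has been removed."""

    def __init__(self, bmt_toolkit):
        self.bmt = bmt_toolkit

    def get_infores_details(self, identifier: str):
        # Guard: only call the method if this BMT release still has it.
        getter = getattr(self.bmt, "get_infores_details", None)
        if getter is None:
            # BMT no longer ships the infores catalog; skip this check.
            return None
        return getter(identifier)


class LegacyBMT:
    """Stub mimicking an older BMT Toolkit that still has the method."""
    def get_infores_details(self, identifier):
        return {"id": identifier}


class CurrentBMT:
    """Stub mimicking a BMT release where the method was removed."""
    pass


legacy_details = InforesValidator(LegacyBMT()).get_infores_details("infores:aragorn")
missing_details = InforesValidator(CurrentBMT()).get_infores_details("infores:aragorn")
```

With the legacy stub the details come back as before; with the current stub the validator silently skips the check rather than raising `AttributeError`.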
I ran pytest on the master branch of the reasoner-validator repo and got the following errors:
FAILED tests/test_biolink_compliance_validation.py::test_validate_qualifiers[query12] - AssertionError: Unexpected messages seen {'errors': {'error.knowledge_graph.edge.qualifiers.qualifier.value.unresolved': {'UBERON:0001981': [{'edge_id': 'validate_qualifiers unit test', 'qualifier_type_id': 'biolink:anatomical_contex...
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query0] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query5] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query6] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query7] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query8] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query9] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query10] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query11] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query12] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query13] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query14] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query15] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query16] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query17] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query18] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query19] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query20] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query21] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query22] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query23] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query24] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_check_biolink_model_compliance_of_knowledge_graph[query25] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_latest_trapi_validate_sources[sources0-] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_biolink_compliance_validation.py::test_latest_trapi_validate_sources[sources8-error.knowledge_graph.edge.sources.retrieval_source.resource_id.infores.unknown] - AttributeError: 'Toolkit' object has no attribute 'get_infores_details'
FAILED tests/test_validate.py::test_query_latest_trapi_workflow_properties[1] - jsonschema.exceptions.ValidationError: {'id': 'lookup', 'runner_parameters': {'allowlist': ['infores:aragorn']}} is not valid under any of the given schemas
FAILED tests/test_validate.py::test_query_latest_trapi_workflow_properties[1.4] - jsonschema.exceptions.ValidationError: {'id': 'lookup', 'runner_parameters': {'allowlist': ['infores:aragorn']}} is not valid under any of the given schemas
FAILED tests/test_validate.py::test_query_latest_trapi_workflow_properties[1.4.0] - jsonschema.exceptions.ValidationError: {'id': 'lookup', 'runner_parameters': {'allowlist': ['infores:aragorn']}} is not valid under any of the given schemas
FAILED tests/test_validate.py::test_query_latest_trapi_workflow_properties[1.4.0-beta4] - jsonschema.exceptions.ValidationError: {'id': 'lookup', 'runner_parameters': {'allowlist': ['infores:aragorn']}} is not valid under any of the given schemas
FAILED tests/test_workflows.py::test_query_latest_trapi_workflow_properties[1-query0] - jsonschema.exceptions.ValidationError: [{'id': 'sort_results_score', 'parameters': {'ascending_or_descending': 'ascending'}, 'runner_parameters': {'allowlist': ['infores:aragorn']}}, {'id': 'lookup', 'runner_parameters': {'allowlist'...
FAILED tests/test_workflows.py::test_query_latest_trapi_workflow_properties[1.4-query2] - jsonschema.exceptions.ValidationError: [{'id': 'sort_results_score', 'parameters': {'ascending_or_descending': 'ascending'}, 'runner_parameters': {'allowlist': ['infores:aragorn']}}, {'id': 'lookup', 'runner_parameters': {'allowlist'...
FAILED tests/test_workflows.py::test_query_latest_trapi_workflow_properties[1.4.0-query4] - jsonschema.exceptions.ValidationError: [{'id': 'sort_results_score', 'parameters': {'ascending_or_descending': 'ascending'}, 'runner_parameters': {'allowlist': ['infores:aragorn']}}, {'id': 'lookup', 'runner_parameters': {'allowlist'...
FAILED tests/test_workflows.py::test_query_latest_trapi_workflow_properties[1.4.0-beta4-query6] - jsonschema.exceptions.ValidationError: [{'id': 'sort_results_score', 'parameters': {'ascending_or_descending': 'ascending'}, 'runner_parameters': {'allowlist': ['infores:aragorn']}}, {'id': 'lookup', 'runner_parameters': {'allowlist'...
================================================================================================= 33 failed, 307 passed in 68.27s (0:01:08) ==================================================================================================
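The `test_validate.py` and `test_workflows.py` failures above are the classic `anyOf` mismatch: the workflow step carries a property (`runner_parameters`) that none of the permitted sub-schemas allows. A toy pure-Python illustration of the pattern (the schema below is a hypothetical stand-in, not the real TRAPI workflow schema):

```python
def matches(subschema: dict, obj: dict) -> bool:
    """Toy check: obj must have all required keys and no keys
    outside the sub-schema's declared properties."""
    if not all(k in obj for k in subschema.get("required", [])):
        return False
    allowed = set(subschema.get("properties", {}))
    return set(obj) <= allowed


def any_of(subschemas: list, obj: dict) -> bool:
    """Mimics jsonschema's anyOf: valid if any sub-schema matches."""
    return any(matches(s, obj) for s in subschemas)


# Hypothetical stand-in for the allowed workflow-step schemas:
schemas = [{"required": ["id"], "properties": {"id": {}, "parameters": {}}}]

plain_step = {"id": "lookup"}
step_with_runner = {"id": "lookup",
                    "runner_parameters": {"allowlist": ["infores:aragorn"]}}

plain_ok = any_of(schemas, plain_step)          # True: only allowed keys
runner_ok = any_of(schemas, step_with_runner)   # False: extra property rejected
```

When no sub-schema accepts the extra property, jsonschema reports exactly the "is not valid under any of the given schemas" error seen in the log; the fix is either to drop the property or to use a workflow schema version that declares it.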
It also fails with AttributeError: 'Toolkit' object has no attribute 'get_infores_details' for a few tests.
@vemonet - yes, we are removing features from BMT to make it a less heavyweight dependency, per the architecture meeting yesterday. That said, here we are importing KGX, which is another layer of dependency that we should try to reduce (from my first walk through the code yesterday, I see that we're only really using KGX for prefix and CURIE resolution, and for its import of BMT). Please see my PR from yesterday evening.
Hi Vincent, your point about adding basic unit-test GitHub Actions to reasoner-validator is well taken. That said, as others have noted, we may have to start pinning more external dependencies to specific, reliable versions in order to achieve stability, since most of the emergent errors in the project originate from newer releases of external dependencies that get pulled in inadvertently: several reasoner-validator dependencies are simply declared as "a given release or newer".
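For illustration, the difference between open-ended, bounded, and fully pinned specifiers in a requirements file (the version numbers here are hypothetical, not the project's actual pins):

```text
# Open-ended, "a given release or newer": silently pulls in future breaking releases.
bmt>=1.0
# Upper-bounded: still receives patch/minor fixes, but blocks the next major release.
oaklib>=2.0,<3.0
# Fully pinned: maximally stable, but requires manual upgrades.
kgx==1.7.0
```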
Hi @RichardBruskiewich, pinning might be a good idea for the few dependencies we know are causing trouble.
But I would be cautious about pinning too many dependencies, as it will probably make the reasoner-validator harder to import into other projects: if it requires very specific dependency versions, it will easily create conflicts with other packages that don't require those same versions. It will also prevent the reasoner-validator from getting the newest versions of its dependencies, which could bring security fixes and performance improvements. In the end, it will require the package maintainers to manually upgrade the pinned dependencies to newer versions.
There is no magical solution: for a package in active development that relies on dependencies also in active development, there will always be some work in maintaining and upgrading dependency versions, especially for a package that needs to be imported into other projects.
In my opinion, the easiest way for maintainers to manage a package's dependency versions is to limit the number of core dependencies as much as possible, declare the widest version range in which the package works, and maintain a good automated test suite to make sure the newest versions don't break anything. If the test suite fails due to a new dependency version, or a specific dependency keeps introducing breaking changes in minor releases, we can put an upper limit on that dependency, request a change to it, or make the change ourselves and send a PR.
Currently I don't think the reasoner-validator specifically needs more tests; there is already a good test suite (new tests for new features are always welcome, though!), but it needs to be better integrated into the development workflow. We should not be able to publish a new version if the tests are not passing, and we should regularly check that the tests still pass after the latest pushed changes.
If we apply this methodology to more of the packages in the Translator ecosystem (bmt, oaklib, kgx...), it should help catch issues before they go live and reduce the number of supply-chain issues, while still effortlessly getting upgrades from newer versions.
@vemonet - I agree with everything you've said here. I do want to clarify, though, that bmt, KGX, oaklib, linkml-runtime, and even Biolink Model all have extensive test suites running on every PR and every merge to the main branch. In fact, we restrict the main branch on those repos so that merging is not possible without passing tests. This repo is the only one in that chain that was not running tests on each PR, and we've fixed that this week. KGX also imports BMT, and we've not seen the same issues as are reported here (or we would not have released); same with oaklib, etc. I think inconsistent dependency management by downstream tools also contributes to these issues: e.g., using the >= operator in requirements files is not a bad idea, and helps with dependency churn in some cases, but it also allows daily changes to propagate into the repo. A locked dependency file that you can somewhat restrict might be safer.
Describe the bug We have a really simple test that just makes a request to our API in production every day and runs the reasoner-validator to check that the whole thing is compliant. Every other day the whole test fails even though we change nothing, and it's always because of the reasoner-validator dependency tree. This issue has been raised a few times in the past, and every time it has been temporarily solved, but like whack-a-mole another dependency conflict surfaces a few days after a fix has been brought in.
This time the issue seems to be related to a dependency of oaklib. The test is here: https://github.com/MaastrichtU-IDS/knowledge-collaboratory/blob/main/backend/tests/production/test_production_api.py
And in the test environment only the following dependencies are installed (cf. the dependencies and optional dependencies in https://github.com/MaastrichtU-IDS/knowledge-collaboratory/blob/main/backend/pyproject.toml#L27).
To Reproduce Steps to reproduce the behavior:
Expected behavior Dependencies should be better defined and better tested.
Tests have been defined in the tests/ folder, but they are currently not run by the GitHub Actions set up in this repo (there is one action to publish without testing, and one to update the docs). Ideally tests should be run on every push to the repository (like the GitHub Action that updates the docs), and they should run against the different supported versions of Python (3.9, 3.10, and 3.11 at the moment). Using the reasoner-validator is really important for us to make sure our output fits the ever-changing TRAPI/Biolink specifications. Unfortunately, the current state of the reasoner-validator forces us to spend hours per week fixing basic dependency issues that should be handled upstream by the reasoner-validator, bmt, oaklib, etc. Those tools and libraries should have automated tests on every commit, and those tests should capture these kinds of dependency failures by running in different environments. Unfortunately, at the moment our only long-term solution is to do just what the reasoner-validator is doing right now: leaving the testing to someone else (to the next people in line, which are the ARAs and ARX) and waiting to be notified by those actors if our TRAPI payload is not compliant.
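The CI setup described above (run the test suite on every push, across the supported Python versions) could be sketched as a GitHub Actions workflow. The file name, the `[test]` extras group, and the exact action versions are assumptions to be adapted to the repo:

```yaml
# Sketch: .github/workflows/test.yml (names and versions are assumptions)
name: Run tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.9", "3.10", "3.11"]
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      # Assumes the project declares its test dependencies in a "test" extra.
      - run: pip install .[test]
      - run: pytest
```

Combined with branch protection requiring this workflow to pass, publishing a release with failing tests becomes impossible.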