Closed culpgrant closed 1 week ago
Well, at least importing zope.interface (and twisted) instead of scrapy doesn't reproduce the error (I really hoped that would be the problem).
I was able to reproduce this issue by importing twisted.internet.ssl, the module that exposes Certificate:
(great_expectations) ➜ scrapy git:(master) ✗ python
Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import twisted.internet.ssl
>>> import great_expectations
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/__init__.py", line 32, in <module>
register_core_expectations()
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/registry.py", line 187, in register_core_expectations
from great_expectations.expectations import core # noqa: F401
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/__init__.py", line 1, in <module>
from .expect_column_distinct_values_to_be_in_set import (
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/expect_column_distinct_values_to_be_in_set.py", line 12, in <module>
from great_expectations.expectations.expectation import (
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 2350, in <module>
class BatchExpectation(Expectation, ABC):
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 287, in __new__
newclass._register_renderer_functions()
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 369, in _register_renderer_functions
attr_obj: Callable = getattr(cls, candidate_renderer_fn_name)
AttributeError: __provides__. Did you mean: '__providedBy__'?
Importing Certificate directly from its internal package seems to work:
(great_expectations) ➜ scrapy git:(master) ✗ python
Python 3.10.14 (main, Mar 19 2024, 21:46:16) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from twisted.internet._sslverify import Certificate
>>> import great_expectations
So it must be related to the twisted.internet.ssl init code.
I kept digging in the Twisted code and the culprit seems to be the BaseConnector(ABC) class at https://github.com/twisted/twisted/blob/1c80aad4c8fd2d0142433476bd5f6df5c511b4ba/src/twisted/internet/base.py#L1224
For some reason, the implementer decorator adds __provides__ to both the BaseConnector and ABC classes:
>>> from zope.interface import classImplements, implementer
>>> from twisted.internet.interfaces import IConnector
>>> from abc import ABC
>>> @implementer(IConnector)
... class Test2(ABC):
... pass
>>> import great_expectations
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/__init__.py", line 32, in <module>
register_core_expectations()
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/registry.py", line 187, in register_core_expectations
from great_expectations.expectations import core # noqa: F401
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/__init__.py", line 1, in <module>
from .expect_column_distinct_values_to_be_in_set import (
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/core/expect_column_distinct_values_to_be_in_set.py", line 12, in <module>
from great_expectations.expectations.expectation import (
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 2350, in <module>
class BatchExpectation(Expectation, ABC):
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 287, in __new__
newclass._register_renderer_functions()
File "/Users/zyte/workspace/opensource/scrapy/.tox/great_expectations/lib/python3.10/site-packages/great_expectations/expectations/expectation.py", line 369, in _register_renderer_functions
attr_obj: Callable = getattr(cls, candidate_renderer_fn_name)
AttributeError: __provides__. Did you mean: '__providedBy__'?
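Both tracebacks end at getattr(cls, candidate_renderer_fn_name) raising AttributeError for __provides__. As a rough illustration of how a name can be visible on a class and still fail attribute lookup on a subclass, here is a stdlib-only sketch. OwnerOnly, Base and Child are hypothetical names, and the assumption that the candidate names come from something like dir(cls) is mine; this shows the general descriptor mechanism and is not a copy of the zope.interface or Great Expectations code.

```python
class OwnerOnly:
    """Hypothetical descriptor that resolves only on the class it was declared for."""

    def __init__(self, owner_name):
        self.owner_name = owner_name

    def __get__(self, instance, owner):
        # Class-level access on the declaring class works ...
        if instance is None and owner.__name__ == self.owner_name:
            return self
        # ... any other access (subclass or instance) behaves as if the attribute were missing.
        raise AttributeError("__provides__")


class Base:
    pass


# Attach the descriptor after class creation, roughly like a class decorator would.
Base.__provides__ = OwnerOnly("Base")


class Child(Base):
    pass


# dir() collects names from the class and its bases, so the name is listed.
listed = "__provides__" in dir(Child)

# Looking the name up on the subclass still fails.
try:
    getattr(Child, "__provides__")
    child_lookup_failed = False
except AttributeError:
    child_lookup_failed = True

print(listed, child_lookup_failed)
```

If something similar happens here, any code that walks the attribute names of a class and calls getattr on each one without guarding it would fail on a subclass of the affected base.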
@VMRuiz Thank you for looking into this! Do you think I should create an issue with zope?
To be honest, I don't know if this is a problem with Zope or a bad implementation by Twisted lib. @wRAR What do you think?
As a workaround for Scrapy, maybe we could use from twisted.internet._sslverify import Certificate in the meantime to avoid these side effects? There is some risk of this breaking in the future, but I wouldn't expect major changes from Twisted at this point.
My first thought was also "I don't know if this is a problem with Zope or a bad implementation by Twisted lib", as I'm not familiar with the zope.interface internals.
@culpgrant
Steps to Reproduce This does work:
import great_expectations
import scrapy
If this works, what prevents you from just using this import order in your task?
Considering https://github.com/great-expectations/great_expectations/issues/9698#issuecomment-2051252373, I think this issue is not related to Scrapy and that its root cause is 100% in the Great Expectations codebase (it can be solved by adding a simple try/except block around the line that raised the AttributeError).
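To make the try/except idea above concrete, here is a minimal sketch of guarding a getattr call while scanning the attributes of a class. safe_class_attrs, Broken and Demo are hypothetical names used only for illustration; the actual Great Expectations code in _register_renderer_functions may be structured differently.

```python
def safe_class_attrs(cls):
    """Return {name: value} for the names in dir(cls), skipping any that fail to resolve."""
    found = {}
    for name in dir(cls):
        try:
            found[name] = getattr(cls, name)
        except AttributeError:
            # The name is listed but cannot be resolved on this class; ignore it.
            continue
    return found


class Broken:
    """Hypothetical descriptor that always raises, standing in for a problematic attribute."""

    def __get__(self, instance, owner):
        raise AttributeError("unavailable")


class Demo:
    unavailable = Broken()

    def render(self):
        return "rendered"


attrs = safe_class_attrs(Demo)
print("render" in attrs, "unavailable" in attrs)
```

The scan keeps working for every attribute that resolves normally and silently drops the ones that do not, which is the behaviour the comment above is asking for.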
try...except block: wrap the import statements in a try...except block to handle the import error gracefully and provide an informative message:
try:
    import great_expectations
    import scrapy
except (ImportError, AttributeError) as e:  # the failure reported in this thread is an AttributeError
    print(f"Error importing libraries: {e}")
If you don't need Great Expectations functionality throughout your script, consider delaying the import until the point of use with a function:
import scrapy

def use_great_expectations():
    # Import only when needed
    from great_expectations import some_great_expectations_function
    print("Great Expectations used")

use_great_expectations()
Make sure you're using a clean virtual environment to avoid conflicts with other installed packages. Reinstall Scrapy and Great Expectations in a fresh environment to see if that resolves the issue. Check for version compatibility between Scrapy and Great Expectations.
This was determined to be a Great Expectations issue.
The real question is why the import is modifying a dependency instead of making a duplicate and modifying the copy. If that's what's going on, I think it's bad practice.
Description
I am trying to use Scrapy and Great Expectations in the same virtual environment but there is an issue depending on the order I import the packages in.
I created an issue for Great Expectations with additional details.
They were mentioning it might be something with abc being monkey-patched.
Steps to Reproduce
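This does work:
import great_expectations
import scrapy
This does not work:
import scrapy
import great_expectations
Error:
AttributeError: __provides__. Did you mean: '__providedBy__'?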
Expected behavior: Be able to use the packages together in the same virtual environment
Actual behavior: Cannot import the packages together
Reproduces how often: 100%
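Because the result depends on import order within a single interpreter, one way to check both orders reliably is to run each in a fresh Python process. The sketch below assumes both packages are installed in the active environment; imports_ok is a hypothetical helper, not part of Scrapy or Great Expectations.

```python
import subprocess
import sys


def imports_ok(modules):
    """Import the given modules, in order, in a fresh interpreter; True if it exits cleanly."""
    code = "\n".join(f"import {name}" for name in modules)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Check both import orders described in this issue.
    for order in (["great_expectations", "scrapy"], ["scrapy", "great_expectations"]):
        status = "ok" if imports_ok(order) else "fails"
        print(" then ".join(order), "->", status)
```

Running each order in its own process avoids state left over in sys.modules from a previous attempt, so the two checks do not influence each other.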
Versions
Scrapy 2.11.1
great-expectations 0.18.12
Additional context
Looking for a possible solution on what could be done. Thank you!