Open da-sbarde opened 1 year ago
Hi @da-sbarde
Could you share your config file (great_expectations.yml), the code block that generates this error, and the full stack trace so we can investigate this issue?
Sure, here is the full traceback:
2023-05-10 13:14:56,378 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(77)): Error from Python:Traceback (most recent call last):
File "/tmp/transformer.py", line 3, in <module>
And this is the YAML config string I am using:
f"""
config_version: 3.0
datasources:
  spark_s3:
    module_name: great_expectations.datasource
    class_name: Datasource
    execution_engine:
      module_name: great_expectations.execution_engine
      class_name: SparkDFExecutionEngine
    data_connectors:
      default_inferred_data_connector_name:
        class_name: InferredAssetS3DataConnector
        bucket: {data_connector_bucket}
        prefix: {data_connector_prefix}
        default_regex:
          pattern: (.*)
          group_names:
            - data_asset_name
      default_runtime_data_connector_name:
        batch_identifiers:
config_variables_file_path: {config_variable_file_path}
plugins_directory: {plugins_dir}
stores:
  expectations_S3_store:
    class_name: ExpectationsStore
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: {expectations_bucket}
      prefix: {expectations_prefix}
  validations_S3_store:
    class_name: ValidationsStore
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: {validation_bucket}
      prefix: {validation_prefix}
  evaluation_parameter_store:
    class_name: EvaluationParameterStore
  checkpoint_S3_store:
    class_name: CheckpointStore
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: {checkpoint_bucket}
      prefix: {checkpoint_prefix}
expectations_store_name: expectations_S3_store
validations_store_name: validations_S3_store
evaluation_parameter_store_name: evaluation_parameter_store
checkpoint_store_name: checkpoint_S3_store
data_docs_sites:
  s3_site:
    class_name: SiteBuilder
    show_how_to_buttons: true
    store_backend:
      class_name: TupleS3StoreBackend
      bucket: {doc_bucket}
      prefix: {doc_prefix}
    site_index_builder:
      class_name: DefaultSiteIndexBuilder
"""
Thanks @da-sbarde! It seems there is a compatibility issue between boto3 and urllib3. Are you able to pin urllib3<2 in your environment?
More here: https://github.com/boto/botocore/issues/2926#issuecomment-1538900780
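A quick way to check whether a given environment is likely affected, as a minimal sketch: the linked botocore issue attributes the ImportError to urllib3 2.x no longer exposing DEFAULT_CIPHERS in urllib3.util.ssl_. The helper names `urllib3_major` and `needs_pin` are hypothetical. The check only inspects the version string, so it does not show whether pinning is safe alongside the other packages in the job.

```python
from importlib import metadata


def urllib3_major(version: str) -> int:
    """Return the major component of a version string such as '1.26.15'."""
    return int(version.split(".")[0])


def needs_pin(version: str) -> bool:
    """True when the version is 2.x or later (assumed to be the incompatible range)."""
    return urllib3_major(version) >= 2


if __name__ == "__main__":
    try:
        installed = metadata.version("urllib3")
    except metadata.PackageNotFoundError:
        print("urllib3 is not installed in this environment")
    else:
        status = "consider pinning urllib3<2" if needs_pin(installed) else "already below 2"
        print(f"urllib3 {installed}: {status}")
```

Running this at the top of the Glue script, before importing boto3 or great_expectations, would show which urllib3 version the job actually resolved.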
Hi @tjholsman, since we are using Great Expectations along with AWS Glue, I don't think we can pin urllib3 to a particular version, as this would cause issues with other implementations.
I tested with the latest version of GE and am still facing the same issue: ImportError: cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_' (/home/spark/.local/lib/python3.10/site-packages/urllib3/util/ssl_.py)
@tjholsman Do you or anyone else have any update on this issue?
@jayesh-patel-ig have you tried adding the key-value pair under the Job parameters section of the Job details tab in the AWS Glue console? Adding urllib3<2 there should fix the above issue.
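For reference, the job parameter being discussed can be sketched as a key-value pair. `--additional-python-modules` is the Glue job parameter referred to later in this thread, and it takes a comma-separated list of pip requirement strings. The helper name `build_job_parameters` and the specific pins below are illustrative assumptions, not a combination confirmed to work in this thread.

```python
def build_job_parameters(modules):
    """Return a Glue job-parameter mapping that asks Glue to pip-install the given modules."""
    # Drop empty entries and surrounding whitespace before joining with commas.
    cleaned = [m.strip() for m in modules if m.strip()]
    return {"--additional-python-modules": ",".join(cleaned)}


# Hypothetical pins: 0.16.7 is the version reported as working in this thread,
# and urllib3<2 is the suggested workaround. Their combination is untested here.
params = build_job_parameters(["great_expectations==0.16.7", "urllib3<2"])

for key, value in params.items():
    print(f"Key: {key}  Value: {value}")
```

The same key and value can be typed into the console's Job parameters section, or passed as `DefaultArguments` when the job is created through boto3.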
@lodsdevera I tried the above suggestion, and it results in the following error: _RefResolutionError: 'bytes' object has no attribute 'timeout'.
This is the complete traceback.
  File "/home/spark/.local/lib/python3.10/site-packages/jsonschema/validators.py", line 1087, in resolve_from_url
    document = self.store[url]
  File "/home/spark/.local/lib/python3.10/site-packages/jsonschema/_utils.py", line 20, in __getitem__
    return self.store[self.normalize(uri)]
KeyError: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/spark/.local/lib/python3.10/site-packages/jsonschema/validators.py", line 1090, in resolve_from_url
    document = self.resolve_remote(url)
  File "/home/spark/.local/lib/python3.10/site-packages/jsonschema/validators.py", line 1194, in resolve_remote
    with urlopen(uri) as url:
  File "/usr/local/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/lib/python3.10/urllib/request.py", line 509, in open
    req.timeout = timeout
AttributeError: 'bytes' object has no attribute 'timeout'
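The last frames of this traceback can be reproduced with the standard library alone, which may help narrow things down. This is a minimal sketch: the function name `reproduce` is hypothetical, and it only shows that `urlopen` fails this way when handed a bytes URI such as the `b''` in the KeyError above. Why jsonschema ends up with a bytes URI in the Glue environment is not established in this thread.

```python
from urllib.request import urlopen


def reproduce(uri):
    """Call urlopen with the given URI and return the AttributeError message, if any."""
    try:
        # For a non-str argument, urlopen treats the value as a Request object
        # and tries to set .timeout on it, which fails before any connection is made.
        urlopen(uri)
    except AttributeError as exc:
        return str(exc)
    return None


message = reproduce(b"")
print(message)
```

If the message matches the one in the traceback, the failure is in how the schema reference is resolved rather than in boto3 or urllib3 themselves.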
@da-sbarde oh, I started encountering this too last night, after replying here. Before last night, our Glue job with GX was working fine, so something else must have changed. I haven't tried running it today. I also tried pinning the GX version to < 0.17 in the additional-python-modules parameter, but I still get the same error.
I'm not sure who to tag here regarding this.
@da-sbarde I reran the same Glue job without any changes, and the error did not occur this time.
@lodsdevera, the Glue job without a pinned great_expectations version worked for a while, but it is giving the same error now. It works with version 0.16.7, though.
ImportError: cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_'.
Below is the full traceback.
File "/home/spark/.local/lib/python3.10/site-packages/boto3/__init__.py", line 17, in <module>
@lodsdevera any luck with the issue?
We have a Glue 2.0 script that uses the Great Expectations module. The jobs fail for versions > 0.16.7 with the error: ImportError: cannot import name 'DEFAULT_CIPHERS' from 'urllib3.util.ssl_' (/home/spark/.local/lib/python3.10/site-packages/urllib3/util/ssl_.py)
Everything works fine with Great Expectations version 0.16.7. I am using spark_s3 as the datasource and S3 as the store backend.