Open IgorShcherbakov opened 1 month ago
@IgorShcherbakov I think utf-8 is the default encoding for Python.
It's possible something in your environment is overriding this.
From this SO question, it looks like you can set the environment variable PYTHONIOENCODING.
export PYTHONIOENCODING=utf8
https://stackoverflow.com/a/27066059/6304433
Let me know if that addresses the issue.
@Kilo59 print(os.getenv("PYTHONIOENCODING")) returns UTF-8
"Exception type": "UnicodeDecodeError",
"Message": "'charmap' codec can't decode byte 0x98 in position 497: character maps to <undefined>"
"Details": "Traceback (most recent call last):
File \"C:\\...\\dwh-greatexpectations\\main.py\", line 188, in get_checkpoint_result checkpoint_result = checkpoint.run(
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\core\\usage_statistics\\usage_statistics.py\", line 266, in usage_statistics_wrapped_method
result = func(*args, **kwargs)
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\checkpoint\\checkpoint.py\", line 305, in run
self._run_validation(
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\checkpoint\\checkpoint.py\", line 480, in _run_validation
validator: Validator = self._validator or self.data_context.get_validator(
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\data_context\\data_context\\abstract_data_context.py\", line 2336, in get_validator
expectation_suite = self.get_expectation_suite(
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\data_context\\data_context\\abstract_data_context.py\", line 3022, in get_expectation_suite
dict, self.expectations_store.get(key)
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\data_context\\store\\expectations_store.py\", line 210, in get
return super().get(key) # type: ignore[return-value]
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\data_context\\store\\store.py\", line 207, in get
value = self._store_backend.get(self.key_to_tuple(key))
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\data_context\\store\\_store_backend.py\", line 123, in get
value = self._get(key, **kwargs)
File \"C:\\...\\dwh-greatexpectations\\venv\\Lib\\site-packages\\great_expectations\\data_context\\store\\tuple_store_backend.py\", line 321, in _get
contents: str = infile.read().rstrip(\"\\n\")
File \"C:\\Python311\\Lib\\encodings\\cp1251.py\", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 497: character maps to <undefined>"
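The traceback above ends in encodings\cp1251.py, which suggests the file was opened in text mode without an explicit encoding, so Python fell back to the locale encoding. PYTHONIOENCODING only affects stdin, stdout, and stderr. It does not change the default used by open(). The following is a minimal diagnostic sketch, not part of Great Expectations. The helper name describe_encodings is hypothetical and only collects the relevant settings for comparison.

```python
import locale
import os
import sys


def describe_encodings() -> dict:
    """Collect the encoding-related settings relevant to this issue.

    Hypothetical helper for illustration; not a Great Expectations API.
    """
    return {
        # The variable checked in the thread: it only controls the
        # encoding of stdin/stdout/stderr, not of files opened with open().
        "PYTHONIOENCODING": os.getenv("PYTHONIOENCODING"),
        # The encoding open() uses in text mode when encoding= is omitted.
        "open() default": locale.getpreferredencoding(False),
        # True when Python UTF-8 mode is enabled (PYTHONUTF8=1 or -X utf8).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding of the standard output stream, if it exposes one.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value}")
```

If "open() default" reports cp1251 while PYTHONIOENCODING reports UTF-8, that would be consistent with the error above. Enabling Python's UTF-8 mode (PYTHONUTF8=1, available since Python 3.7) changes the open() default to UTF-8 and may be worth trying in this environment.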
@IgorShcherbakov
AFAIK, there's nowhere in our codebase where we use something other than utf-8 for encoding.
I think there's something going on with your particular environment.
If you can provide a minimal reproducible example, I'd be happy to run it and could debug from there.
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 375: character maps to <undefined>
It is not possible to explicitly select the encoding! When files in the "expectations" folder are opened, for example test.json, the system default encoding is used. For example, with open("gx/expectations/test.json", "r") as file: ... will use encoding='cp1251', but I would like to use utf-8.
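The behaviour described above can be reproduced outside Great Expectations. The sketch below is an illustration under stated assumptions: the temporary test.json and the helper read_suite are hypothetical stand-ins, and cp1251 is passed explicitly to mimic what a Windows machine with that locale would pick by default. The Cyrillic letter U+0418 is used because its UTF-8 encoding contains the byte 0x98, which cp1251 leaves undefined.

```python
import json
import tempfile
from pathlib import Path


def read_suite(path: Path, encoding: str) -> dict:
    """Read a JSON file in text mode with an explicit encoding.

    Hypothetical helper for illustration; not a Great Expectations API.
    """
    with open(path, "r", encoding=encoding) as infile:
        return json.loads(infile.read())


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a suite file such as gx/expectations/test.json.
    suite_path = Path(tmp) / "test.json"
    payload = json.dumps({"notes": "\u0418"}, ensure_ascii=False)
    # Write the file as UTF-8; U+0418 becomes the bytes D0 98.
    suite_path.write_bytes(payload.encode("utf-8"))

    # Reading with cp1251 mimics the locale default on the reporter's
    # machine and fails on byte 0x98.
    try:
        read_suite(suite_path, "cp1251")
        cp1251_failed = False
    except UnicodeDecodeError as exc:
        cp1251_failed = True
        print(f"cp1251 read failed: {exc}")

    # Reading with an explicit utf-8 encoding succeeds.
    suite = read_suite(suite_path, "utf-8")
    print(suite)
```

Passing encoding="utf-8" explicitly makes the read independent of the machine's locale, which is the behaviour requested here.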