Open Eorlariel opened 1 week ago
One possibility is to use this code snippet to guess the encoding with a certain confidence:

```python
import chardet

with open('example.txt', 'rb') as file:
    result = chardet.detect(file.read())

encoding = result['encoding']
confidence = result['confidence']
print(f"The file is encoded in '{encoding}' with confidence {confidence * 100:.2f}%.")
```
If the confidence is above a threshold, we could take the detected encoding for granted. We could add a command-line flag like `--detect-encoding` to enable this feature.
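The flag could be wired up roughly as follows. This is only a sketch: `--detect-encoding` is the flag proposed above, while the `--detect-threshold` option and its default of 0.8 are assumptions for illustration, not existing `lobster-json` options.

```python
import argparse

def build_parser():
    # Hypothetical CLI wiring for the proposed encoding-detection feature.
    parser = argparse.ArgumentParser(prog="lobster-json")
    parser.add_argument(
        "--detect-encoding",
        action="store_true",
        help="guess the input file encoding (e.g. via chardet) "
             "instead of assuming UTF-8",
    )
    parser.add_argument(
        "--detect-threshold",
        type=float,
        default=0.8,  # hypothetical default confidence threshold
        help="minimum detection confidence to accept a guessed encoding",
    )
    return parser
```

With this wiring, the reader would only consult the detector when `--detect-encoding` is set, and would fall back to the current UTF-8 behaviour when the reported confidence is below the threshold.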
Currently, umlauts such as "ä", "ö", etc. cause failures in `lobster-json` if the JSON file is saved with ISO-8859-1 encoding, because `lobster-json` tries to read it with UTF-8 encoding. `lobster-json` should not fail in these cases, but should also support other encodings. This may be a solution for detecting the encoding: https://www.powershellgallery.com/packages/poshfunctions/2.2.1.1/content/functions/get-fileencoding.ps1
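The failure mode described above can be reproduced with a few lines of standard-library Python, together with one possible remedy: trying a short list of common encodings in order. This is a minimal stdlib-only sketch, not `lobster-json`'s actual reader, and the encoding list is an assumption.

```python
import tempfile
import os

# Simulate the failing input: a JSON file with umlauts saved as ISO-8859-1.
data = '{"name": "Grüße"}'.encode("iso-8859-1")
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".json")
tmp.write(data)
tmp.close()

# Reading with the current UTF-8 assumption raises UnicodeDecodeError.
try:
    with open(tmp.name, encoding="utf-8") as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

def read_with_fallback(path, encodings=("utf-8", "iso-8859-1", "cp1252")):
    """Decode the file with the first encoding that succeeds (sketch)."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the fallback encodings matched")

text, used_encoding = read_with_fallback(tmp.name)
os.unlink(tmp.name)
```

Note that a fallback chain like this is a simpler alternative to confidence-based detection: ISO-8859-1 can decode any byte sequence, so it acts as a last resort, but it can also silently mis-decode files that were actually written in another 8-bit encoding.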
Acceptance Criteria: `lobster-json` doesn't fail when using umlauts ("Umlaute") in test case JSON files in common encodings like: