marcospereirampj / python-keycloak

MIT License

Problem with character encoding (umlauts) when creating users #507

Closed by citroid 9 months ago

citroid commented 9 months ago

I try to create users in Keycloak based on data from a CSV export. Basically this works fine with one exception: some special characters (umlauts) in firstnames and lastnames are not properly displayed in Keycloak when I view the imported users.

[Screenshot: Keycloak_Users]

My CSV file is encoded in UTF-8 and contains username, lastname, firstname and email of the users. Example CSV:

"user_id";"lst_name";"fst_name";"email"
mmuestermann;Müstermann;Mäx;maex.muestermann@example.com

This is my code:

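import csv

from keycloak import KeycloakAdmin, KeycloakOpenIDConnection
from keycloak.exceptions import KeycloakPostError

# Retrieve data from user CSV file
user_file = open(import_file_name, "r")
user_csv_data = list(csv.reader(user_file, delimiter=";"))
user_file.close()

# Remove first line containing CSV header
user_csv_data.pop(0)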

keycloak_connection = KeycloakOpenIDConnection(
                        server_url=kc_server_url,
                        username=kc_username,
                        password=kc_password,
                        realm_name=kc_realm_name,
                        user_realm_name="master",
                        client_id="admin-cli",
                        # custom_headers={"Content-Type": "application/json; charset=utf-8"},
                        verify=True)

keycloak_admin = KeycloakAdmin(connection=keycloak_connection)

for user in user_csv_data:

    username = user[0].strip()
    lastname = user[1].strip()
    firstname = user[2].strip()
    email = user[3].strip()
    new_user_id = -1
    try:
        new_user_id = keycloak_admin.create_user({"email": email,
                                                  "username": username,
                                                  "enabled": True,
                                                  "firstName": firstname,
                                                  "lastName": lastname},
                                                  exist_ok=False)
        print("Created new user '" + username + "' (Keycloak user ID: " + new_user_id + ").")
    except KeycloakPostError as error:
        print(error)

Apart from explicitly setting the "Content-Type" header to a UTF-8 charset (see the commented-out line above), I also tried converting the firstname and lastname strings from UTF-8 to ISO-8859-1 (with .encode('utf-8').decode('iso-8859-1')), in case Keycloak expects them in ISO-8859-1, and changed the "Content-Type" header accordingly. Unfortunately, this didn't work.

Did I miss something that I have to do to get the umlauts properly recognized, or is this a limitation of the Keycloak API?

Any help would be very much appreciated.
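
For readers following along, here is a minimal sketch of what the attempted conversion does to a string that is already correct. It uses only the standard library, and the variable names are illustrative, not taken from the post. Re-reading UTF-8 bytes as ISO-8859-1 turns each umlaut into two separate characters, so it cannot repair the payload.

```python
import json

correct = "Mäx"

# The conversion tried in the post: encode to UTF-8 bytes, then
# reinterpret those bytes as ISO-8859-1 (one character per byte).
converted = correct.encode("utf-8").decode("iso-8859-1")

# json.dumps escapes non-ASCII characters by default, which makes the
# difference between the two strings easy to see.
print(json.dumps({"firstName": correct}))
print(json.dumps({"firstName": converted}))

# With ensure_ascii=False the characters are emitted unescaped instead.
print(json.dumps({"firstName": correct}, ensure_ascii=False))
```

The first line printed contains a single escape for the umlaut, while the second contains two, because the two UTF-8 bytes of "ä" have become two distinct code points.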

ryshoooo commented 9 months ago

Sorry, I'm unable to reproduce this. Following your example, I got the user created with all the correct characters.

I'd suggest double-checking that the create_user payload does not contain any unexpected characters. Then check the output of json.dumps(payload), where payload is the data passed to the create_user method. It should look something like this:

>>> json.dumps({"email": "maex.muestermann@example.com", "username": "mmuestermann", "enabled": True, "firstName": "Mäx", "lastName": "Müstermann"})
'{"email": "maex.muestermann@example.com", "username": "mmuestermann", "enabled": true, "firstName": "M\\u00e4x", "lastName": "M\\u00fcstermann"}'

[Screenshot: 2023-11-21 at 15 06 23]

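The payload check suggested above could also be done programmatically. The following is a rough heuristic sketch, not part of python-keycloak: the helper name looks_mis_decoded is hypothetical, and it assumes the text was UTF-8 that got decoded as cp1252 (a common Windows default). It flags a string if re-encoding it as cp1252 yields bytes that form valid, different UTF-8 text.

```python
def looks_mis_decoded(value):
    """Heuristic: True if value looks like UTF-8 bytes decoded as cp1252."""
    try:
        repaired = value.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the string has characters outside cp1252, or its bytes are
        # not valid UTF-8. In both cases it is probably not this kind of damage.
        return False
    return repaired != value


# Hypothetical payload: firstName is damaged, lastName is fine.
payload = {
    "username": "mmuestermann",
    "enabled": True,
    "firstName": "M\u00c3\u00a4x",
    "lastName": "Müstermann",
}

suspects = [
    key for key, value in payload.items()
    if isinstance(value, str) and looks_mis_decoded(value)
]
print(suspects)
```

This is only a heuristic: legitimate text can occasionally match the pattern, and other wrong encodings would not be caught.
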
citroid commented 9 months ago

Thank you for the very quick response!

I just checked your suggestion, and the output for my payload did indeed differ:

{"email": "maex.muestermann@example.com", "username": "mmuestermann", "enabled": true, "firstName": "M\u00c3\u00a4x", "lastName": "M\u00c3\u00bcstermann"}

It turned out that the cause was a missing encoding argument in the call to open(), so the UTF-8 file was being read with a different default encoding. The corrected line is:

user_file = open(import_file_name, "r", encoding="utf-8")

Now the output is identical to the one you posted:

{"email": "maex.muestermann@example.com", "username": "mmuestermann", "enabled": true, "firstName": "M\u00e4x", "lastName": "M\u00fcstermann"}

Sorry for the false alarm and, again, thank you for your help!
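
For anyone landing here with the same symptom, the fix from this thread can be sketched as a small helper. The function name read_users and the temporary sample file are illustrative assumptions, not code from the original post. Passing encoding="utf-8" avoids depending on the platform's default encoding, and newline="" follows the recommendation in the csv module documentation.

```python
import csv
import os
import tempfile


def read_users(path):
    """Read the semicolon-delimited export as UTF-8 and drop the header row."""
    with open(path, "r", encoding="utf-8", newline="") as user_file:
        rows = list(csv.reader(user_file, delimiter=";"))
    return rows[1:]


# Write a small sample file mirroring the example CSV from the question.
sample = (
    '"user_id";"lst_name";"fst_name";"email"\n'
    "mmuestermann;Müstermann;Mäx;maex.muestermann@example.com\n"
)
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".csv", delete=False
) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

users = read_users(sample_path)
os.remove(sample_path)

for username, lastname, firstname, email in users:
    print(username, lastname, firstname, email)
```

Using a with block also closes the file if reading raises an exception, which the explicit close() call in the original snippet would not do.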