sol1 / icingaweb2-module-netbox

Netbox importer for director, and integration with netbox
28 stars 6 forks source link

Error importing VM with Emoji in comments to db with utf8 character set #40

Closed yaiqsa closed 3 weeks ago

yaiqsa commented 2 months ago

Hey,

First of all: Cool module! Amazing work πŸ‘πŸ»

While playing with an Netbox Import Source in Icinga I came across the following error:

This Import Source failed when last checked at 2024-07-05 15:30:43: SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF0\x9F\x98\x8A\x0D\x0A...' for column `icingadirector`.`imported_property`.`property_value` at row 1, query was: INSERT INTO imported_property (checksum, property_name, property_value, format) VALUES (0x9b33a06376a6e31babe4f1438b5768ae759686bd, ?, ?, ?)

After a lot of debugging, I found out this error is thrown when importing a VM that has an emoji ( 😊 to be precise) in its comments.

After a a bit more searching, I found that my table (and column) were still using character set utf8, instead of utf8mb4. This page describes the problem & possible solutions nicely: https://mathiasbynens.be/notes/mysql-utf8mb4

Is it realistic that I'm not the only one with a utf8 database, and emoji in their Netbox object descriptions? If so, is it worthwhile mentioning something about it/fixing it? (Like replacing offending characters during import, if the wrong character class is detected?)

In any case, people looking for the same error as I had might land on this issue and find their problem πŸ˜…

My environment:

Netbox v4.0.6

Icinga Web 2 Version 2.12.1
director           1.11.0
netbox             3.6.1.2

The Director db recourse is a MariaDB database,
with Character Set utf8mb4 set in icinga, but utf8 for most tables in the db.
sol1-matt commented 1 month ago

Hi @yaiqsa,

Thanks for the info.

That errror is coming from Director itself rather than the Netbox import module. The Netbox import module just turns the Netbox data into a format Director can parse, along with a bunch a sanitization. So potentially we could sanitize every field coming in so utf8 is supported.

While I can't think of a specific instance doing this would create a problem, why would anybody need a emoji in Icinga, I'm not omniscient and think it would be better to leave the solution to Director or converting your db from utf8 to utf8mb4 (which I think is director's solution). Trying to implement a solution in the Netbox import module could open a can of worms.

For reference a possible way to implement this in any import module would be a option that enables the sanatization using a checkbox which removes UTF-8-encoded symbols that take up four bytes.

A quick look turns up this

preg_replace('/[\x{10000}-\x{10FFFF}]/u', "\xEF\xBF\xBD", $value);
yaiqsa commented 3 weeks ago

Thank you for your reply. I think you are right, and my problem is mostly caused by a misconfiguration.

I'll close the issue, because I agree that this isn't necessarily something this module itself should fix. (Unless more people stumble across it in the future)