clear-code / redmine_full_text_search

Full text search for Redmine
MIT License

full_text_search:synchronize failed when using emoji #76

Closed y503unavailable closed 4 years ago

y503unavailable commented 4 years ago

In my environment, an issue whose description contains an emoji is not saved to fts_targets.

When the description contains no emoji, synchronize finishes normally and the issue can be searched. When the description is saved with an emoji, the issue cannot be found by search.

I entered the following character code in the Redmine description field.

🐳 (Whale) 🐳

Are there any restrictions or settings when using emoji?

Environment: CentOS 7.7 / MariaDB 10.4 / Redmine 4.0.5 devel

MariaDB is set to utf8mb4.

case 1. Synchronize

RAILS_ENV=production bin/rails full_text_search:synchronize
l FullTextSearch::Target:All [= ] 14051/112601(12%) 28m59s 56.69/s 4m 8s
mqq Journal:New [===========================] 4133/4133(100%) 0s 57.36/s 1m 0s
..
mqq Issue:New [======= ] 9918/36170(27%) 8m15s 56.69/s 3m 7s
rails aborted!
ActiveRecord::StatementInvalid: Mysql2::Error: Incorrect string value: '\xF0\x9F\x90\xB3 a...' for column redmine.fts_targets.content at row 1:
INSERT INTO fts_targets (source_id, source_type_id, project_id, is_private, last_modified_at, title, content, tag_ids) VALUES (80086, 2, 21, FALSE, '2018-12-16 00:40:12', 'redmine cannot deal with 4-byte utf-8 characters', 'When entering an issue description or a note, all text after a 4-byte utf-8 character seems to get truncated.\n\nTo reproduce just create a new issue\nin the description enter some text and a 4-byte utf-8 such a ? 梶nd

case 2. Issue description edit/save

No error is displayed, but the following log is recorded in production.log.

[ActiveJob] [FullTextSearch::UpsertTargetJob] [dda2755c-5a71-4e33-9d47-7922e0f644c8] Error performing FullTextSearch::UpsertTargetJob (Job ID: dda2755c-5a71-4e33-9d47-7922e0f644c8) from Async(full_text_search) in 85.86ms: ActiveRecord::StatementInvalid (Mysql2::Error: Incorrect string value: '\xF0\x9F\x90\xB3' for column redmine.fts_targets.content at row 1:
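The byte sequence `\xF0\x9F\x90\xB3` in both errors is the 4-byte UTF-8 encoding of the whale emoji (U+1F433). A minimal Ruby sketch of that relationship follows; the helper names `utf8_bytes_hex` and `needs_utf8mb4?` are hypothetical and are not part of Redmine or the plugin. It only illustrates that such characters need more than 3 bytes, which is more than MySQL/MariaDB's 3-byte `utf8` character set can hold.

```ruby
# frozen_string_literal: true

# The whale emoji from the report, written as a code point escape.
WHALE = "\u{1F433}"

# Hypothetical helper: render a string's bytes in the same
# \xNN style that appears in the Mysql2 error message.
def utf8_bytes_hex(str)
  str.bytes.map { |b| format('\\x%02X', b) }.join
end

# Hypothetical helper: true when any character takes more than
# 3 bytes in UTF-8, i.e. it does not fit a 3-byte utf8 charset.
def needs_utf8mb4?(text)
  text.each_char.any? { |ch| ch.bytesize > 3 }
end

puts utf8_bytes_hex(WHALE)
puts needs_utf8mb4?(WHALE)
puts needs_utf8mb4?('plain text')
```

Running a check like `needs_utf8mb4?` over a description is one way to tell whether a given issue is likely to hit this error when some part of the connection is not using utf8mb4.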

y503unavailable commented 4 years ago

Environment

CentOS 7.7 / MariaDB 10.4 / Redmine 4.0.5 devel

Full Text Search plugin 1.0.4

yum list

MariaDB-client.x86_64            10.4.8-1.el7.centos   @mariadb-main
MariaDB-common.x86_64            10.4.8-1.el7.centos   @mariadb-main
MariaDB-compat.x86_64            10.4.8-1.el7.centos   @mariadb-main
MariaDB-devel.x86_64             10.4.10-1.el7.centos  @mariadb-main
MariaDB-server.x86_64            10.4.8-1.el7.centos   @mariadb-main
MariaDB-shared.x86_64            10.4.8-1.el7.centos   @mariadb-main
mariadb-10.4-mroonga.x86_64      9.09-1.el7            @groonga-centos
roonga-libs.x86_64               9.0.9-1.el7           @groonga-centos
groonga-normalizer-mysql.x86_64  1.1.3-1.el7           @groonga-centos
groonga-release.noarch           1.5.2-1               @/groonga-release-latest.noarchDlcbad

MariaDB [redmine]> show variables like "char%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.001 sec)

MariaDB [redmine]> show variables like 'coll%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database   | utf8mb4_general_ci |
| collation_server     | utf8mb4_general_ci |
+----------------------+--------------------+
3 rows in set (0.001 sec)

MariaDB [redmine]> show create database redmine;
+----------+---------------------------------------------------------------------+
| Database | Create Database                                                     |
+----------+---------------------------------------------------------------------+
| redmine  | CREATE DATABASE `redmine` /*!40100 DEFAULT CHARACTER SET utf8mb4 */ |
+----------+---------------------------------------------------------------------+

show create table issues;
..
) ENGINE=InnoDB AUTO_INCREMENT=107330 DEFAULT CHARSET=utf8mb4 |

kou commented 4 years ago

Do you have encoding: utf8mb4 in your config/database.yml?

y503unavailable commented 4 years ago

Thank you for the advice.

As you pointed out, after changing encoding to utf8mb4 in Redmine's database.yml, we were able to successfully search for issues containing emoji.

Thank you for publishing great software.
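The fix confirmed in this thread is setting `encoding: utf8mb4` in config/database.yml. As a rough sketch, the setting can be checked with a few lines of Ruby. The YAML below is a hypothetical sample (the adapter, host, and user values are assumptions, not taken from the reporter's file), and the helpers `connection_encoding` and `emoji_safe?` are illustrative names only. A real database.yml that uses ERB or YAML aliases would need extra handling that this sketch does not attempt.

```ruby
# frozen_string_literal: true

require 'yaml'

# Hypothetical database.yml contents; only the encoding line
# reflects what the thread says was changed.
SAMPLE_DATABASE_YML = <<~YAML
  production:
    adapter: mysql2
    database: redmine
    host: localhost
    username: redmine
    encoding: utf8mb4
YAML

# Return the encoding configured for the given environment,
# or nil when the environment or the key is missing.
def connection_encoding(yaml_text, env)
  config = YAML.safe_load(yaml_text) || {}
  config.dig(env, 'encoding')
end

# True only when the connection is configured for utf8mb4.
def emoji_safe?(yaml_text, env)
  connection_encoding(yaml_text, env) == 'utf8mb4'
end

puts connection_encoding(SAMPLE_DATABASE_YML, 'production')
puts emoji_safe?(SAMPLE_DATABASE_YML, 'production')
```

The point of the check is that the server variables and table definitions shown earlier were already utf8mb4; the missing piece was the encoding the Rails connection itself requested.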