veekun / pokedex

more than you ever wanted to know about Pokémon
MIT License
1.44k stars 637 forks source link

Charset error when importing to MySQL #232

Closed itsjavi closed 2 years ago

itsjavi commented 6 years ago

Following the instructions of the Wiki, I tried to load the DB into MySQL

bin/pokedex load -e "mysql+pymysql://${MYSQL_USER}:${MYSQL_PASSWORD}@${MYSQL_HOST}/${MYSQL_DATABASE}"

But I get this error: UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-5: ordinal not in range(256)

move_battle_style_prose...                                           ok
conquest_move_effect_prose...                                        ok
Traceback (most recent call last):
  File "bin/pokedex", line 11, in <module>
    load_entry_point('Pokedex', 'console_scripts', 'pokedex')()
  File "/repo/pokedex/main.py", line 27, in setuptools_entry
    main(*sys.argv)
  File "/repo/pokedex/main.py", line 23, in main
    args.func(parser, args)
  File "/repo/pokedex/main.py", line 249, in command_load
    langs=langs,
  File "/repo/pokedex/db/load.py", line 317, in load
    insert_and_commit()
  File "/repo/pokedex/db/load.py", line 257, in insert_and_commit
    session.execute(insert_stmt, new_rows)
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/orm/scoping.py", line 153, in do
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1176, in execute
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 948, in execute
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1200, in _execute_context
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1416, in _handle_dbapi_exception
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1170, in _execute_context
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/dialects/mysql/mysqldb.py", line 105, in do_executemany
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 192, in executemany
    self._get_db().encoding)
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 208, in _do_execute_many
    v = values % escape(next(args), conn)
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 121, in _escape_args
    (key, val) in args.items())
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 121, in <genexpr>
    (key, val) in args.items())
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 106, in _ensure_bytes
    x = x.encode(encoding)
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-5: ordinal not in range(256)

Any ideas? It seems that the used MySQL connection used from this project is not UTF-8?

itsjavi commented 6 years ago

I changed the mysql configuration to set the default charset to:

[client]
default-character-set=utf8

[mysql]
default-character-set=utf8

[mysqld]
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8

but now I get a different error in move_flag_prose: "Incorrect string value: '\\xFChrt d...' for column 'description' at row 1") [SQL: u'INSERT INTO move_flag_prose (move_flag_id, local_language_id, name, description) VALUES


pokemon_habitats...                                                  ok
regions...                                                           ok
super_contest_effects...                                             ok
conquest_move_ranges...                                              ok
item_fling_effects...                                                ok
Traceback (most recent call last):
  File "bin/pokedex", line 11, in <module>
    load_entry_point('Pokedex', 'console_scripts', 'pokedex')()
  File "/repo/pokedex/main.py", line 27, in setuptools_entry
    main(*sys.argv)
  File "/repo/pokedex/main.py", line 23, in main
    args.func(parser, args)
  File "/repo/pokedex/main.py", line 249, in command_load
    langs=langs,
  File "/repo/pokedex/db/load.py", line 317, in load
    insert_and_commit()
  File "/repo/pokedex/db/load.py", line 257, in insert_and_commit
    session.execute(insert_stmt, new_rows)
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/orm/scoping.py", line 153, in do
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/orm/session.py", line 1176, in execute
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 948, in execute
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1200, in _execute_context
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/util/compat.py", line 203, in raise_from_cause
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/engine/base.py", line 1170, in _execute_context
  File "/repo/lib/python2.7/site-packages/SQLAlchemy-1.2.5-py2.7-linux-x86_64.egg/sqlalchemy/dialects/mysql/mysqldb.py", line 105, in do_executemany
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 192, in executemany
    self._get_db().encoding)
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 229, in _do_execute_many
    rows += self.execute(sql + postfix)
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 165, in execute
    result = self._query(query)
  File "/repo/lib/python2.7/site-packages/pymysql/cursors.py", line 321, in _query
    conn.query(q)
  File "/repo/lib/python2.7/site-packages/pymysql/connections.py", line 860, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/repo/lib/python2.7/site-packages/pymysql/connections.py", line 1061, in _read_query_result
    result.read()
  File "/repo/lib/python2.7/site-packages/pymysql/connections.py", line 1349, in read
    first_packet = self.connection._read_packet()
  File "/repo/lib/python2.7/site-packages/pymysql/connections.py", line 1018, in _read_packet
    packet.check_error()
  File "/repo/lib/python2.7/site-packages/pymysql/connections.py", line 384, in check_error
    err.raise_mysql_exception(self._data)
  File "/repo/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
    raise errorclass(errno, errval)
sqlalchemy.exc.InternalError: (pymysql.err.InternalError) (1366, u"Incorrect string value: '\\xFChrt d...' for column 'description' at row 1") [SQL: u'INSERT INTO move_flag_prose (move_flag_id, local_language_id, name, description) VALUES (%(move_flag_id)s, %(local_language_id)s, %(name)s, %(description)s)'] [parameters: ({'local_language_id': u'6', 'move_flag_id': u'1', 'description': u'Der Anwender ber\xfchrt das Ziel.  Dies l\xf6st mahcne F\xe4higkeiten aus (z.B. []{ability:static}) und Gegenst\xe4nde (z.B. []{item:sticky-barb}).', 'name': u'Kontakt'}, {'local_language_id': u'9', 'move_flag_id': u'1', 'description': u'User touches the target.  This triggers some abilities (e.g., []{ability:static}) and items (e.g., []{item:sticky-barb}).', 'name': u'Makes contact'}, {'local_language_id': u'6', 'move_flag_id': u'2', 'description': u'Diese F\xe4higkeit hat eine Aufladerunde, welche mit einer []{item:power-herb} \xfcbersprungen werden kann.', 'name': u'Hat eine Aufladerunde'}, {'local_language_id': u'9', 'move_flag_id': u'2', 'description': u'This move has a charging turn that can be skipped with a []{item:power-herb}.', 'name': u'Has a charging turn'}, {'local_language_id': u'6', 'move_flag_id': u'3', 'description': u'Die Runde nach dieser F\xe4higkeit wird ben\xf6tigt, damit das Pok\xe9mon sich aufladen kann.', 'name': u'Muss aufladen'}, {'local_language_id': u'9', 'move_flag_id': u'3', 'description': u"The turn after this move is used, the Pok\xe9mon's action is skipped so it can recharge.", 'name': u'Must recharge'}, {'local_language_id': u'6', 'move_flag_id': u'4', 'description': u'Diese F\xe4higkeit funktioniert nicht, wenn das Ziel []{move:detect} oder []{move:protect} diese Runde genutzt hat.', 'name': u'Blockiert von Scanner and Schutzschild'}, {'local_language_id': u'9', 'move_flag_id': u'4', 'description': u'This move will not work if the target has used []{move:detect} or []{move:protect} this turn.', 'name': u'Blocked by Detect and Protect'}  ... displaying 10 of 41 total bound parameter sets ...  {'local_language_id': u'9', 'move_flag_id': u'20', 'description': u'This move is unusable during Sky Battles.', 'name': u'Unusable during Sky Battles'}, {'local_language_id': u'9', 'move_flag_id': u'21', 'description': u'This move triggers []{ability:dancer}.', 'name': u'Dance'})] (Background on this error at: http://sqlalche.me/e/2j85)
move_flag_prose...                                                   
route1rodent commented 6 years ago

I had the same problem recently and I had to fix the CSV manually. I think that CSV files should not be changed directly because people can make these mistakes. It is the DB that should dump the CSV and updates to the DB should come in form of SQL scripts (migrations) instead of changing the CSV files.... but that's another topic.

Bingbazaar commented 6 years ago

I found a solution that fixed the above error for me:

pokedex load -e mysql+pymysql://username:password@localhost/database?charset=utf8

Yay!

rluzuriaga commented 4 years ago

If anyone has this issue and the above comments didn't solve it, here are the steps I just used to solve it. Also see #192

Enter the mysql terminal and enter:

SELECT default_character_set_name FROM information_schema.SCHEMATA S WHERE schema_name = "DBNAME";

Change DBNAME to the database name you had created. This was my output:

+----------------------------+
| default_character_set_name |
+----------------------------+
| latin1                     |
+----------------------------+
1 row in set (0.00 sec)

Now enter this command changing DBNAME to your database name again:

ALTER DATABASE DBNAME CHARACTER SET utf8 COLLATE utf8_general_ci;

After I did that I was able to load using pokedex load -e mysql+pymysql://admin:password@localhost/database