At the moment, nearly everything in the database seems to be latin-1. This of course causes problems with any character outside that set, and I'm having to work around it in #113 in ways that will probably break compatibility with any non-MySQL engines.
I think the best option is to handle this at the application level by specifying the character set for the engine like this, which I think will then apply everywhere? Other options (which I'm not 100% sure will work) are specifying the character set when a new database is created, or setting the character set in a my.cnf file for our deployment.
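As a rough illustration of the application-level option, here is a minimal sketch that assumes the engine is a SQLAlchemy engine talking to MySQL through the pymysql driver (both are assumptions on my part, as are the helper name `build_mysql_url` and the credentials). It only builds the connection URL; the `charset` query parameter is what asks the driver to use utf8mb4 on each connection.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, database,
                    charset="utf8mb4", driver="pymysql"):
    """Return a SQLAlchemy-style MySQL URL that requests `charset`.

    Hypothetical helper: the name, the default driver, and the example
    credentials below are illustrative, not taken from this project.
    """
    return (
        f"mysql+{driver}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?charset={charset}"
    )


url = build_mysql_url("app", "s3cret", "localhost", "appdb")
print(url)

# With SQLAlchemy installed, the URL would then be handed over like so:
#   from sqlalchemy import create_engine
#   engine = create_engine(url)
```

As far as I can tell this controls the encoding used on the connection; tables that already exist keep whatever character set they were created with, so it probably does not replace the migration work on its own.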
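NB that MySQL's default implementation of utf8 is broken: it only allows up to three bytes per character, so it cannot store anything outside the Basic Multilingual Plane (most emoji, for example). MySQL therefore added a separate character set, utf8mb4, to handle real UTF-8. I can't figure out how well-supported that is in other engines.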
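To make the three-byte limit concrete, this small sketch measures how many bytes a few sample characters take in UTF-8 and checks which of them would fit in MySQL's three-byte utf8. The sample characters and the helper name `fits_in_three_bytes` are just illustrations.

```python
# Sample characters covering each UTF-8 encoded length from 1 to 4 bytes.
samples = ["A", "é", "€", "😀"]

# Map each character to the number of bytes its UTF-8 encoding needs.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}


def fits_in_three_bytes(text):
    """True if every character in `text` encodes to at most 3 UTF-8 bytes."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


for ch, size in byte_lengths.items():
    print(ch, size, "ok in 3-byte utf8" if fits_in_three_bytes(ch) else "needs utf8mb4")
```

The first three characters fit in three bytes or fewer; the emoji needs four, which is the case utf8mb4 exists for.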
The catch is that this will require quite a bit of DB migration. In particular, any fields that currently run up against their byte limits (or indeed use more than a quarter of them, since a utf8mb4 character can take up to four bytes where a latin-1 character takes one) might need to have their sizes changed so that the same number of bigger characters will fit.
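The arithmetic behind the "quarter" remark can be sketched as follows. The 767-byte figure is only an example (it is the index key prefix limit for older InnoDB row formats, and I haven't checked which limits actually apply to our tables), and the function names are hypothetical.

```python
def max_chars(byte_limit, bytes_per_char):
    """How many characters are guaranteed to fit within `byte_limit` bytes."""
    return byte_limit // bytes_per_char


def needs_resize(length_in_chars, byte_limit, bytes_per_char=4):
    """True if a field of `length_in_chars` could exceed `byte_limit`
    once every character may take up to `bytes_per_char` bytes."""
    return length_in_chars * bytes_per_char > byte_limit


EXAMPLE_LIMIT = 767  # illustrative byte limit, see the note above

latin1_capacity = max_chars(EXAMPLE_LIMIT, 1)   # one byte per character
utf8mb4_capacity = max_chars(EXAMPLE_LIMIT, 4)  # up to four bytes per character

print(latin1_capacity, utf8mb4_capacity)
print(needs_resize(255, EXAMPLE_LIMIT))  # a 255-character field under utf8mb4
```

Under these assumptions a 255-character field that was comfortable in latin-1 no longer fits in the worst case, while anything at or below 191 characters still does.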