localprojects / Change-By-Us

A new way to share ideas, do projects, and make our cities better
changeby.us

MySQL Database Character Set and Collation #23

Closed cvee closed 12 years ago

cvee commented 12 years ago

Creating a new MySQL database results in tables having the character set latin1 and collation latin1_swedish_ci. To support Unicode, the database tables should have a default character set of utf8 and a collation of utf8_unicode_ci.

cybertoast commented 12 years ago

This could also be done system-wide, I guess:

[mysqld]
character-set-server=utf8
collation-server=utf8_unicode_ci

But I get your point about scripts/generate_models.sh.template and have committed the change to develop.

cvee commented 12 years ago

If sql/models.sql is used (as specified in the install instructions) to create the initial database schema, tables will continue to be set to latin1 because each CREATE TABLE block has that character set assigned.

I went through the files and replaced all references to latin1 with utf8 and added the collation. Let me know if the following commit makes sense and I'll issue a pull request:

https://github.com/iStrategyLabs/Change-By-Us/commit/626a860cf4009833dce775bc81dbd2921af47eef
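As a rough way to see which tables in a schema file carry their own character set, a short Python sketch could scan each CREATE TABLE statement. The function name `explicit_charsets` and the sample schema are hypothetical and not taken from the CBU repository, and the regular expressions assume mysqldump-style statements with no semicolons inside string literals.

```python
import re

# One CREATE TABLE statement: captures the table name and everything up to ';'.
_CREATE_TABLE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?`?(\w+)`?([^;]*);",
    re.IGNORECASE,
)
# A charset clause such as "DEFAULT CHARSET=latin1" or "CHARACTER SET utf8".
_CHARSET = re.compile(r"(?:CHARSET|CHARACTER\s+SET)\s*=?\s*(\w+)", re.IGNORECASE)


def explicit_charsets(sql: str) -> dict:
    """Map each table name to the character sets named in its CREATE TABLE."""
    return {
        m.group(1): {c.lower() for c in _CHARSET.findall(m.group(2))}
        for m in _CREATE_TABLE.finditer(sql)
    }


# Hypothetical two-table schema, for illustration only.
sample_schema = (
    "CREATE TABLE `user` (\n  `name` varchar(255)\n) ENGINE=InnoDB DEFAULT CHARSET=latin1;\n"
    "CREATE TABLE idea (\n  id int\n) ENGINE=InnoDB;\n"
)
charsets_by_table = explicit_charsets(sample_schema)
```

A table that maps to an empty set has no explicit character set and would take the database default.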

cybertoast commented 12 years ago

Hi Chris, my suggestion would be to remove the character set and collation completely and rely on system-wide configuration for this. Once the database is created with a default character set and collation, all tables under it should have the same properties. It would be easier in case someone wishes to use a different encoding: they wouldn't have to make as many changes.

Just to be clear: "The database character set and collation are used as default values for table definitions if the table character set and collation are not specified in CREATE TABLE statements."

cvee commented 12 years ago

My concern with that is that it requires someone to change their database server globally. It's an additional hurdle to deploying CBU (requiring someone to be familiar with MySQL configuration) and could potentially affect other databases running on that server. Embedding the settings within the CBU source ensures the platform is configured correctly without affecting other systems running on the same server.

cybertoast commented 12 years ago

Sorry, my misstatement. I meant to say database-wide rather than system-wide - i.e.,

CREATE DATABASE GAM2 DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_unicode_ci;

I believe you had already suggested this. What I'm saying is that it's not necessary on every table, since the database character set and collation will be adopted by every table in the database that does not specify its own.

cvee commented 12 years ago

That solution makes sense. There are places in the code that specify latin1, so in addition to assigning the settings database-wide, the code needs to have any occurrences of 'latin1' removed.
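One possible way to remove those occurrences is a small Python filter over the SQL files. This is only a sketch: `strip_latin1` is a hypothetical helper, not part of CBU, and it assumes the clauses appear in the forms mysqldump usually emits (DEFAULT CHARSET=latin1, CHARACTER SET latin1, COLLATE latin1_swedish_ci).

```python
import re

# Table- or column-level clauses that pin latin1, with their leading whitespace.
_LATIN1_CLAUSE = re.compile(
    r"\s+(?:DEFAULT\s+)?(?:CHARSET|CHARACTER\s+SET|COLLATE)\s*=?\s*latin1\w*",
    re.IGNORECASE,
)


def strip_latin1(sql: str) -> str:
    """Return sql with every latin1 charset/collation clause removed."""
    return _LATIN1_CLAUSE.sub("", sql)


# Hypothetical statement, for illustration only.
sample = (
    "CREATE TABLE user (\n"
    "  name varchar(255) CHARACTER SET latin1 DEFAULT NULL\n"
    ") ENGINE=InnoDB DEFAULT CHARSET=latin1;"
)
cleaned = strip_latin1(sample)
print(cleaned)
```

With the clauses gone, each table falls back to the default character set and collation of the database it is created in.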

cybertoast commented 12 years ago

I agree. If you make the change and submit a pull request I'll merge everything in. Or let me know if you want me to make the change.

cybertoast commented 12 years ago

Merged and pushed into develop.