Closed dleehr closed 10 years ago
They should all use UTF8. Otherwise it's just too easy to run into all kinds of character encoding conversion issues.
Agreed. The latin1
encodings (and matching collation settings) are legacy stuff from @pdpolly 's early work on the database. I'll see what I can do about moving everything over to utf8
without mangling the existing text.
As it turns out, managing character sets and collation in MySQL is a nightmare. But now I think I have the database, servers, and HTML metatags all in agreement about using UTF-8. Your grep
test above comes up empty in the latest data dump (note the new filename):
# grep -i "charset=latin" FossilCalibration_full_UTF8.sql
If you instead grep for latin
, you'll see lots of chatter, but I believe these are no-ops that will not affect the data on import.
If the new database works well for you, let me know and I'll generate the skel
and other versions in the same way.
If the new database works well for you, let me know and I'll generate the skel and other versions in the same way.
Update: I've successfully restored the previous dump on our new dev server. I did need to reset the database's default collation to keep everything consistent:
ALTER DATABASE FossilCalibration CHARACTER SET utf8 COLLATE utf8_general_ci ;
The next dump will have this preset. We're still making minor changes to the data model, so I'll wait on the next skeleton database until things have settled down.
Is this completed, then?
I believe so, I just try not to close issues that I didn't create. @dleehr, are you still seeing any mangled characters?
Yeah this is good now
While doing some API work I noticed that some tables use latin1, but others are utf8. Views that join the two result in mangled character encodings (at least in my queries)