MushroomObserver / mushroom-observer

A website for sharing observations of mushrooms.
https://mushroomobserver.org
MIT License

Emojis in Comments #2077

Open JoeCohen opened 4 months ago

JoeCohen commented 4 months ago

Emojis throw errors when creating Comments. The code block below shows what happened locally when I clicked Create after putting an emoji in a Summary. The same thing happens for emojis in the Comment body. Not sure what to do about this:

Request Parameters:

{"utf8"=>"✓", "authenticity_token"=>"[FILTERED]", "comment"=>{"summary"=>"asdf🥸", "comment"=>""}, "commit"=>"Create", "q"=>"1oaSB", "target"=>"547363", "type"=>"Observation"}

JoeCohen commented 3 months ago

@mo-nathan suggests migrating the entire database to utf8mb4 and standardizing the collation on utf8mb4_0900_ai_ci, which are the current Rails standards. See Slack General Discussion
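
For context on why utf8mb4 is the suggested fix: MySQL's older utf8 (utf8mb3) character set stores at most 3 bytes per character, while most emoji take 4 bytes in UTF-8. Presumably that is what trips the insert here, though the thread does not show the underlying database error. A minimal Ruby sketch of the byte counts (`fits_utf8mb3?` is a hypothetical helper for illustration, not part of the MO codebase):

```ruby
# Hypothetical helper: true if every character of the string needs at most
# 3 bytes in UTF-8, i.e. it would fit in a MySQL utf8mb3 column.
def fits_utf8mb3?(text)
  text.each_char.all? { |char| char.bytesize <= 3 }
end

puts "✓".bytesize             # => 3 (the check mark in the utf8 param fits)
puts "🥸".bytesize            # => 4 (the emoji from the Summary does not)
puts fits_utf8mb3?("asdf")    # => true
puts fits_utf8mb3?("asdf🥸")  # => false
```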

mo-nathan commented 3 months ago

I looked a bit more at this. I don't think there's direct support for it in the Rails migration framework, but I think it just needs to run some direct SQL, per this page: https://stackoverflow.com/questions/8906813/how-to-change-the-default-charset-of-a-mysql-table.

It should probably be written as a migration that has an up and a down and runs the appropriate ALTER TABLE queries. It's possible that this has to be a one-way migration, since it might throw an error trying to switch from utf8mb4 back to utf8mb3.

JoeCohen commented 3 months ago

Before I read the above comment, Copilot suggested:

def self.up
  execute "ALTER DATABASE `#{ActiveRecord::Base.connection.current_database}` CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;"
end

But:

  1. Per @mo-nathan's comment above, the migration should instead be on a per-table basis, affecting only the tables that are not already on utf8mb4 with the utf8mb4_0900_ai_ci collation.
  2. Do we have to use CONVERT TO CHARACTER SET instead of CHARACTER SET?

mo-nathan commented 3 months ago

Yes, I think we need "CONVERT TO CHARACTER SET". The point is that we are changing the tables, not the database (ALTER TABLE vs. ALTER DATABASE). The Copilot suggestion would only change the default for new tables, which are already getting created correctly with the current Rails default of utf8mb4/utf8mb4_0900_ai_ci.
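
For the record, a rough sketch of what the per-table approach could look like. MySQL spells the clause `CONVERT TO CHARACTER SET`. The helper name `convert_statements`, the migration class name, and the `[7.1]` migration version are all assumptions for illustration; this has not been run against the MO database.

```ruby
# Sketch only: builds one ALTER TABLE statement per table.
# Constant and method names are hypothetical.
TARGET_CHARSET = "utf8mb4"
TARGET_COLLATION = "utf8mb4_0900_ai_ci"

def convert_statements(table_names, charset: TARGET_CHARSET,
                       collation: TARGET_COLLATION)
  table_names.map do |table|
    "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET #{charset} " \
      "COLLATE #{collation}"
  end
end

# Inside a Rails migration this might be wired up roughly as follows
# (assumed class name and version; one-way, per the discussion above):
#
#   class ConvertTablesToUtf8mb4 < ActiveRecord::Migration[7.1]
#     def up
#       convert_statements(connection.tables).each { |sql| execute(sql) }
#     end
#
#     def down
#       raise ActiveRecord::IrreversibleMigration
#     end
#   end

puts convert_statements(%w[comments observations])
```

Passing every table is the simplest version; to touch only the tables that still need it, the list could be filtered first, for example by checking each table's collation in `information_schema.TABLES`.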

JoeCohen commented 3 months ago

Thanks! Maybe I'll deal with this later. ("later" = some indefinite future date.) It's really low priority for me. But at least we have a record of how to do it.