publiclab / plots2

a collaborative knowledge-exchange platform in Rails; we welcome first-time contributors! :balloon:
https://publiclab.org
GNU General Public License v3.0

Error on new wiki page creation with emoji #2665

Open jywarren opened 6 years ago

jywarren commented 6 years ago

@NiklasJordan said:

Yes, I tried that too ;-) Thanks for creating this page for me.

But to reproduce this issue, I go to https://publiclab.org/wiki/first-contribution

[Screenshot, 2018-04-27 17:07:51]

Then I put my content on it...

[Screenshot, 2018-04-27 17:08:28]

...press "publish" and get this error page:

[Screenshot, 2018-04-27 17:09:10]

I am off for the next two hours, but I'll try to check my phone for your response! Hope this helps to find the bug.


Let's try to reproduce this -- @Gauravano has already. I'll look through logs but we should try to make a functional test that catches this...

Thanks all!

jywarren commented 6 years ago

@Gauravano did any specific text cause this?

jywarren commented 6 years ago

Logs!

Completed 500 Internal Server Error in 33ms (ActiveRecord: 7.8ms)

ActiveRecord::StatementInvalid (Mysql2::Error: Incorrect string value: '\xF0\x9F\x
8E\x89\xF0\x9F...' for column 'body' at row 1: INSERT INTO `node_revisions` (`nid`
, `uid`, `title`, `body`, `teaser`, `log`, `timestamp`, `format`) VALUES (16259, 5
15081, 'first-contribution', '## Whoop! Whoop! Yay... <U+1F389><U+1F389><U+1F389>\
\<br /\\>**Your first contribution on Public Lab**\r\n\\<br /\\>\r\n\\<div class=\
"alert alert-success\"\\>\\<h4\\>Thank you so much for your support and being part
 of our community!\\</h4\\>\\<p\\>As a first time website contributor, your questi
on was held in moderation. In fact, one of our moderators have to approve your con
tribution before it is public. Sorry, for this \"human loop\", in the next hours y
our contribution should be public.\\</p\\>\\</div\\>\r\n\r\n---\r\n\r\n### What ar
e the next steps?\r\n\r\nPlease help us figure out what happened so we can fix it!
\r\n1. Your post will be checked by one of our moderators. This will only take a f
ew hours.\r\n2. Then your post will be public and everyone can see and interact wi
th it.\r\n3. Now everything is done and you can be a real part of the great Public
 Lab community.\r\n\r\n---\r\n\r\n### Be a part of the community!\r\n\r\nHere are 
a few ideas to be a part of these great community:\r\n\r\n- **Join Open Call**\\<b
r /\\>If you\'re interested in talking to others and brainstorming how to start a 
project or where to plug in on something ongoing, this is a great place to start. 
There is one every Tuesday at 7pm GMT.\r\n\r\n- **Post an Issue Brief**\\<br /\\>I
s there an environmental issue you and your group are already thinking about local
ly? Posting an Issue Brief is a good way to get started sharing and engaging other
s on that issue.\r\n\r\n- **Post a question**\\<br /\\>Already have something in m
ind you\'d like to ask and explore with others? The Q&A is a great easy way to get
 started sharing.\r\n\r\n---\r\n\r\n### Ask for help in the Public Lab chatroom\r\
n\r\nCommunity members and staff may be able to help you in real time.\r\n\r\n\\<a
 class=\"btn btn-primary\" href=\"https://chat.publiclab.org/\"\\>Open chatroom\\<
/a\\>', '', '', 1524843859, 1)):
  app/controllers/wiki_controller.rb:187:in `block in update'
  app/controllers/wiki_controller.rb:186:in `update'

grvsachdeva commented 6 years ago

@jywarren only the text used by @NiklasJordan causes this issue; with plain text, no error occurs. From the logs and the text, I guess it's clear that the body is not being processed. Also, the markdown in the content looks bad to me. What do you think @jywarren?

jywarren commented 6 years ago

Maybe if we can get the full text we can try filtering out the sections that are causing trouble...

grvsachdeva commented 6 years ago

ok, I will try this

jywarren commented 6 years ago

@NiklasJordan - can you send us the full text in a Gist using http://gist.github.com so we can try to post it? It's strange but there seem to be characters in the text causing the app to choke... not your fault but we'd like to try to reproduce it to fix it! Thanks for your help with this!

NiklasJordan commented 6 years ago

Hej @jywarren, sorry for the late answer. Sure, I've created the gist here: https://gist.github.com/NiklasJordan/da02086827b6f01c0429bacf8cc5ea96

jywarren commented 6 years ago

awesome, thank you!!!

namangupta01 commented 6 years ago

I figured out why this is happening: it's the emojis. An emoji takes four bytes to store in UTF-8, and MySQL only supports that through its utf8mb4 encoding, which is UTF-8 with up to four bytes per character. The plain utf8 encoding in use by default stores at most three bytes per character, so the insert fails. We can use utf8mb4 for this particular table to support emojis.
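
To make the byte counts concrete, here is a small Ruby sketch. The helper names `utf8_hex` and `fits_in_three_byte_utf8?` are made up for illustration and are not part of plots2. It checks the character that the log shows as `<U+1F389>`:

```ruby
# Illustrative helpers only; neither exists in plots2.

# Render a string's UTF-8 bytes in the \xNN style used by the MySQL error.
def utf8_hex(str)
  str.encode("UTF-8").bytes.map { |b| format("\\x%02X", b) }.join
end

# True when every character needs at most 3 bytes in UTF-8, which is the
# most that MySQL's legacy 3-byte utf8 character set can store.
def fits_in_three_byte_utf8?(str)
  str.encode("UTF-8").each_char.all? { |char| char.bytesize <= 3 }
end

party = "\u{1F389}" # the character logged as <U+1F389>
puts utf8_hex(party)
puts fits_in_three_byte_utf8?(party)
puts fits_in_three_byte_utf8?("caf\u00E9")
```

The first call yields `\xF0\x9F\x8E\x89`, the same bytes that open the "Incorrect string value" in the log, and the character fails the three-byte check, while an accented Latin word passes it.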

jywarren commented 6 years ago

Great investigative work! @icarito - is there any downside to changing the encoding of the revisions table to accommodate emojis?

In the meantime, we could replace the emojis with :smile : (space added so it doesn't actually form an emoji) format strings, which are auto-replaced (due to a recent new feature) with the emoji.

OR we could try to auto-filter emojis and convert them to their :smile : format strings, and not change the table? Pros, cons?
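
A rough sketch of the auto-filter option, in Ruby. Everything named here is hypothetical: the `SHORTCODES` table is a deliberately tiny stand-in (a real mapping would have to come from whatever emoji list the site's `:name:` feature uses), and dropping unmapped characters is just one possible policy.

```ruby
# Hypothetical mapping for illustration; not the site's real emoji list.
SHORTCODES = {
  "\u{1F389}" => ":tada:",
  "\u{1F604}" => ":smile:",
  "\u{1F388}" => ":balloon:"
}.freeze

# Characters above U+FFFF are exactly the ones that need 4 bytes in UTF-8.
FOUR_BYTE_CHAR = /[\u{10000}-\u{10FFFF}]/

# Replace known 4-byte emoji with their shortcode and drop any other
# 4-byte character, so the result fits a 3-byte utf8 column.
def emoji_to_shortcodes(text)
  text.gsub(FOUR_BYTE_CHAR) { |char| SHORTCODES.fetch(char, "") }
end

puts emoji_to_shortcodes("Whoop! Yay... \u{1F389}\u{1F389}")
```

One con this makes visible: anything missing from the mapping is silently lost, and 4-byte characters that are not emoji (some CJK ideographs, for instance) would be dropped too. Changing the table encoding avoids both.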

jywarren commented 6 years ago

@icarito any thoughts here -- thank you!

icarito commented 6 years ago

Sorry I missed this question! Can this be done in a regular migration? We should definitely test this in staging or unstable instances.

jywarren commented 6 years ago

I think it could be. But would you prefer to bypass it by filtering and replacing with the :____: style emoji? Pros/cons from a storage or maintenance perspective?

icarito commented 6 years ago

I think it makes the most sense to switch to UTF-8, Unicode (utf8_unicode_ci). This should accommodate most languages if not all.

But migrating could mangle characters that are not correctly encoded. After migrating on unstable, we should look for pages with international characters, names, etc.
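
One way to look for mangled pages after a trial migration might be a heuristic like the one below. It is only a sketch: `likely_double_encoded?` is a hypothetical helper, and it only catches the classic case where UTF-8 bytes were read as Latin-1 and re-encoded (for example "é" turning into "Ã©"). It will miss other kinds of damage, such as Windows-1252 mix-ups.

```ruby
# Hypothetical helper, not part of plots2.
# Heuristic: true when +text+ looks like UTF-8 that was decoded as Latin-1
# and then encoded again, e.g. "é" stored as "Ã©".
def likely_double_encoded?(text)
  return false if text.ascii_only?

  # Map each character back to a single Latin-1 byte; this raises if any
  # character is outside Latin-1, in which case the pattern does not apply.
  bytes = text.encode("ISO-8859-1")

  # If those bytes happen to form valid UTF-8, the text was probably
  # encoded twice.
  bytes.force_encoding("UTF-8").valid_encoding?
rescue Encoding::UndefinedConversionError
  false
end

puts likely_double_encoded?("\u00C3\u00A9") # "Ã©", a mangled "é"
puts likely_double_encoded?("\u00E9")       # a correctly stored "é"
```

Running something like this over revision bodies before and after the migration would give a list of pages to inspect by hand, rather than a verdict.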

Storage and maintenance should be about the same.

namangupta01 commented 6 years ago

Yes, we can do this through a migration. If you want, I can give it a shot.
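
For reference, the core of such a migration would probably be a single MySQL statement per table. The sketch below only builds that statement as a string so it can be checked without a database; `utf8mb4_conversion_sql` is a hypothetical helper, and the `utf8mb4_unicode_ci` collation is an assumption (the 4-byte counterpart of the collation mentioned above). In a Rails migration, the resulting string would presumably be passed to `execute`.

```ruby
# Hypothetical helper for illustration; not taken from plots2 or any PR.
# Builds the MySQL statement that converts a table and its text columns
# to the 4-byte utf8mb4 character set.
def utf8mb4_conversion_sql(table, collation: "utf8mb4_unicode_ci")
  # Only allow plain identifiers, since the name is interpolated into SQL.
  raise ArgumentError, "unexpected table name: #{table}" unless table.match?(/\A\w+\z/)

  "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET utf8mb4 COLLATE #{collation}"
end

puts utf8mb4_conversion_sql("node_revisions")
```

Two things worth checking on unstable, both assumptions about this setup: the connection encoding in the database config likely needs to be utf8mb4 as well, and indexed string columns can run into index key length limits on older InnoDB row formats once each character may take four bytes.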

jywarren commented 6 years ago

@namangupta01 - yes, please - can you try opening a PR for this and we can force push it to unstable and look through a lot of pages to see what happens?

@NiklasJordan I'm sorry this is taking so long! With @Gauravano and @namangupta01 and @icarito's sleuthing, we can now get a non-emoji version up:

https://publiclab.org/first-contribution

If you'd like to make an edit to that page to get credit, that'd be awesome :-)

I'll check back on the last issue now!

namangupta01 commented 6 years ago

While trying to reproduce this on my local machine, I get no error, but I also get no text after the emojis on the page, i.e. all the content after the emojis vanishes. Investigating why this occurs.

jywarren commented 6 years ago

https://github.com/railsmachine/utf8mb4_conversion_scripts has a possible solution!

jywarren commented 6 years ago

It's pretty thorough, and somewhat involved. I think we should look into it because it may also solve #2209

@icarito, after the memory-exhaustion issue in #2824 , would you mind looking at this and attempting a migration of unstable to see how hard this would be?

namangupta01 commented 6 years ago

Hi @jywarren, I am working on this today and will open a PR for comments today.

jywarren commented 5 years ago

Thinking that this may be fixed in https://github.com/publiclab/plots2/pull/3007 and/or with a follow-up applying this to nodes.