openaustralia / morph

Take the hassle out of web scraping
https://morph.io
GNU Affero General Public License v3.0

LogLine text column can be too long causing MySQL to throw an exception #1091

Status: Closed (henare closed this issue 8 years ago)

henare commented 8 years ago
[Morph/production] Excon::Error::Socket: Mysql2::Error: Data too long for column 'text' at row 1: INSERT INTO `log_lines` (`run_id`, `timestamp`, `stream`, `text`, `created_at`, `updated_at`) VALUES (593874, '2016-10-20 05:39:48.201105', 'stdout', '    <div id=\"js-content\" class=\"height-100\"><div class=\"flex flex-column height-100\" data-reactroot=\"\" data-reactid=\"1\" data-react-checksum=\"-1818031355\"><!-- react-empty: 2 --><header class=\"base___2iOjf\" role=\"banner\" data-reactid=\"3\"><a hr...

Backtrace

line 52 of [PROJECT_ROOT]/lib/morph/runner.rb: log
line 43 of [PROJECT_ROOT]/lib/morph/runner.rb: block in go_with_logging
line 77 of [PROJECT_ROOT]/lib/morph/runner.rb: block in go

View full backtrace and more info at honeybadger.io

henare commented 8 years ago

@mlandauer can you please look at this now? It should be an easy fix and it's currently affecting 3 runs which are taking up precious slots in the queue:

https://morph.io/admin/runs/595046 https://morph.io/admin/runs/593874 https://morph.io/admin/runs/593862

mlandauer commented 8 years ago

@henare I'm going to take a look at this now

mlandauer commented 8 years ago

This was a strange one because I thought it was already fixed. The code was already truncating the text for the log lines to 65535 characters, which should be the maximum that fits in a MySQL TEXT column. Weirdly, this doesn't seem to be working in practice. Don't know why. Have just done the really stupid thing and am now truncating the text to 32768 characters (half the limit).

So far so good and the troublesome runs have completed. Even though this solution is far from ideal (because really I don't know whether it's actually fixing it) I'm going to call this done.

henare commented 8 years ago

Thanks for fixing this, Matthew.

Could it be to do with multibyte characters taking up more than "1" character in the database?
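
To illustrate the idea, here is a minimal Ruby sketch, not the actual morph code. A MySQL TEXT column holds at most 65,535 bytes rather than characters, so a string truncated to 65535 characters can still be too long once encoded. The helper name `truncate_to_bytes` and the constant `MAX_TEXT_BYTES` are made up for this example, and it assumes the log text is a UTF-8 string.

```ruby
# MySQL TEXT columns are limited to 65,535 bytes, not characters.
MAX_TEXT_BYTES = 65_535

# Hypothetical helper (not part of morph): cut a UTF-8 string down so that
# its encoded size is at most max_bytes.
def truncate_to_bytes(text, max_bytes = MAX_TEXT_BYTES)
  return text if text.bytesize <= max_bytes

  # byteslice can cut through the middle of a multibyte character;
  # scrub("") drops the resulting invalid trailing bytes.
  text.byteslice(0, max_bytes).scrub("")
end

# "é" is one character but two bytes in UTF-8.
line = "é" * 65_535
puts line.length    # 65535 characters: passes a character-based check
puts line.bytesize  # 131070 bytes: too long for a TEXT column

truncated = truncate_to_bytes(line)
puts truncated.bytesize         # 65534
puts truncated.valid_encoding?  # true
```

If this is the cause, truncating to 32768 characters would only reduce how often the error occurs: 32768 two-byte characters are 65,536 bytes, which is still one byte over the limit.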

mlandauer commented 8 years ago

@henare I was wondering that. Actually, I could do a little test to check this out. I'll reopen.

henare commented 8 years ago

The problematic data might also still be in Honeybadger?