mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org
Other
389 stars 196 forks source link

[ERROR] script/parse-unparsed-pro "can't modify frozen String" #5028

Open garethrees opened 5 years ago

garethrees commented 5 years ago
rake aborted!
can't modify frozen String
/app/models/foi_attachment.rb:70:in `force_encoding'
/app/models/foi_attachment.rb:70:in `body='
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/attribute_assignment.rb:54:in `public_send'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/attribute_assignment.rb:54:in `_assign_attribute'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/attribute_assignment.rb:41:in `block in assign_attributes'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/attribute_assignment.rb:35:in `each'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/attribute_assignment.rb:35:in `assign_attributes'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/persistence.rb:251:in `block in update'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/transactions.rb:351:in `block in with_transaction_returning_status'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/connection_adapters/abstract/database_statements.rb:211:in `transaction'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/transactions.rb:220:in `transaction'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/transactions.rb:348:in `with_transaction_returning_status'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/persistence.rb:250:in `update'
/app/models/incoming_message.rb:540:in `block in extract_attachments!'
/app/models/incoming_message.rb:538:in `each'
/app/models/incoming_message.rb:538:in `extract_attachments!'
/app/models/incoming_message.rb:98:in `block in parse_raw_email!'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in `block in transaction'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/connection_adapters/abstract/transaction.rb:184:in `within_new_transaction'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/connection_adapters/abstract/database_statements.rb:213:in `transaction'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/transactions.rb:220:in `transaction'
/app/models/incoming_message.rb:97:in `parse_raw_email!'
/lib/tasks/incoming_messages.rake:8:in `block (4 levels) in <top (required)>'
/lib/tasks/incoming_messages.rake:8:in `block (3 levels) in <top (required)>'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block (2 levels) in find_each'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `each'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:51:in `block in find_each'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:124:in `find_in_batches'
/vendor/bundle/ruby/2.3.0/gems/activerecord-4.2.10/lib/active_record/relation/batches.rb:50:in `find_each'
/lib/tasks/incoming_messages.rake:7:in `block (2 levels) in <top (required)>'
/vendor/bundle/ruby/2.3.0/gems/rake-12.3.1/exe/rake:27:in `<top (required)>'
/opt/rbenv/versions/2.3.8/bin/bundle:22:in `load'
/opt/rbenv/versions/2.3.8/bin/bundle:22:in `<main>'
Tasks: TOP => incoming_messages:parse_unparsed_pro
(See full trace by running task with --trace)
garethrees commented 5 years ago

Linking to the raw email for reference.

This at least prevents the problem:

diff --git a/app/models/foi_attachment.rb b/app/models/foi_attachment.rb
index 71caaf345..38885f4ab 100644
--- a/app/models/foi_attachment.rb
+++ b/app/models/foi_attachment.rb
@@ -67,7 +67,7 @@ class FoiAttachment < ActiveRecord::Base
     update_display_size!
     @cached_body = d
     if String.method_defined?(:encode)
-      @cached_body = @cached_body.force_encoding("ASCII-8BIT")
+      @cached_body = @cached_body.dup.force_encoding("ASCII-8BIT")
     end
   end

It looks like the string is frozen when extracted…

m = IncomingMessage.last
attrs = MailHandler.get_attachment_attributes(m.raw_email.mail).select { |h| h[:hexdigest] == "d41d8cd98f00b204e9800998ecf8427e" }.first

get_part_body frozen? false
get_part_body frozen? true
get_part_body frozen? true
=> {:url_part_number=>2,
 :content_type=>"application/pdf",
 :filename=>"FOI 18-03028 Resp.pdf",
 :charset=>nil,
 :within_rfc822_subject=>nil,
 :body=>"",
 :hexdigest=>"d41d8cd98f00b204e9800998ecf8427e"}

Yep, Mail::Body#decoded returns a frozen String, so I think this is something that we're not handling correctly.

From: /home/vagrant/alaveteli/lib/mail_handler/backends/mail_backend.rb @ line 86 MailHandler::Backends::MailBackend#get_part_body:

    84: def get_part_body(part)
    85:   binding.pry
 => 86:   decoded = part.body.decoded
    87:   puts "get_part_body frozen? #{decoded.frozen?}"
    88:   if part.content_type =~ /^text\//
    89:     decoded = convert_string_to_utf8_or_binary decoded, part.charset
    90:   end
    91:   decoded
    92: end

part.body.decoded.frozen?
# => true

part.body.class
# => Mail::Body
garethrees commented 5 years ago

Monkeypatched this change in the console and parsed the raw email which was successful. Looks like the attached PDF is corrupt though, as its showing as 0KB and I can't even see it if I open the raw email in Apple Mail.