ualbertalib / avalon

University of Alberta's Media Repository based on Avalon
Apache License 2.0
`edit structure` operation fails for MediaObject `mk61rj10w` #748

Closed jefferya closed 3 years ago

jefferya commented 3 years ago

HTTP 500 error https://era-av.library.ualberta.ca/media_objects/mk61rj10w/edit?step=structure

method=GET path=/media_objects/mk61rj10w/edit format=html controller=MediaObjectsController action=edit status=500 error='ActionView::Template::Error: incompatible character encodings: UTF-8 and ASCII-8BIT' duration=4506.83 view=0.00 db=3.28 time=2494125.58

ActionView::Template::Error (incompatible character encodings: UTF-8 and ASCII-8BIT):
    94:             <% end %>
    95:           </div>
    96:           <% if section.captions.present? %>
    97:             <div class="structure_view">Uploaded file: <%= section.captions.original_name %></div>
    98:           <% end %>
    99:         </div>
   100:       </li>

app/views/media_objects/_structure.html.erb:97:in `block in _app_views_media_objects__structure_html_erb__763344923819196357_70342538211520'
app/views/media_objects/_structure.html.erb:27:in `each'
app/views/media_objects/_structure.html.erb:27:in `each_with_index'
app/views/media_objects/_structure.html.erb:27:in `_app_views_media_objects__structure_html_erb__763344923819196357_70342538211520'
app/views/media_objects/edit.html.erb:28:in `_app_views_media_objects_edit_html_erb__2877677242272686410_70342657493840'

Also, a second test: https://era-av.library.ualberta.ca/media_objects/mk61rj10w.json

method=GET path=/media_objects/mk61rj10w.json format=json controller=MediaObjectsController action=show status=500 error='Encoding::UndefinedConversionError: "\xEF" from ASCII-8BIT to UTF-8' duration=23897.41 view=0.00 db=14.62 time=2494148.80

Encoding::UndefinedConversionError ("\xEF" from ASCII-8BIT to UTF-8):

app/controllers/media_objects_controller.rb:339:in `block (2 levels) in show'
app/controllers/media_objects_controller.rb:328:in `show'
app/controllers/application_controller.rb:78:in `handle_api_request'
jefferya commented 3 years ago

The first failure (edit structure) appears to be caused by the caption's original filename, not by the caption contents. To do (2021-05-21): test and verify. The filename contains a problematic right single quotation mark (’): 4-1-Synthese-et-evaluation-de-l’information (1).srt. One option to fix:

a = MasterFile.find('media_file_id')  # placeholder: substitute the MasterFile ID
# Fix the filename error. The content must be cleared and the filename metadata changed; otherwise save doesn't update.
a.captions.content = ''
a.captions.original_name = 'a.srt'
a.save

The problematic caption: https://era-av.library.ualberta.ca/master_files/k06988552/captions
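
For reference, a minimal sketch of how this class of error arises in plain Ruby. The strings are stand-ins, not the real template or metadata values, and it is an assumption that the stored filename is tagged ASCII-8BIT (the error message suggests so). Ruby only refuses to join two strings when both contain non-ASCII bytes and their encodings differ, which would explain why a filename such as a.srt renders while this one does not.

```ruby
# Stand-in for the page output rendered so far: UTF-8 text that already contains a non-ASCII character.
page_buffer = "Synth\u00E8se. Uploaded file: "

# Stand-in for the stored filename: the same bytes as the UTF-8 name, but tagged ASCII-8BIT (binary).
binary_name = "de-l\u2019information (1).srt".b

# Returns the concatenated string, or the exception if Ruby refuses to join the two encodings.
def try_concat(left, right)
  left + right
rescue Encoding::CompatibilityError => e
  e
end

# Both sides contain non-ASCII bytes and the encodings differ, so this is an Encoding::CompatibilityError.
error = try_concat(page_buffer, binary_name)

# An ASCII-only name concatenates fine even when tagged binary.
ascii_ok = try_concat(page_buffer, "a.srt".b)

# Retagging the bytes as UTF-8 (no transcoding, the bytes are unchanged) also makes the join succeed.
fixed = try_concat(page_buffer, binary_name.dup.force_encoding(Encoding::UTF_8))

puts error.class
puts ascii_ok
puts fixed
```

Renaming the file to an ASCII-only name, as done above, sidesteps the problem without touching how the name is stored.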

The second error (JSON) appears to come from the captions attached to MasterFiles.

Steps to reproduce

z = MediaObject.find('mk61rj10w');
y = z.as_json;
y.to_json                                  # Encoding::UndefinedConversionError ("\xEF" from ASCII-8BIT to UTF-8)
y[:files][0][:captions].to_json            # Encoding::UndefinedConversionError ("\xEF" from ASCII-8BIT to UTF-8)
y[:files][0].to_json(:except => :captions) # succeeds
y.to_json(:except => :captions)            # succeeds

To download a caption file: https://era-av.library.ualberta.ca/master_files/${ID}/captions
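
The same exception class can be reproduced outside Rails, which may help when checking a downloaded caption file. This is a sketch with made-up caption bytes. It assumes the JSON encoder ends up doing something equivalent to `String#encode` on a binary-tagged string; that is inferred from the error message, not confirmed from the Avalon code.

```ruby
# Made-up caption content: a UTF-8 byte-order mark (EF BB BF) followed by an SRT cue, tagged ASCII-8BIT.
caption_bytes = "\xEF\xBB\xBF1\n00:00:01,000 --> 00:00:02,000\nBonjour\n".b

# Returns the exception raised when transcoding to UTF-8, or nil if the transcode succeeds.
def transcode_error(str)
  str.encode(Encoding::UTF_8)
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Bytes above 0x7F have no defined mapping from ASCII-8BIT to UTF-8, so the first BOM byte fails.
error = transcode_error(caption_bytes)

# A binary-tagged string that is pure ASCII transcodes without error.
ascii_error = transcode_error("1\n00:00:01,000 --> 00:00:02,000\nHello\n".b)

# Quick check for a leading UTF-8 BOM in the raw bytes.
starts_with_bom = caption_bytes.start_with?("\xEF\xBB\xBF".b)

puts error.class
puts ascii_error.inspect
puts starts_with_bom
```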

jefferya commented 3 years ago

Fixed the filename problem with the aforementioned approach.

The JSON representation of the media object fails if the caption file contains a byte-order mark or non-UTF-8 characters, e.g., path=/media_objects/mk61rj10w.json. The JSON representation is unused except by transition; will address in #753.
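
One possible way to make caption text safe for JSON, sketched in plain Ruby. This is not necessarily the approach taken in #753, and `normalize_caption_text` is a hypothetical helper name, not part of Avalon. It assumes the caption bytes are meant to be UTF-8. Bytes that are not valid UTF-8 are replaced with U+FFFD, which loses information, so a real fix might prefer to detect the source encoding instead.

```ruby
require "json"

# Hypothetical helper: retag raw caption bytes as UTF-8, replace invalid sequences, drop a leading BOM.
def normalize_caption_text(raw)
  text = raw.dup.force_encoding(Encoding::UTF_8)     # retag only; the bytes are unchanged
  text = text.scrub("\uFFFD") unless text.valid_encoding?
  text.delete_prefix("\uFEFF")                       # EF BB BF decodes to U+FEFF once tagged UTF-8
end

with_bom  = normalize_caption_text("\xEF\xBB\xBFBonjour".b)
with_junk = normalize_caption_text("caf\xE9".b)      # 0xE9 alone is not valid UTF-8

# Both results are valid UTF-8, so JSON generation no longer raises.
json = JSON.generate({ captions: [with_bom, with_junk] })

puts with_bom
puts with_junk
puts json
```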