Islandora / documentation

Contains islandora's documentation and main issue queue.
MIT License
103 stars 71 forks source link

Files with square brackets in their file names don't persist to Fedora #1309

Closed seth-shaw-unlv closed 4 years ago

seth-shaw-unlv commented 4 years ago

I attempted to create a new media using a file with brackets in the filename test [with brackets].mp3. The Media upload form didn't complain and Drupal stated that the Media was created. Drupal will show the media page (once you fix #1308); however the Fedora binary resource does not exist.

I'm not seeing any messages that the file failed to be loaded into Fedora (I've checked Drupal's logs and Catalina); it simply complains when it can't find it.

seth-shaw-unlv commented 4 years ago

Ah, I didn't look far enough back in catalina.out:

14-Oct-2019 17:14:56.803 INFO [http-nio-8080-exec-10] ca.islandora.syn.valve.SynValve.doAuthentication Site verified: https://localhost:8000
ERROR 17:14:56.804 (RepositoryExceptionMapper) Caught a repository exception: Invalid index, "with brackets", in segment name: test [with brackets].mp3

Unfortunately, the error message doesn't help much. So, an error was reported, but I'm not sure who dropped the ball. Perhaps the flysystem adapter...

seth-shaw-unlv commented 4 years ago

Looks like the form (http://localhost:8000/media/add/audio?_wrapper_format=drupal_ajax&ajax_form=1&element_parents=field_media_audio_file%2Fwidget%2F0) is getting an error:

The file permissions could not be set on fedora://2019-10/test [with brackets].mp3.

Just ignoring it.

seth-shaw-unlv commented 4 years ago

Flysystem is passing the namespaced path to Chullo (e.g. 'fedora://2019-10/test [with brackets].mp3'). Chullo is trying to use the same URI when POSTing the new resource.

So, who's job is it to check valid file naming? Flysystem or Chullo? Chullo seems to be just a fairly thin API wrapper. I guess we need the Flysystem Adapter to enforce the file-naming business logic...

seth-shaw-unlv commented 4 years ago

Huh, Gemini seems to have a URL-encoded version of the Drupal URI, but not the Fedora one:

MariaDB [gemini]> select drupal_uri,fedora_uri from Gemini where drupal_uri like '%brackets%';;
+----------------------------------------------------------------------------------+--------------------------------------------------------------------+
| drupal_uri                                                                       | fedora_uri                                                         |
+----------------------------------------------------------------------------------+--------------------------------------------------------------------+
| http://localhost:8000/_flysystem/fedora/2019-10/test%20%5Bwith%20brackets%5D.mp3 | http://127.0.0.1:8080/fcrepo/rest/2019-10/test [with brackets].mp3 |
+----------------------------------------------------------------------------------+--------------------------------------------------------------------+
1 row in set (0.00 sec)
whikloj commented 4 years ago

I think Flysystem should acknowledge that the save failed and report back why it failed. It shouldn't need to know about the underlying issues but it shouldn't mask them.

seth-shaw-unlv commented 4 years ago

It appears that Flysystem is reporting back an error. It is the file upload form that is ignoring the error, so this may have to be a core issue/patch. Unfortunately, I need to turn my attention to some other work tasks but I'll hopefully circle back to this sooner rather than later.

dflitner commented 4 years ago

This is a core issue. There are a couple of workarounds but they will require a bunch of testing. See: https://www.drupal.org/project/imagemagick/issues/1502924#comment-13222054

dannylamb commented 4 years ago

A third party module does the trick! https://www.drupal.org/project/transliterate_filenames

It's a fix to use while we wait for it to be adopted into core (may take a while).

I've tested it and it totally works and does exactly what we want. Sanitizes filenames on the way in and doesn't let you put in crazy characters, etc... Good stuff!

dannylamb commented 4 years ago

https://github.com/Islandora-Devops/islandora-playbook/pull/166 is ready to test with the fix. If you like it, merge https://github.com/Islandora/islandora/pull/761 and delete the issue-1309 branches of islandora_defaults and islandora-playbook.

dannylamb commented 4 years ago

Resolved via https://github.com/Islandora-Devops/islandora-playbook/commit/d4c416727c86c2439ae41d3fcb30bf9184ca1d7f