HeardLibrary / vandycite


Modify Commonsbot to check filenames for spaces #34

Closed: baskaufs closed this issue 2 years ago

baskaufs commented 2 years ago

When the raw filenames of works images contain spaces, the names generated for the IIIF manifest files also contain spaces. The Wikidata API then returns an error, because the URLs used to access the manifests contain disallowed spaces.

Ideally, spaces would be replaced by underscores. However, a work may have alternative image files whose names differ only in having an underscore where another has a space. Since automatic replacement could then produce a filename conflict, the code should just log an error and skip processing that file. The better fix is to add the underscore to the filename manually before processing. At that point, one can check that there are no other conflicting files.
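The check described above could be sketched roughly as follows. This is a minimal illustration, not the code from Commonsbot: the function name `split_by_space_check`, the logger name, and the assumption that filenames arrive as a list of strings are all hypothetical.

```python
import logging

# Hypothetical logger name; the real script may log differently.
logger = logging.getLogger("commonsbot_sketch")


def split_by_space_check(filenames):
    """Return (processable, skipped); any filename containing a space is skipped."""
    known = set(filenames)
    processable = []
    skipped = []
    for name in filenames:
        if " " not in name:
            processable.append(name)
            continue
        # The name the file would get if spaces were replaced automatically.
        candidate = name.replace(" ", "_")
        if candidate in known:
            logger.error(
                "Skipping %r: contains spaces, and %r already exists",
                name, candidate,
            )
        else:
            logger.error(
                "Skipping %r: contains spaces; rename to %r before processing",
                name, candidate,
            )
        skipped.append(name)
    return processable, skipped
```

For example, `split_by_space_check(["a b.jpg", "a_b.jpg", "c.jpg"])` keeps `a_b.jpg` and `c.jpg` and skips `a b.jpg`, logging that the underscore variant already exists. Nothing is renamed automatically, so resolving the conflict stays a manual step.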

baskaufs commented 2 years ago

Feature added in https://github.com/HeardLibrary/linked-data/commit/f91eaf0188f653849e9f020fd3c85851069621fb but not yet tested.

baskaufs commented 2 years ago

Testing completed; the feature works fine.