keeps / roda-in

Tool to create Submission Information Packages (SIP)
http://rodain.roda-community.org
GNU Lesser General Public License v3.0

File names with '+' are not URL encoded #352

Closed: JelleKleevensVAI closed this issue 3 years ago

JelleKleevensVAI commented 3 years ago

This issue seems to span both Roda-In and Roda. Using Roda-In 2.3.0 and Roda 4.

It appears that Roda-In converts each space (' ') in a filename to a plus ('+') in the SIP METS fileSec. Roda appears to convert each plus back to a space during the SIP ingest workflow.

Given a file that contains plus characters in the filename, this results in an error of the following type during metadata validation in the SIP ingest workflow: "The rep1 file 20111102_S12JAAR_selectie 16 + 14 projecten.pdf results in ERROR representations/rep1/data/VAI_0099-CK_VAI JAARBOEK 10-11/PROJECTLIJSTEN/20111102_S12JAAR_selectie 16 14 projecten.pdf Representation file referenced in METS.xml not found."

This is likely because Roda also converted the literal plus signs in the filename to spaces.

luis100 commented 3 years ago

Files with '+' in the name are not being URL encoded. This makes RODA translate the '+' into a ' ', which is the expected behaviour, as file names must be URL encoded.
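The behaviour described above can be illustrated with a minimal sketch. It uses Python's `urllib.parse` only as a stand-in for form-style URL encoding; RODA and RODA-in are Java applications and their actual code is not shown in this thread. The helper `naive_encode` and the sample file name are hypothetical and only mimic the reported symptom (spaces become '+', while a literal '+' is left untouched).

```python
from urllib.parse import quote_plus, unquote_plus


def naive_encode(name):
    # Hypothetical stand-in for the faulty behaviour: only spaces are
    # replaced, so a literal '+' is indistinguishable from an encoded space.
    return name.replace(" ", "+")


original = "selectie 16 + 14 projecten.pdf"

# Faulty round trip: the literal '+' is decoded as a space.
broken = unquote_plus(naive_encode(original))

# Correct round trip: '+' is first encoded as %2B, so it survives decoding.
fixed = unquote_plus(quote_plus(original))

print(repr(naive_encode(original)))  # 'selectie+16+++14+projecten.pdf'
print(repr(quote_plus(original)))    # 'selectie+16+%2B+14+projecten.pdf'
print(broken == original)            # False
print(fixed == original)             # True
```

In the faulty case the decoded name has three spaces where " + " used to be, so it no longer matches the file on disk. That is consistent with the "not found" error reported above.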

Steps to reproduce:

  1. Create a file with '+' in the name
  2. Use RODA-in 2.3.0 to create the SIP
  3. Analyse the resulting SIP METS file and check if + has been URL encoded to %2B
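
Step 3 can also be checked with a short script rather than by eye. The sketch below is only an illustration under stated assumptions: it assumes file references appear as `xlink:href` attributes on METS `FLocat` elements, and that the hrefs use form-style encoding ('+' for a space, '%2B' for a plus). The helper names, the sample METS fragment, and the `data/a + b.pdf` path are hypothetical and are not taken from a real RODA-in SIP.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote_plus

METS_NS = "{http://www.loc.gov/METS/}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def decoded_hrefs(mets_xml):
    """Return every FLocat href in the METS document, URL-decoded."""
    root = ET.fromstring(mets_xml)
    return [
        unquote_plus(el.attrib[XLINK_HREF])
        for el in root.iter(METS_NS + "FLocat")
        if XLINK_HREF in el.attrib
    ]


def missing_files(mets_xml, expected_paths):
    """List expected paths that no decoded href in the METS points to."""
    found = set(decoded_hrefs(mets_xml))
    return [p for p in expected_paths if p not in found]


def sample_mets(href):
    # Hypothetical, heavily reduced METS fragment with a single file entry.
    return (
        '<mets xmlns="http://www.loc.gov/METS/" '
        'xmlns:xlink="http://www.w3.org/1999/xlink">'
        '<fileSec><fileGrp><file ID="f1">'
        f'<FLocat LOCTYPE="URL" xlink:href="{href}"/>'
        "</file></fileGrp></fileSec></mets>"
    )


expected = ["data/a + b.pdf"]
print(missing_files(sample_mets("data/a+%2B+b.pdf"), expected))  # []
print(missing_files(sample_mets("data/a+++b.pdf"), expected))    # ['data/a + b.pdf']
```

An empty result means every expected file name survives decoding. A non-empty result points at hrefs where '+' was not encoded as '%2B'.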

luis100 commented 3 years ago

The issue has been fixed in https://github.com/keeps/roda-in/releases/tag/2.3.1 @JelleKleevensVAI, please confirm.

JelleKleevensVAI commented 3 years ago

I can't open version 2.3.1 on Windows to test it. Version 2.3.0 works fine though.

JoaoGomes2110 commented 3 years ago

The problem with version 2.3.1 has been fixed. You can download it again here: https://github.com/keeps/roda-in/releases/tag/2.3.1

JelleKleevensVAI commented 3 years ago

Yep, it seems to work fine now, thanks.