kitodo / kitodo-production

Kitodo.Production is a workflow management tool for mass digitization and is part of the Kitodo Digital Library Suite.
http://www.kitodo.org/software/kitodoproduction/
GNU General Public License v3.0

Problem with encoding of queries with identifier #4353

Closed BartChris closed 3 years ago

BartChris commented 3 years ago

Hello,

it seems that the fix introduced in https://github.com/kitodo/kitodo-production/pull/4187 has unintended(?) side effects: it does not appear to distinguish between URLs where the identifier is embedded in another query parameter, as in the HEBIS search http://sru.hebis.de/sru/DB=2.1?version=1.1&operation=searchRetrieve&recordSchema=pica&recordPacking=xml&maximumRecords=1&query=pica.ppn=406442703 (see https://github.com/kitodo/kitodo-production/issues/4185), and other scenarios.

A query like https://genericapi.com?id=123456 also seems to be encoded as https://genericapi.com?id%3D123456, so id can no longer be processed as a CGI parameter.

https://github.com/kitodo/kitodo-production/blob/04253d1b5c817187a55b54aced06ddb69630cc54/Kitodo-Query-URL-Import/src/main/java/org/kitodo/queryurlimport/QueryURLImport.java#L282
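The symptom can be reproduced in isolation with a minimal sketch. It assumes that the complete `key=value` string is passed through `java.net.URLEncoder`; that is a guess about what the linked line does, not something confirmed here. The class and method names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodingSymptom {

    // Hypothetical helper: encodes the complete "key=value" string,
    // which also escapes the '=' that separates key and value.
    static String encodeWhole(String parameter) {
        try {
            return URLEncoder.encode(parameter, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("https://genericapi.com?" + encodeWhole("id=123456"));
        // prints https://genericapi.com?id%3D123456
    }
}
```

With the separator escaped, a server that splits parameters on `=` sees a single key named `id=123456` and no `id` parameter at all.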

Maybe there needs to be a check whether the query targets an SRU interface, so that the query URL is only encoded in that case. Something like that already seems to be done in this place:

https://github.com/kitodo/kitodo-production/blob/04253d1b5c817187a55b54aced06ddb69630cc54/Kitodo-Query-URL-Import/src/main/java/org/kitodo/queryurlimport/QueryURLImport.java#L436

solth commented 3 years ago

I observed a similar problem when querying an OAI interface: when a single record is identified via the identifier URL parameter, the = that follows it must not be URL-encoded, otherwise the query fails.
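The check suggested in the first comment, together with the OAI case, could look roughly like the sketch below. It is only an illustration under stated assumptions: `InterfaceType` and `buildQueryParameter` are hypothetical names and not taken from `QueryURLImport.java`, and the OAI identifier in the example is invented. For SRU the `field=value` expression is nested inside the `query` parameter, so its inner `=` is escaped; for every other interface only the key and the value are encoded and the separator stays literal.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryParameterSketch {

    // Hypothetical stand-in for the interface types known to the import module.
    enum InterfaceType { SRU, OAI, CUSTOM }

    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Builds a single URL parameter for the given search field and value.
    static String buildQueryParameter(InterfaceType type, String field, String value) {
        if (type == InterfaceType.SRU) {
            // SRU: "field=value" is the value of the "query" parameter,
            // so the inner '=' has to be percent-encoded.
            return "query=" + encode(field + "=" + value);
        }
        // Other interfaces: "field" is itself the parameter name,
        // so the '=' separator must stay unencoded.
        return encode(field) + "=" + encode(value);
    }

    public static void main(String[] args) {
        System.out.println(buildQueryParameter(InterfaceType.SRU, "pica.ppn", "406442703"));
        // prints query=pica.ppn%3D406442703
        System.out.println(buildQueryParameter(InterfaceType.CUSTOM, "id", "123456"));
        // prints id=123456
        System.out.println(buildQueryParameter(InterfaceType.OAI, "identifier", "oai:example.org:123"));
        // prints identifier=oai%3Aexample.org%3A123
    }
}
```

Encoding key and value separately, rather than skipping encoding altogether for non-SRU interfaces, keeps reserved characters inside identifiers safe while leaving the parameter structure intact.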