Closed dr0i closed 1 year ago
In general, I would prefer to use the Metamorph implementation directly (see e.g. #230 and #231). If it doesn't provide a suitable interface it should be added there and then reused for both. Eventually, org.metafacture.metamorph.functions
might be generalized and moved to org.metafacture.commons.functions
(or something like that).
In general, I would prefer to use the Metamorph implementation directly
good point. Agreed. Awaiting ok for using percent encoder and then migrate function to metafcature-core.
We'd also have to add documentation on the new function in the README.
Right - forgot about that. Would do this also im metfacture-core. Will note the RFC 3986 compliance there.
One thing leading to confusion here (at least for me) is that we actually need something to encode a string to be used in a URI (in particular as a path segment), not to encode a full URI. So for http://perl.com/foo bar
we don't want http://perl.com/foo%20bar
(which would be a valid URI), but we want http%3A%2F%2Fperl.com%2Ffoo%20bar
(which can be used as a path segment in a URI, but is not itself a valid URI).
Now if I understand it correctly, both metamorph.functions.URLEncode and Catmandu::Fix::uri_encode basically do that, but Java's URLEncoder
(which is currently used in metamorph.functions.URLEncode
) uses +
instead of %20
for blanks, which is a problem for using it in lobid-resources.
To both avoid breaking the current metamorph.functions.URLEncode
behavior and reusing as much implementation as possible, this is what @dr0i and I just discussed: we could base metamorph.functions.URLEncode
on PercentEscaper
, with an option to pass the safeChars
config. In the default used in metamorph
, we use SAFECHARS_URLENCODER, which "mimics the behavior of java.net.URLEncoder" (thus no break), and in metafix
we use e.g. SAFE_PLUS_RESERVED_CHARS_URLENCODER (to get what we need in lobid-resources and be consistent with Catmandu::Fix::uri_encode
).
This PR is substituted by https://github.com/metafacture/metafacture-fix/pull/314. Closing.
Resolves #273.