AdguardTeam / CoreLibs

Core Adguard libraries
https://adguard.com/
Apache License 2.0

Add an option to decode URL in `$urltransform` #1915

Open AdamWr opened 2 weeks ago

AdamWr commented 2 weeks ago

Issue Details

Related to https://github.com/AdguardTeam/CoreLibs/issues/1557#issuecomment-2351459285

Currently, if we want to redirect to another origin from a link that carries the destination page in one of its parameters, and that part of the URL is percent-encoded, some characters have to be decoded first. For example, this link:

https://track.effiliation.com/servlet/effi.redir?id_compteur=12305754&effi_id=1646343493&url=https%3A%2F%2Ffr.shopping.rakuten.com%2Foffer%2Fshop%2F11769144290%2Fdyson-v8-absolute-aspirateur.html%3FsellerLogin%3DBoulanger

The destination page is in the `url` parameter, but it is percent-encoded:

https%3A%2F%2Ffr.shopping.rakuten.com%2Foffer%2Fshop%2F11769144290%2Fdyson-v8-absolute-aspirateur.html%3FsellerLogin%3DBoulanger

so the following percent-encoded sequences have to be decoded:

%3A
%2F
%3F
%3D
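
To check which escape sequences the destination actually contains and what it decodes to, here is a minimal Python sketch. It only uses the standard library and the value quoted above; it is an illustration of percent-decoding, not of how AdGuard processes rules.

```python
import re
from urllib.parse import unquote

# Value of the `url` parameter, copied from the link above.
encoded = (
    "https%3A%2F%2Ffr.shopping.rakuten.com%2Foffer%2Fshop%2F11769144290"
    "%2Fdyson-v8-absolute-aspirateur.html%3FsellerLogin%3DBoulanger"
)

# Distinct percent-escapes that occur in the value.
escapes = sorted(set(re.findall(r"%[0-9A-Fa-f]{2}", encoded)))

# Standard percent-decoding of the whole value in one step.
decoded = unquote(encoded)

print(escapes)
print(decoded)
```

Every distinct escape found this way needs its own replacement rule in the workaround, whereas a generic decode step handles all of them at once.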

These rules:

/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/%3A/:/
/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/%2F/\//
/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/%3F/?/
/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/%3D/=/
/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:effiliation\.com|dealabs\.digidip\.net).*url=([^&]*)/\$1/

seem to work fine, but with an option to decode the URL a single rule would be enough, something like:

/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:effiliation\.com|dealabs\.digidip\.net).*url=([^&]*)/\$1/decodeURL
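
To make the difference concrete, here is a rough Python emulation of both approaches. It is only a sketch under assumptions: that the five rules are applied in the order listed, that each replacement affects every occurrence, and that the proposed `decodeURL` option would percent-decode the result after the regex replacement. The function names `workaround` and `with_decode` are made up for this example, and nothing here reflects the actual CoreLibs implementation.

```python
import re
from urllib.parse import unquote

LINK = (
    "https://track.effiliation.com/servlet/effi.redir"
    "?id_compteur=12305754&effi_id=1646343493"
    "&url=https%3A%2F%2Ffr.shopping.rakuten.com%2Foffer%2Fshop%2F11769144290"
    "%2Fdyson-v8-absolute-aspirateur.html%3FsellerLogin%3DBoulanger"
)

# Same extraction pattern as in the rules, without the "\/" escapes
# that the rule syntax needs but Python does not.
EXTRACT = (
    r"^https?://(?:[a-z0-9-]+\.)*?"
    r"(?:effiliation\.com|dealabs\.digidip\.net).*url=([^&]*)"
)


def workaround(url: str) -> str:
    """Emulate the five rules: four literal replacements, then extraction."""
    for enc, dec in (("%3A", ":"), ("%2F", "/"), ("%3F", "?"), ("%3D", "=")):
        url = url.replace(enc, dec)
    return re.sub(EXTRACT, r"\1", url)


def with_decode(url: str) -> str:
    """Emulate the proposed single rule: extract first, then percent-decode."""
    return unquote(re.sub(EXTRACT, r"\1", url))


print(workaround(LINK))
print(with_decode(LINK))
```

For this link both functions return the same destination. They would diverge for a destination containing any other escape (for example `%26`), which the workaround leaves encoded unless yet another rule is added.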

Proposed solution

Add an option to decode the URL, perhaps as an additional modifier. If this is already possible, or can easily be done in a single rule, it would be nice to add it to the documentation.

Alternative solution

No response

cxplay commented 2 weeks ago

I think it would be best if this was a separate modifier:

/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:track\.effiliation\.com\/servlet\/effi\.redir|dealabs\.digidip\.net\/visit\?url=)/$urltransform=/^https?:\/\/(?:[a-z0-9-]+\.)*?(?:effiliation\.com|dealabs\.digidip\.net).*url=([^&]*)/\$1/,decodeurl

It would probably be implemented as an enhancement of the `$urltransform` modifier that decodes the redirect target before redirecting to it.