Open ShivanKaul opened 2 years ago
In this case, we need to:
1. Split on `.` and take the 2nd component.
2. Extract `url` in the resulting JSON blob.

Here's what the JSON blob looks like:
{"aud":"detour","iss":"monolith","sub":"detour_link","iat":1661474196,"nbf":1661474196,"account_id":"5439759","delivery_id":"16qg6k01mcvlsfop8tne","url":"https://skipperotto.com/updates-on-our-5-pacific-salmon-species/?__s=zcraatsjbtesssn2kiyk"}
The `__s` parameter will get removed automatically by the query filter once we fix https://github.com/brave/brave-browser/issues/22967.
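For illustration, here is a minimal TypeScript sketch of those two steps, plus the base64url decode they imply. It assumes the value being debounced is a JWT-style token (the `aud`/`iss`/`iat` fields in the blob suggest one). Where the token sits in the original URL is not shown here, so the function simply takes the token string. `base64UrlDecode` and `extractUrlFromToken` are hypothetical names for this sketch, not existing Brave code.

```typescript
// Decode a base64url string (the alphabet JWTs use) into UTF-8 text.
function base64UrlDecode(input: string): string {
  const b64 = input.replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  const binary = atob(padded);
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Split on ".", take the 2nd component, decode it, and read the `url` key.
// Returns null if any step fails.
function extractUrlFromToken(token: string): string | null {
  const parts = token.split(".");
  if (parts.length < 2) return null;
  try {
    const payload = JSON.parse(base64UrlDecode(parts[1]));
    return typeof payload.url === "string" ? payload.url : null;
  } catch {
    return null;
  }
}

// Demo with a made-up token: only the middle component matters here.
const demoPayload = { sub: "detour_link", url: "https://example.org/landing" };
const demoToken = ["header", btoa(JSON.stringify(demoPayload)), "signature"].join(".");
console.log(extractUrlFromToken(demoToken)); // → "https://example.org/landing"
```

Returning `null` on any failure keeps the caller's choice simple: if nothing could be extracted, leave the navigation untouched.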
Just a thought, but when I initially spec'ed the debounce feature, I described a pipeline of different actions that could be applied to a URL to extract the destination URL: things like "apply regex", "base64 decode the buffer", "extract JSON key", "extract query param key" and "extract path segment".
If we're going to start targeting these more sophisticated cases (which I think is a great idea), maybe it'd be good to revisit that original idea, so that we can have a smaller number of composable actions instead of a large number of single- and few-use actions?
This was the original proposal, in case it's of interest:
// This is just a mapping of function names to functions with the
// `URLSegmentMapper` signature.
const URLSegmentMapperFuncs = {
  atob: (x: string) => atob(x),
  copy: (x: string, prefix = '', suffix = '') => `${prefix}${x}${suffix}`,
  remove: () => "",
  decodeURI: decodeURI,
  decodeURIComponent: decodeURIComponent,
}
enum RewritingStepType {
  // Means that the extracted URL value should be transformed,
  // and the transformed string should be inserted back in the URL
  // in place of the initially targeted value.
  Map = "map",
  // Means that the extracted URL value should be used _in place of_ the
  // original URL (so that a subset of the current URL would be transformed,
  // and then become the complete new current URL).
  Replace = "replace",
}

type TargetPosition = {
  start: number,
  end: number,
}

type TargetResult = {
  wasSuccess: boolean,
  value?: string,
  position?: TargetPosition,
}
// Targets describe parts of URLs (or other values) that should be processed in
// some way.
abstract class Target {
  abstract readonly type: string;
  abstract apply(url: URL): TargetResult;
}

class TargetJSONKey extends Target {
  readonly type: string = "TargetJSONKey";
  // The key of the JSON-encoded value to extract.
  readonly key: string;
}

class TargetQueryParam extends Target {
  readonly type: string = "TargetQueryParam";
  // The query parameter name to extract / target in this step.
  readonly key: string;
  // If provided, and the query param `key` in the target URL is an array,
  // then this number describes which item in that array to choose (zero
  // indexed).
  readonly index: number = 0;
}

class TargetQueryKeyAndParam extends TargetQueryParam {
  readonly type: string = "TargetQueryKeyAndParam";
  // This class does the exact same thing as the parent class, except it
  // intends to capture the query key as well as the value.
  // Given the URL https://example.org?some=value,
  // TargetQueryParam(key="some") would target "value",
  // while TargetQueryKeyAndParam(key="some") would target "some=value".
}

class TargetPath extends Target {
  readonly type: string = "TargetPath";
  // The index of the path segment to choose (e.g., given "/my/sample/path",
  // 0 would return "my", etc).
  readonly index: number;
}
type RewriteStep = {
  // How to identify which part of the URL to extract and map with this step of
  // the pipeline. If omitted, use the entire current URL / buffer.
  target?: Target,
  // How to transform the targeted / identified part of the URL into a new
  // string.
  func: URLSegmentMapper,
  // What to do with the returned, mapped URL substring: either put the
  // new version back in place (e.g., when changing "target",
  // https://example.org?target=old might become
  // https://example.org?target=new), or use the new, mapped value
  // instead of the previous URL.
  type: RewritingStepType,
  // Boolean describing what to do if anything goes wrong in the targeting
  // step (i.e., determining which part of the URL to target) or the
  // rewriting step (i.e., figuring out how to modify and/or use the targeted
  // URL bit): `true` to keep going and pretend this step didn't exist,
  // or `false` to stop processing further and return an error.
  // Defaults to `false` when omitted.
  continueOnError?: boolean,
}
// Rule definition
type URLRewritingRecipe = {
  // One or more strings, encoding
  // [URLPatterns](https://source.chromium.org/chromium/chromium/src/+/main:extensions/common/url_pattern.h;l=49;bpv=1;bpt=1?q=URLPattern&ss=chromium).
  // Note that if we need more flexibility, these could be replaced with
  // [adblock-rs](https://www.npmjs.com/package/adblock-rs) format rules.
  // These describe which URLs should be considered by the additional steps
  // for this rule.
  urlPatterns: string[],
  steps: RewriteStep[]
}
// Example 1: https://bad.com?uid=123&destination=https%3A%2F%2Fgood.com
const example1 = {
  urlPatterns: [
    "https://bad.com/*"
  ],
  steps: [
    {
      target: {
        type: "TargetQueryParam",
        key: "destination",
        // index = 0 is assumed, and so will be omitted from future examples.
        index: 0,
      },
      func: "decodeURIComponent",
      type: "replace",
      // False here is assumed, and so will be omitted from the rest of the
      // examples.
      continueOnError: false,
    }
  ]
};
// Example 2: Strip the Facebook click id (fbclid) from all navigation URLs.
const example2 = {
  urlPatterns: [
    "?fbclid=",
    "&fbclid=",
  ],
  steps: [
    {
      target: {
        type: "TargetQueryKeyAndParam",
        key: "fbclid"
      },
      func: "remove",
      type: "map",
    }
  ]
};
// Example 3: Some really mean jerks do something horrible like encode
// redirection instructions, in JSON, base64'ed, in a path parameter.
// Something like:
//
//   # First put the instructions in JSON.
//   const step1 = JSON.stringify({dest: "https://good.com"});
//
//   # Then encode that as base64.
//   const step2 = window.btoa(step1);
//
//   # Then put that in the path of the bounce tracker's URL.
//   const step3 = `https://tracker.com/bounce/${step2}/go`;
//
// Giving the following URL:
// https://tracker.com/bounce/eyJkZXN0IjoiaHR0cHM6Ly9nb29kLmNvbSJ9/go
const example3 = {
  urlPatterns: [
    "https://tracker.com/bounce/*",
  ],
  steps: [
    {
      target: {
        type: "TargetPath",
        index: 1,
      },
      func: "copy",
      type: "replace",
    },
    {
      func: "atob",
      type: "replace",
    },
    {
      target: {
        type: "TargetJSONKey",
        key: "dest",
      },
      func: "copy",
      type: "replace",
    }
  ]
};
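To make the proposal a bit more concrete, here is a rough sketch of how a tiny interpreter might run Example 3. It is a simplification under stated assumptions, not the proposed implementation: it only handles `replace` steps, it supports three targets as plain objects (matching the JSON-like examples rather than the classes), and the names `StepTarget`, `Step`, `selectTarget` and `runReplaceSteps` are made up for this illustration.

```typescript
// Hypothetical, simplified step shapes ("replace" semantics only).
type StepTarget =
  | { type: "TargetPath"; index: number }
  | { type: "TargetQueryParam"; key: string; index?: number }
  | { type: "TargetJSONKey"; key: string };

type Step = {
  target?: StepTarget;
  func: "atob" | "copy" | "decodeURIComponent";
};

const mappers: Record<Step["func"], (x: string) => string> = {
  atob: (x) => atob(x),
  copy: (x) => x,
  decodeURIComponent: (x) => decodeURIComponent(x),
};

// Pick the targeted part of the current buffer; undefined means "not found".
function selectTarget(buffer: string, target?: StepTarget): string | undefined {
  if (!target) return buffer; // No target: use the whole buffer.
  switch (target.type) {
    case "TargetPath":
      return new URL(buffer).pathname.split("/").filter(Boolean)[target.index];
    case "TargetQueryParam":
      // Note: URLSearchParams already percent-decodes the values it returns.
      return new URL(buffer).searchParams.getAll(target.key)[target.index ?? 0];
    case "TargetJSONKey": {
      const value = JSON.parse(buffer)[target.key];
      return typeof value === "string" ? value : undefined;
    }
  }
}

// Run each step in order; every step replaces the buffer with its output.
// Any failure stops processing and returns null (continueOnError = false).
function runReplaceSteps(url: string, steps: Step[]): string | null {
  let buffer = url;
  try {
    for (const step of steps) {
      const selected = selectTarget(buffer, step.target);
      if (selected === undefined) return null;
      buffer = mappers[step.func](selected);
    }
    return buffer;
  } catch {
    return null;
  }
}

const example3Steps: Step[] = [
  { target: { type: "TargetPath", index: 1 }, func: "copy" },
  { func: "atob" },
  { target: { type: "TargetJSONKey", key: "dest" }, func: "copy" },
];
const bounceUrl = "https://tracker.com/bounce/eyJkZXN0IjoiaHR0cHM6Ly9nb29kLmNvbSJ9/go";
console.log(runReplaceSteps(bounceUrl, example3Steps)); // → "https://good.com"
```

The point of the sketch is that three small, reusable actions cover this case without a dedicated "base64 JSON in path" action. `map` steps would additionally need the `TargetPosition` from the proposal so the mapped value can be spliced back into the URL.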
Base64 + JSON: https://www.emjcd.com/links-i/?d=eyJzdXJmZXIiOiIxMDA3MDQyOTIzNzA2MDQ1MTg6SU91QlNqanZBMHZjIiwibGFzdENsaWNrTmFtZSI6IkxDTEsiLCJsYXN0Q2xpY2tWYWx1ZSI6ImNqbyF3dnhqLWEwZThkazctd3ppZy1kc25oanZhLWdoaWktbG5jbWJuei13NXRlLWRzNzNwbmUtdzc3Mi12MDRjcTNkIiwiZGVzdGluYXRpb25VcmwiOiJodHRwczovL3d3dy5jYXJoYXJ0dC5jb20vcHJvZHVjdC84MDE5NjYvZm9yY2UtcHJvLTM1bC1iYWNrcGFjayIsInNpZCI6Ii0tLSIsInR5cGUiOiJkbGciLCJwaWQiOjEwMDIyMzY2NCwiZXZlbnRJZCI6IjUwNjYzNTRiNGU0OTExZWM4MzY4MDNkZTBhMWMwZTBiIiwiY2pTZXNzaW9uIjoiYjMyZDQzYjItNjVkMi00NTZiLThjMmItZTY2NmExN2I4OTY0IiwibG95YWx0eUV4cGlyYXRpb24iOjAsInJlZGlyZWN0ZWRUb0xpdmVyYW1wIjpmYWxzZSwiY2pDb25zZW50RW51bSI6Ik5FVkVSX0FTS0VEIn0%3D
What needs to be done:
1. Extract the `d` parameter from the query string.
2. Extract `destinationUrl` in the resulting JSON blob.

Here's what the JSON blob looks like: