einaregilsson / Redirector

Browser extension (Firefox, Chrome, Opera, Edge) to redirect urls based on regex patterns, like a client side mod_rewrite.
http://einaregilsson.com/redirector/
MIT License
1.54k stars, 160 forks

Note to users: Dev seems to have passed away; See reply if you are interested in taking on development of this extension. #329

Open rdmuser1 opened 1 year ago

rdmuser1 commented 1 year ago

His Facebook profile, linked from his personal site, has been memorialized, and all online activity ended late last year.

gzur commented 1 year ago

You are correct, Einar passed away early this year. My name is Gissur, and I have assumed stewardship of his GitHub account on behalf of his widow.

I am currently looking for somebody to assume ownership of the Redirector so that Einar's legacy can be continued.

Gitoffthelawn commented 1 year ago

@gzur I am still willing to manage the project. For details, please see https://github.com/Gitoffthelawn/Contact-Gitoffthelawn/issues/8

I see my roles as primarily:

I think the above will involve a substantial time commitment, so I will rely on others for most of the coding changes.

iustin94 commented 1 year ago

Has there been any progress on ownership of the plugin? Condolences to the widow ...

Gitoffthelawn commented 1 year ago

I haven't received any response yet.

gzur commented 1 year ago

@Gitoffthelawn you should have an invitation to collaborate in your inbox.

Gitoffthelawn commented 1 year ago

Got it. Accepted. Thank you.

I just performed a small change to the README.md file to ensure everything worked correctly, and it did.

pabs3 commented 1 year ago

I'd like to suggest creating a new GitHub organisation for Redirector and transferring this repo there. That way you make it clear that the original author is no longer involved and also preserve all the old issues, pull requests etc.

Jupiter-Liar commented 3 months ago

Where are we on porting this to Manifest V3? I've been trying to convert it to V3 on my own for the past two days. I'm banging my head against the wall, but I think I'm on the right track now.

Gitoffthelawn commented 3 months ago

@Jupiter-Liar I've been waiting to get the remaining credentials needed to publish updates. If you get MV3 working, and I get the credentials, then we should be good.

Jupiter-Liar commented 3 months ago

After continued grappling, I don't know if I'm going to be able to convert it. If anyone else has efforts underway, don't let me hold you up.

tillcash commented 3 months ago

@Gitoffthelawn, could you publish this fork for the time being?

Jupiter-Liar commented 3 months ago

I can publish it. But it's in a sad state. It doesn't convert the capture groups: it just directs you to pages with things like $3 in the addresses. And when you turn them off, they redirect you anyway. It's lousy. I don't know if there's even a way to apply exceptions to individual rules, rather than applying the exceptions globally, at least with declarativeNetRequest. The thing to do might be to build separate methods for different kinds of content. And that'll take work. Most of background.js may have to be rebuilt.

https://github.com/Jupiter-Liar/Redirector---V3-Port-Attempt
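The stray `$3` in the addresses is consistent with the substitution string being passed through unchanged: declarativeNetRequest's `regexSubstitution` refers to capture groups as `\1` through `\9`, not `$1`. Below is a minimal sketch of a converter. The rule fields `includePattern` and `redirectUrl` and the helper names are assumptions made for illustration, not code from either repository, and only single-digit groups are handled.

```javascript
// Convert a "$1"-style substitution into the "\1" form used by
// declarativeNetRequest's `regexSubstitution`.
// Assumption: only groups $1-$9 appear in the redirect URL.
function toRegexSubstitution(redirectUrl) {
  return redirectUrl.replace(/\$(\d)/g, "\\$1");
}

// Build a plain declarativeNetRequest rule object from a hypothetical
// Redirector-style rule. Nothing here calls a browser API.
function toDnrRule(id, priority, rule) {
  return {
    id,
    priority,
    action: {
      type: "redirect",
      redirect: { regexSubstitution: toRegexSubstitution(rule.redirectUrl) },
    },
    condition: {
      regexFilter: rule.includePattern,
      resourceTypes: ["main_frame"],
    },
  };
}

const example = toDnrRule(1, 1, {
  includePattern: "^https://old\\.example\\.com/(.*)$",
  redirectUrl: "https://new.example.com/$1",
});
console.log(example.action.redirect.regexSubstitution);
// https://new.example.com/\1
```

The include pattern itself would still need to be valid RE2 for the browser to accept the rule.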

Jupiter-Liar commented 3 months ago

I am continuing to study the problem. I have ideas.

Jupiter-Liar commented 3 months ago

An alpha! We have an alpha!

https://github.com/Jupiter-Liar/Redirector---V3-Port

I've tested basic page redirects and basic images so far. They both work. The methods are different, though. There is now a content script that works on the page level, changing src attributes. I'm certain we will need to add additional logic for different kinds of embedding. This is very much a work in progress. But it's a good place to stop for the night, before we plunge ahead with further revisions.

Einar assuredly could have done it much, much better than I have. Still, I'll try to do him proud.

On the redirector.html page, I put a dagger next to Einar's name, to indicate that he has died. How should we new developers credit ourselves?

Those of you who inspect the code will notice that I've gone wild with log messages. It was necessary to understand how the script worked and what values it was getting. There are now switches galore to enable different parts of the logging process.

Go ahead and test things out. Let's figure out what doesn't work yet. There will probably be a good amount of work still to be done. But I think this is the way forward. Don't be afraid to build more revisions or incorporate changes from other forks. I don't think that I want to be the sole developer.

Gitoffthelawn commented 3 months ago

@Jupiter-Liar Thank you for your efforts!

I want to be respectful and considerate of your generous efforts while still honestly and accurately expressing a concern. So please don't take the following as criticism... it's just a technical concern. I don't want to discourage you in any way as I sincerely appreciate your efforts. ❤️

Changing src attributes will work in many cases, but I think it will leave too many cases unhandled. I wonder if it is an effective choice to process at the page level since a lower level needs to be processed as well.

What do you think?

Jupiter-Liar commented 3 months ago

I know what you mean. But here's the problem I'm facing:

Einar's extension handled redirection using a mechanism called webRequest blocking. That was what handled the redirection. webRequest was able to block the request and redirect it somewhere else. Well, they got rid of webRequest blocking. That's the thorn. Other parts of webRequest work, but the blocking, which is necessary for it to perform the redirection, is no longer supported in Manifest V3.
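For readers unfamiliar with the mechanism, a blocking listener under Manifest V2 looks roughly like the sketch below. The handler is written as a plain function so it can run outside a browser; the rule list and its field names are made up for illustration and are not taken from Redirector's source.

```javascript
// Illustrative rule list (hypothetical, not Redirector's data format).
const rules = [
  {
    include: /^https?:\/\/old\.example\.com\/(.*)$/,
    redirectUrl: "https://new.example.com/$1",
  },
];

// A blocking onBeforeRequest handler returns { redirectUrl } to redirect,
// or an empty object to let the request continue unchanged.
function onBeforeRequest(details) {
  for (const rule of rules) {
    if (rule.include.test(details.url)) {
      return { redirectUrl: details.url.replace(rule.include, rule.redirectUrl) };
    }
  }
  return {};
}

// In a Manifest V2 background script the handler would be registered as:
//   chrome.webRequest.onBeforeRequest.addListener(
//     onBeforeRequest, { urls: ["<all_urls>"] }, ["blocking"]);
// The "blocking" option is the part that Manifest V3 takes away.

console.log(onBeforeRequest({ url: "http://old.example.com/a/b" }).redirectUrl);
// https://new.example.com/a/b
```

Because the handler is ordinary JavaScript, it can apply any matching logic, including exclusions, before deciding whether to redirect.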

What is supported at a low level is declarativeNetRequest. It uses a different wildcard system, but we could put in a kind of converter. But there's a problem; it doesn't support exceptions. It uses a limited regex system called RE2, which doesn't allow us to exclude things. webRequest was able to match the resources according to intelligent rules which included exceptions. declarativeNetRequest can't do it. There's a way it can sort of do it; you make the exception with a higher ID number than the base rule. But there's a problem; now that exception applies to all rules with a lower ID number. Not just the rule where we put it: all rules with a lower ID number. Finding a way around that will be difficult, perhaps even impossible. I don't know if I could get it to work 100%. There could still be cases where an exception broke other rules than the one where it was declared. Now, maybe we could minimize this by having it go through all the rules and give those with exceptions the lowest IDs, but even then, some of them might interfere with one another.

While I was pursuing the declarativeNetRequest route, it just seemed too limited and too headache-inducing. I had a test version that used declarativeNetRequest. My next step with that would have been a proper converter for the wildcards and capture groups. But then I discovered what a nightmare exceptions would be. An alternative method seemed like it would work better.

And by the way, I learned about this stuff from working on this project. I'm learning as I go.

Now, there's also a method called fetch, but that seems to have even bigger problems. Namely, cross-origin resource sharing: CORS. Basically, if you have two servers that don't want to play nice, and you try to request something from server B while you're on a webpage on server A, it'll just give you an error. And there may be nothing you can do about it. It seems even less workable than declarativeNetRequest.

Either way, with declarativeNetRequest or fetch, it seems to be a big case of can't get there from here. And it's Google's fault. They messed everything up. They moved us from an intelligent system to a dumb one.

Using a content script to change things in the DOM can have its own challenges. It will have to find and modify the relevant elements (that rhymes) on the page, which means it will have to know the different ways things can be embedded. And if a website installed scripts to revert any changes to the DOM, that script and our extension could get in a brief tug-of-war, which would end in our extension giving up. But I expect those instances would be rare.

Maybe we'll end up having to add an option to make individual rules declarative: an all-else-fails option. But even aside from that, I've got some ideas for the content script that could make it pretty powerful, too. Nuclear options, where it goes through every attribute of every element on the page, and if that fails, looks at computed attributes that may have been assigned through CSS. I mean, I think the content script, once it's finished, could tackle most use cases.

And the content script can make use of more of the existing architecture. So it's the expeditious option. It also has some other benefits. Like if you right-click an image and copy its url, or right click it and use an extension to do a reverse image search, you get the URL of the replacement image, the image you see, not the image that was originally there but was replaced.

I'd love for a more talented coder than I to take up the charge. I don't know whether I can get everything working. Some of this territory is fairly familiar, but there are still some uncharted waters, like history state and AJAX. I'll have to figure things out. To paraphrase Schmendrick, you deserve the services of a skilled computer programmer, but I hope you'll be happy for the aid of a fledgeling hobbyist.

Jupiter-Liar commented 2 months ago

Running into new problems. webRequest isn't always being triggered, possibly because things get cached and the browser can just pull from that cache. I may end up having to use declarativeNetRequest after all.

It's likely the reality. I'm still figuring it out. But we may have to ask our users to define their exceptions as narrowly as possible.

I'm trying to find a good way forward. I am.

Gitoffthelawn commented 2 months ago

@Jupiter-Liar Messages received, and I'm looking forward to reading them. Full schedule here for the moment, and I want to allocate sufficient time to thoroughly read what you wrote. I'm hoping to have time later this week to read and reply. Perhaps earlier if I take a little break!

Jupiter-Liar commented 2 months ago

I've continued studying the extension. There is a way forward with declarativeNetRequest. I'm figuring out the specifics. But conceptually, I now know what to do.

Plan: Rework wildcard conversion to the new regex version. Adapt the existing converter to output declarativeNetRequest-compatible rules. All redirects with no exceptions get higher ID numbers automatically, so rules with exceptions will not interfere with them. Aside from that, utilize Einar's ordering logic: rules higher on the settings page get a higher priority, and therefore, higher ID numbers.
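The ordering in this plan can be sketched as a small function. The rule shape (`name`, `excludePattern`) is hypothetical, the input is assumed to be in settings-page order with the topmost rule first, and the numbers produced are meant for the `priority` field of a declarativeNetRequest rule.

```javascript
// Rules without an exception all outrank rules with one; within each group,
// rules nearer the top of the settings page get the larger number.
// Assumption: `excludePattern` is an empty string when a rule has no exception.
function assignPriorities(rules) {
  const withoutException = rules.filter((r) => !r.excludePattern);
  const withException = rules.filter((r) => r.excludePattern);
  const ordered = [...withoutException, ...withException];
  return ordered.map((rule, i) => ({
    name: rule.name,
    priority: ordered.length - i, // lowest priority is 1
  }));
}

const priorities = assignPriorities([
  { name: "A", excludePattern: "/keep/" },
  { name: "B", excludePattern: "" },
  { name: "C", excludePattern: "/private/" },
  { name: "D", excludePattern: "" },
]);
console.log(priorities.map((p) => `${p.name}=${p.priority}`).join(" "));
// B=4 D=3 A=2 C=1
```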

Managing conflicts was already a task assigned to the user, and it has always been manageable. They already have the tools to do it.

Users will need to adjust to the change in two ways:

  1. They may need to shuffle rules or narrowly redefine exceptions to limit the new types of conflicts.
  2. Those that write in regex will have to rework their rules into the RE2 format. This seems to be pretty much unavoidable.
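To help users find rules that need rewriting, a rough scan for constructs that JavaScript regex accepts but RE2 rejects might look like the sketch below. It is a heuristic with hypothetical names, not a parser, and it covers only lookarounds and backreferences. In the browser, `chrome.declarativeNetRequest.isRegexSupported` would be the authoritative check.

```javascript
// Constructs that RE2 does not support. This is a text scan, so it can
// misfire on escaped sequences such as a literal "\(?=" in a pattern.
const RE2_UNSUPPORTED = [
  { name: "lookahead", test: /\(\?[=!]/ },
  { name: "lookbehind", test: /\(\?<[=!]/ },
  { name: "backreference", test: /\\[1-9]/ },
];

// Return the names of the unsupported constructs found in a pattern string.
function findRe2Problems(pattern) {
  return RE2_UNSUPPORTED.filter((c) => c.test.test(pattern)).map((c) => c.name);
}

console.log(findRe2Problems("^(?!https://ok\\.example\\.com).*"));
console.log(findRe2Problems("^https://example\\.com/(.*)$"));
```

The first call reports a lookahead; the second pattern passes the scan with no findings.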

I see the way forward now.

sbolel commented 2 months ago

Hello all! I'm a software engineer interested in helping maintain Redirector. I've used it extensively in the past, but not recently within the last year. If you're still looking for contributors, I'd be happy to get involved and see where I can help.

Jupiter-Liar commented 2 months ago

Wanna compare notes? I can show you where I am in the process.

Jupiter-Liar commented 2 months ago

I have uploaded a new alpha. I've only tested main frame and image redirects. But so far, they work. I'm proud of this one.

It uses declarativeNetRequest. It should be structured in such a way that rules with exceptions cannot interfere with rules without exceptions, although I have not tested this.

If this becomes the basis for a new version, we'll want to update the documentation and rename some of the functions to reflect their new jobs. Some functions no longer have anything to do, and for the time being, they are commented out.

In the process of working on the script, I filled it with copious logging messages. Logging for each function can be enabled near the top of the script, and there's also an option to just enable all logs. For now, the logs I was actively watching have been left on.

Jupiter-Liar commented 2 months ago

How should I reach you?

pabs3 commented 2 months ago

I read that Mozilla are going to be keeping webRequest blocking when Firefox uses Manifest v3. Will Redirector keep a maintained version that still uses webRequest blocking, for the Firefox users amongst us?

Jupiter-Liar commented 2 months ago

I'm getting to the point where I'm going to need people to test my alphas and tell me what does and doesn't work.

Should we move V3 development to its own thread or forum someplace else? I don't want to hijack this thread any more than I already have.

In my current stage of development: Some things DEFINITELY work. Most things PROBABLY work. A few select things, like responsive images, definitely won't work yet. I still need to figure those out. We've got one asynchronous error relating to redirect.html which I have yet to track down. I haven't updated documentation beyond putting a dagger after Einar's name. Some people will definitely need to rewrite or re-order their rules a bit. Regex people especially, because we're in RE2 land now.

In order to get some things working, I'll honestly need to have examples where the things are supposed to work but don't. Specific web pages where I can see the failures happen. Specific approaches to code entry or implementation that fail.

Jupiter-Liar commented 2 months ago

@Gitoffthelawn, can we git some people to test the latest alpha?

Should I remove the key from the manifest to help facilitate that?

I don't know if every single feature will work. In fact, I'm sure some won't. But I do think things will work well enough now that Redirector is safe from extinction.

polyzen commented 2 months ago

Can an alpha build be found somewhere?

Jupiter-Liar commented 2 months ago

https://github.com/Jupiter-Liar/Redirector---V3-Port

Before you give it a try, export all your rules to make sure they're safe. Just in case.

If you want to install it alongside the regular Redirector, you can remove the key section of the manifest file. Otherwise, it takes the place of the official release. In fact, I just removed the key from the manifest file just to make it easier to install alongside the regular version.

tathastu871 commented 1 month ago

If anyone is porting it, please add full JavaScript regex support, most importantly named capturing groups.

For moderately complex URLs we need to extract capture groups and join them, and when the order of the capture groups is not consistent, numbered captures like $1 and $2 are useless.

Hence Named capturing groups are needed

Gitoffthelawn commented 1 month ago

...Hence Named capturing groups are needed...

Can you please provide an example where named capture groups are needed where numerical capture groups will not meet the need?

tathastu871 commented 1 month ago

Yes. Say I have a URL and I need to strip all query parameters.

I can do (http.*)\?.* --> $1

But suppose I want to retain two or three specific parameters. We can do

(http.*)\?someParamToIgnore.*([?&]Param1=[^&]*).*([?&]Param2=[^&]*).*
--> $1$2$3

But in some URLs the order of Param1 and Param2 is not the same: Param2 may come before Param1, and sometimes Param2 is present but Param1 is absent.

To make a generalised regex I tried

(http.*)\?someParamToIgnore.*([&?](?:Param1|Param2)=[^&]*)

Here we need to preserve two params, so repeat the group:

([&?](?:Param1|Param2)=[^&]*){2}

Here {2} doesn't work with Redirector,

so I tried

(http.*)\?someParamToIgnore.*([&?](?:Param1|Param2)=[^&]*).*([&?](?:Param1|Param2)=[^&]*).* --> $1$2$3

Problem 1) (this requires named capturing) It doesn't respect the order. The final URL we want requires a strict order, e.g. ?Param1 &Param2

But if the URL has Param2 first and then Param1, it will give

http.*?Param2=some&Param1=some

But we want strict order

?Param1=some&Param2=some

in the final URL.

Problem 2) (this requires named capturing)

When Param1 is not present and Param2 is present,

the above solution fails.

Trying something like making the repeating group optional

([&?](?:Param1|Param2)=[^&]*)?.*([&?](?:Param1|Param2)=[^&]*)?

doesn't work with Redirector.

The solution is to extract the groups and then .join() them, but that is not possible with Redirector.

We need a one-liner regex search and replace.

So something like named capturing might work. However, named capturing still doesn't solve the problem of strict order.

But it may solve the problem of optional groups.
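The "extract, then join" approach described above is easy to express in ordinary JavaScript, which is what a function-based rule would allow. The sketch below uses the standard `URL` and `URLSearchParams` classes; the function name and example URLs are made up for illustration, and nothing like this exists in Redirector's rule format.

```javascript
// Keep only the listed query parameters, emitted in the listed order,
// regardless of where (or whether) each one appears in the original URL.
function keepParams(url, wanted) {
  const parsed = new URL(url);
  const kept = new URLSearchParams();
  for (const name of wanted) {
    for (const value of parsed.searchParams.getAll(name)) {
      kept.append(name, value);
    }
  }
  parsed.search = kept.toString();
  return parsed.toString();
}

console.log(keepParams(
  "https://example.com/page?Param2=b&junk=1&Param1=a", ["Param1", "Param2"]));
// https://example.com/page?Param1=a&Param2=b

console.log(keepParams(
  "https://example.com/page?junk=1&Param2=b", ["Param1", "Param2"]));
// https://example.com/page?Param2=b
```

Both problems from the example (strict output order and missing parameters) are handled because the output is driven by the wanted list rather than by match positions.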

Gitoffthelawn commented 1 month ago

Yes...

Thank you for the detailed example. I'll need to think thoroughly about it.

In the meantime, here are 2 tips that may help you:

  1. Don't be afraid to use 2 (or more) Redirector rules to match the same host, but with different parameters. It will work fine, and can aid in rule readability. I tend to prefer maxing out the functionality of regex by putting all matches for a host in a single regex, but sometimes more than 1 regex for a single host is a more effective choice.
  2. Exclusions are your friends! Often Redirector's exclusion functionality can really save the day when creating complex rules. When using 2 (or more) Redirector rules to match the same host, exclusions get even more powerful.

tathastu871 commented 1 month ago

Hey, how does the Redirector quantifier {} work?

Because (pattern){1,2} doesn't work. (pattern){2} works, but it only gives the 2nd occurrence. I want the 1st, 2nd and 3rd occurrences of the pattern, but that isn't working in Redirector.

In devtools, url.match(regex) gives me all occurrences when the regex is global (/regex/g).

Without a quantifier I have to match (regex).+?(regex).+?(regex). This matches 3 occurrences. Now if the URL has only 2 repetitions, then it fails.

Making individual regexes for separate URLs requires multiple combinations. Just consider how many combinations there are of [123]: around 39 combinations means 39 rules for one simple URL.

So far, for stripping all query parameters from a URL while retaining certain ones, I came up with

((?:[?&](?:Param1|Param2|Param3)=[^&]*))

--> This will retain the desired parameters in their original order. You can't reorder them as you wish unless you use named capturing, because we don't know which $index will match which param in the URL.

With the suffix {1} it gives me the first occurrence at $1; {2} gives the second occurrence.

In devtools, match = url.match(regex); match[0] + match[1] + match[2]

produces the joined desired params.

If Redirector supported

((?:[?&](?:Param1|Param2|Param3)=[^&]*)){1,3}

then $1 --> would give ?Param1=some&Param2=some&Param3=some
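The behaviour described here is standard JavaScript regex semantics rather than something specific to Redirector: a capture group with a quantifier keeps only its last repetition, so `$1` can never hold all of them. Collecting every occurrence takes a global match, as in this sketch (the example URL is made up):

```javascript
const url = "https://example.com/page?Param1=a&Param2=b&junk=1&Param3=c";

// A quantified capture group remembers only its final repetition.
const repeated = /((?:[?&](?:Param1|Param2|Param3)=[^&]*)){1,3}/.exec(url);
console.log(repeated[0]); // ?Param1=a&Param2=b   (the whole match)
console.log(repeated[1]); // &Param2=b            (last repetition only)

// matchAll with the g flag visits every occurrence, even non-adjacent ones.
const all = [...url.matchAll(/[?&]((?:Param1|Param2|Param3)=[^&]*)/g)]
  .map((m) => m[1]);
const rebuilt = url.split("?")[0] + "?" + all.join("&");
console.log(rebuilt);
// https://example.com/page?Param1=a&Param2=b&Param3=c
```

The quantified version also stops at `junk=1` because repetitions must be adjacent, which is why it sees only two of the three parameters.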

tathastu871 commented 1 month ago

https://groups.google.com/a/chromium.org/g/chromium-extensions/c/4971ZS9cI7E

Will this help with the exclusion-related problem?

Jupiter-Liar commented 1 month ago

That seems to be limited to domains. Exclusions are a broader thing than that.

The solution I worked out, which I think is the best compromise... well let me explain:

  1. Rules with no exceptions get the highest priorities.
  2. Rules with exceptions come next, divided in the following way:
     a. The exception to a specific rule.
     b. The specific rule itself.

So an order might go something like:

  Rule 1 (no exception)
  Rule 2 (no exception)
  Exception to Rule 3
  Basic version of Rule 3
  Exception to Rule 4
  Basic version of Rule 4

The problem is that exceptions, technically, become their own rules now, and any rule can overrule any other rule that has a lower priority. This means that in the above example, the exception from Rule 3 could potentially overrule Rule 4, if they were in conflict.

And the solution I chose was to put rules with exceptions lower on the priority list, so they could only interfere with one another. This cuts down on the number of potential conflicts.
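The conflict can be reproduced with a deliberately simplified model in which the matching entry with the highest priority wins. Real declarativeNetRequest has further tie-breaking rules that this sketch ignores, and the patterns and hosts are invented for illustration.

```javascript
// Simplified model: among the entries whose pattern matches, pick the one
// with the highest priority. An "allow" entry stands in for an exception.
function resolve(url, entries) {
  const matches = entries.filter((e) => e.pattern.test(url));
  if (matches.length === 0) return null;
  return matches.reduce((best, e) => (e.priority > best.priority ? e : best));
}

const entries = [
  { name: "Rule 1", priority: 6, action: "redirect", pattern: /^https:\/\/a\.example\.com\// },
  { name: "Rule 2", priority: 5, action: "redirect", pattern: /^https:\/\/b\.example\.com\// },
  { name: "Exception to Rule 3", priority: 4, action: "allow", pattern: /\/keep\// },
  { name: "Rule 3", priority: 3, action: "redirect", pattern: /^https:\/\/c\.example\.com\// },
  { name: "Exception to Rule 4", priority: 2, action: "allow", pattern: /\/private\// },
  { name: "Rule 4", priority: 1, action: "redirect", pattern: /^https:\/\/d\.example\.com\// },
];

// Intended effect: the exception shields Rule 3's host.
console.log(resolve("https://c.example.com/keep/x", entries).name);
// Exception to Rule 3

// Side effect: the same exception also wins for a URL that belongs to Rule 4.
console.log(resolve("https://d.example.com/keep/x", entries).name);
// Exception to Rule 3

// Rules without exceptions sit above it and are unaffected.
console.log(resolve("https://a.example.com/keep/x", entries).name);
// Rule 1
```

In this model, writing the exception pattern narrowly (for instance, including the host it belongs to) removes the side effect, which is the reason for asking users to define exceptions as tightly as possible.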

As for regex, we're now in a declarativeNetRequest world. The old version of Redirector depended on webRequest blocking, and webRequest blocking has been axed for Manifest V3. That means we've gone from full, rich regex to RE2. We're limited to what we can do with RE2, which is quite a bit less than we could do before.

I'm considering some options for alternate modes. One mode I'm considering could get URL encoding and decoding working again; they don't work currently. Once I have this Mode 2 in place, it could even be used for @MrYuto's improvement, which was a good one. And for advanced regex, if it's absolutely needed, a Mode 3 might be incorporated, which would use webNavigation as a gateway; I'm still thinking it over. But this would only work for main frames and subframes. That's my current understanding.

If I'm understanding your predicament, you're talking about matching a URL that had three very specific parameters, and maybe it would only have some of them, and they could be in ANY order... some kind of advanced Mode 3 could potentially match what you're talking about, at least for main frames and subframes. You're talking about a situation in which ALL three parameters are present, right? You would use positive lookahead, a thing which RE2 does not support, but more advanced regex could. To require all three, you'd use one lookahead per parameter, something like: (?=.*Param1=([^&]*))(?=.*Param2=([^&]*))(?=.*Param3=([^&]*)) I think I have that right.
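To require every parameter regardless of order, each one gets its own lookahead (a single lookahead around an alternation would be satisfied by any one of them). A minimal sketch in JavaScript regex, which supports lookahead where RE2 does not; the example URLs are made up:

```javascript
// One lookahead per parameter: each must be present, in any order.
const allThree =
  /^(?=.*[?&]Param1=([^&]*))(?=.*[?&]Param2=([^&]*))(?=.*[?&]Param3=([^&]*))/;

const m = allThree.exec(
  "https://example.com/page?Param3=c&x=1&Param1=a&Param2=b");

// Groups are numbered by their position in the pattern, not in the URL,
// so m[1] is always Param1's value, m[2] Param2's, and m[3] Param3's.
console.log(m.slice(1).join(","));
// a,b,c

// If any one parameter is missing, the pattern does not match at all.
console.log(allThree.test("https://example.com/page?Param1=a&Param2=b"));
// false
```

Because the group numbers follow the pattern, a substitution such as `?Param1=$1&Param2=$2&Param3=$3` would come out in a fixed order without needing named groups.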

As for how many possible combinations of those parameters there are, it certainly isn't 39. If we're considering situations where any number of the parameters appear, the answer is 15. That's 3! (which is 6) for the orderings of all three, then 6 more for the pairs (3 possible pairs, each in 2 orders), and 3 more for the rules that only have one parameter. That's still a whole lot, I agree. But at least it isn't 39.
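The count is easy to confirm by brute force. This sketch enumerates every ordered arrangement of every non-empty subset of the three parameters; the function name is made up for illustration.

```javascript
// Build every ordered arrangement of every non-empty subset of `items`.
function arrangements(items) {
  const out = [];
  const extend = (prefix, rest) => {
    if (prefix.length > 0) out.push(prefix);
    rest.forEach((item, i) =>
      extend([...prefix, item], [...rest.slice(0, i), ...rest.slice(i + 1)]));
  };
  extend([], items);
  return out;
}

const all = arrangements(["Param1", "Param2", "Param3"]);
console.log(all.length); // 15
console.log(all.filter((a) => a.length === 1).length); // 3
console.log(all.filter((a) => a.length === 2).length); // 6
console.log(all.filter((a) => a.length === 3).length); // 6
```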

tathastu871 commented 1 month ago

The Requesty extension does redirection and uses V3. I don't know how much it relies on DNR; it uses a combination of DNR, navigation, and a content script. Maybe take a look at its code.

tathastu871 commented 1 month ago

Feature request, for the current V2 version if anyone is still going to maintain it:

1) Allow users to use custom modifier flags: g, m, i, u, s, y, in any combination.

2) Support both indexed and named groups.

3) In popup.html, when the include pattern matches, provide a snippet of the matched groups, both indexed and named. It eases the creation of the redirect URL by letting you quickly look up matched references.

4) For complex cases where a simple one-liner regex won't work, like editing, joining, replacing, or reordering parts of the captured groups themselves:

Allow users to provide a function that takes the match/test/exec result as an argument. Users can add their own code to modify the match as needed and return the redirectUrl.

I have seen many userscripts in the past that, along with static one-line regex rules, allow dynamic creation of URLs using a function.

5) Use url.matchAll(regex) instead of regex.exec/test, as exec only matches the 1st occurrence, and exec in a while loop may go into an infinite loop.
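A function-based rule with named groups, as requested in the list above, could look something like this sketch. The `buildUrl` field and `applyRule` helper are hypothetical; Redirector has no such feature, and the pattern and hosts are invented.

```javascript
// Hypothetical rule shape: a regex plus a user-supplied function that
// receives the match result and returns the redirect URL.
function applyRule(url, rule) {
  const match = rule.pattern.exec(url);
  return match ? rule.buildUrl(match) : null;
}

const rule = {
  // Named groups make the builder independent of group numbering.
  pattern: /^https:\/\/(?<host>[^/]+)\/watch\?v=(?<id>[^&]+)/,
  buildUrl: (m) => `https://${m.groups.host}/embed/${m.groups.id}`,
};

console.log(applyRule("https://video.example.com/watch?v=abc123&t=5", rule));
// https://video.example.com/embed/abc123

console.log(applyRule("https://video.example.com/about", rule));
// null
```

Running user-supplied code inside an extension has security and store-policy implications that this sketch does not address.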

tathastu871 commented 1 month ago

Sometimes we want to redirect but also keep the original site open.

So, if possible, please add a setting that opens the redirect in a new tab.

gregsdennis commented 2 weeks ago

I just noticed that the redirects stopped working for me within the past week. The rabbit hole led me here.

I'd like to express my appreciation to the community (and @Jupiter-Liar in particular) for making an effort to continue development of this project. I eagerly await a new version.

Gitoffthelawn commented 2 weeks ago

@gregsdennis Can you create a new issue regarding redirects no longer working for you? Please include the browser name, its version, the OS name, and its version.

I haven't seen lots of other reports of Redirector suddenly not working, so I'm thinking it's likely the browser's storage for Redirector got corrupted in your case. If you uninstall Redirector, re-install it, and then import your redirect rules (you hopefully made a backup!), I'm hopeful it will work for you. If you didn't already make a backup, make one now, and hopefully it will contain all your Redirector rules (good advice for everyone, BTW!).

Gitoffthelawn commented 2 weeks ago

@gzur I've been trying to reach you via our conversation on https://github.com/Gitoffthelawn/Contact-Gitoffthelawn/issues/8, but if you're not receiving notifications for that thread, perhaps you will receive a notification for this one. Thanks!

gregsdennis commented 2 weeks ago

@Gitoffthelawn the uninstall/reinstall worked for me. Thanks for the hint. For completeness, I just use it to remove the google text highlighting. Redirector.json

Gitoffthelawn commented 2 weeks ago

@gregsdennis You're welcome. Happy to hear it worked for you! BTW, in which browser did you perform the reinstall of Redirector?

gregsdennis commented 2 weeks ago

I was in Chrome. I had gone to the extension website and Chrome has a compatibility alert on it so I thought maybe it finally just stopped working. That led me here.

Gitoffthelawn commented 2 weeks ago

@gregsdennis Thanks Greg. I'll make a mental note that Chrome can have storage instability.