stoically / temporary-containers

Firefox Add-on that lets you open automatically managed disposable containers
https://addons.mozilla.org/firefox/addon/temporary-containers/
MIT License

Per-domain isolation excludes do not take mouse clicks into consideration #606

Open Binarus opened 1 year ago

Binarus commented 1 year ago

Dear all,

FF 105.0.3 TC 1.9.2

Problem: At the global level, I have all sites isolated. This is sufficient for 99% of the sites. However, I am running into problems with the German ebay site (ebay.de). Many of the actions we can do there make TC open a new page in a different container, where we have to log in again. Therefore, I have tried to configure per-domain isolation for that site, using regular expressions.

Basically, if the tab domain is any of the ebay domains or subdomains, I'd like to exclude any of the ebay domains or subdomains from mutual isolation. For example, if I am on a.ebay.de, I'd like to have b.ebay.com opened in the same container. I have tried to achieve that with regular expressions. I have created a per-domain isolation with the following values:

Domain pattern: /^([^.]+\.)*ebay\.(de|com)\/.*$/
Always open in: Disabled
Navigation: Use global
Mouse click: Use global (for all three options)
Exclude target domains: /^([^.]+\.)*ebay\.(de|com)\/.*$/

So the domain pattern and exclusion pattern are the same.

This does not work as expected. For example, if I am on www.ebay.de and want to read my messages via the "My ebay" menu, it tries to switch to the domain mesg.ebay.de, which is then opened in a new temporary container (where I have to log in again) because TC blocks the request. I have verified that the latter happens by looking at the network traffic in the developer tools. Firefox shows the GET request for mesg.ebay.de there and states clearly: "Blocked By Temporary Containers".

Could somebody please explain what I am doing wrong?

Thank you very much in advance,

Binarus

stoically commented 1 year ago

Your approach generally works, but there are two issues I see with your regex:

  1. URLs don't always end with a slash, e.g. if you just open www.ebay.de in a tab, it won't have a trailing slash
  2. /^([^.]+\.)* makes it so that if there's no subdomain involved, it tries to match /^ebay..., which won't work because it matches against the full URL including scheme (https), so the regex would need to include that.

I'd suggest /^https?://(.+\.)?ebay\.(de|com)(/.*)?$/
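The two points above can be checked with a short script. This is only a sketch: it assumes the extension tests each pattern against the full URL string exactly as written here, and the sample URLs are illustrative. The patterns are copied from the thread without their surrounding slashes.

```javascript
// Patterns from the thread, built from raw strings so the text stays as posted.
// Assumption: the pattern is matched against the complete URL string.
const original = new RegExp(String.raw`^([^.]+\.)*ebay\.(de|com)\/.*$`);
const suggested = new RegExp(String.raw`^https?://(.+\.)?ebay\.(de|com)(/.*)?$`);

// Illustrative URLs: with a path, without a trailing slash, without a subdomain.
const urls = [
  "https://www.ebay.de/mye/myebay/summary",
  "https://www.ebay.de",
  "https://ebay.de/",
];

for (const url of urls) {
  console.log(url, "original:", original.test(url), "suggested:", suggested.test(url));
}
```

With these inputs the original pattern matches only the first URL, while the suggested pattern matches all three.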

Binarus commented 1 year ago

First of all, thank you very much for your answers. I'll comment on them below. But before that, I'd like to state that I have some experience with regular expressions in general, but I do not know how JavaScript REs differ from other REs (e.g. egrep, awk, Perl). Please correct my following statements if they are wrong.

So you'd have to include the scheme (e.g. https) in your regex as well, then it should work.

I already had read in the instructions that the regex match is against the full URL :-)

2. /^([^.]+\.)* makes it so that if there's no subdomain involved, it tries to match /^ebay..., which won't work because it matches against the full URL including scheme (https), so the regex would need to include that.

My original pattern will correctly slurp the leading https:// part because there is no . (dot) in it. You are right that my original pattern is not prepared to handle URLs without a subdomain. But I didn't care about that yet because it was mesg.ebay.de that got blocked, so that couldn't be the problem.

  1. URLs don't always end with a slash, e.g. if you just open www.ebay.de in a tab, it won't have a trailing slash

Arrghh. I see. Thank you very much for this hint. While I am somewhat experienced with REs, I have missed that gotcha with domain-only URLs. But anyway this couldn't be the problem because of the tab URL and the target URL in the situation in question (see below).

To address both problems and make it a bit safer, I have changed the pattern to /^https?://([^./]+\.)*ebay\.(de|com)(/.*)*$/. That didn't work either.

Then I have literally used the pattern you suggested, and this one also did not work. With my new pattern as well as with yours, it was still mesg.ebay.de which got blocked.

In every test, the tab URL was https://www.ebay.de/mye/myebay/summary, and the target URL (which got blocked) was https://mesg.ebay.de/mesgweb/ViewMessages/0.

Next, I went to regex101, chose "ECMAScript / Javascript" and tested your pattern as well as my new pattern with the above URL. Both patterns were matching that URL in this test.

I am quite sure now that something more basic does not work as expected. Do we need to restart TC somehow to make it apply the new per-domain setting (I always just hit the blue "SAVE" button and then immediately tried the ebay site without further actions)? What else could go wrong? Can we make TC write a log file which I could publish here?
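The regex101 check described above can be repeated locally. A minimal sketch, assuming the patterns are applied to the full URL string; the two URLs are the tab and target URLs quoted in this comment.

```javascript
// The improved pattern from this comment and the pattern suggested earlier.
const improved = new RegExp(String.raw`^https?://([^./]+\.)*ebay\.(de|com)(/.*)*$`);
const suggested = new RegExp(String.raw`^https?://(.+\.)?ebay\.(de|com)(/.*)?$`);

const tabUrl = "https://www.ebay.de/mye/myebay/summary";
const targetUrl = "https://mesg.ebay.de/mesgweb/ViewMessages/0";

// Both patterns should accept both URLs if the regex is not the culprit.
for (const [name, pattern] of [["improved", improved], ["suggested", suggested]]) {
  console.log(name, "tab:", pattern.test(tabUrl), "target:", pattern.test(targetUrl));
}
```

Both patterns match both URLs here, which is consistent with the conclusion that the cause lies somewhere other than the regular expression.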

Binarus commented 1 year ago

I just conducted another test:

While the current URL (tab URL) was https://www.ebay.de/mye/myebay/summary, I typed https://mesg.ebay.de/mesgweb/ViewMessages/0 into the address bar and hit ENTER. That also didn't work: The new URL I had typed got opened in a new container.

stoically commented 1 year ago

But anyway this couldn't be the problem because of the tab URL and the target URL in the situation in question (see below).

Right, in that case your RE is sufficient.

Can we make TC write a log file which I could publish here?

Yep, as described in the issue template it's possible to get a debug log.

I just conducted another test

I did the same test and for me it does not isolate the request to mesg.ebay.de. Could you attach your TC preferences?

Binarus commented 1 year ago

I have made progress.

I have created a new per-domain isolation as described in my first post, but this time, I just used /.*/ as the domain pattern and the exclusion pattern. Even that didn't work. Again, when being on https://www.ebay.de/mye/myebay/summary and entering https://mesg.ebay.de/mesgweb/ViewMessages/0 in the address bar, the new URL was blocked and was opened in a new container.

That made me take a deep breath and review my global isolation settings. I had "Different from tab domain" everywhere. For a test, I switched that to "Different from tab domain and subdomains", and finally the problem with the ebay site was solved!

Now I would like to understand why my per-domain isolation rule did not override that setting. Could you please briefly explain?

In the meantime, I'll try to find out how to export the preferences and to create the debug log.

Thanks again!

Binarus commented 1 year ago

OK, here are the settings. Exporting them was a no-brainer :-) I had to change the extension from .json to .txt because otherwise the forum software doesn't allow attaching .json files.

temporary_containers_preferences_2022-10-19_23.52.21.txt

Binarus commented 1 year ago

And here is the debug log. Thanks for the detailed instructions on the page you linked! It cost me barely a minute to create the log.

console-export-2022-10-19_23-59-8.txt

Update - additional note: While creating the debug log, I still had the per-domain isolation with /.*/ as domain pattern and exclusion pattern enabled.

stoically commented 1 year ago

Thanks for the preferences and debug log. So, you indeed hit at least one bug there: per-domain isolation excludes don't take mouse clicks into consideration. So if any of the mouse click preferences match, then isolation happens despite excludes being configured. In your case it's safe to leave all mouse click configurations on Never and have just the navigation isolation on Different from tab domain, since navigation isolation also covers mouse clicks (those are navigations too).
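The described behavior can be pictured with a toy model. This is a hypothetical simplification for illustration only, not Temporary Containers' actual code; the function and parameter names are invented, and it does not attempt to explain the address-bar case.

```javascript
// Hypothetical toy model, NOT the extension's real implementation.
// Each function returns true when the request would be isolated.

// Behavior as reported: a matching mouse click rule isolates even if the
// target is covered by a per-domain exclude.
function isolatesAsReported({ clickRuleMatches, navigationRuleMatches, targetExcluded }) {
  if (clickRuleMatches) return true;
  return navigationRuleMatches && !targetExcluded;
}

// Behavior one would expect: an excluded target is never isolated.
function isolatesAsExpected({ clickRuleMatches, navigationRuleMatches, targetExcluded }) {
  if (targetExcluded) return false;
  return clickRuleMatches || navigationRuleMatches;
}

// Clicking a link to an excluded target while a mouse click rule matches.
const click = { clickRuleMatches: true, navigationRuleMatches: true, targetExcluded: true };
console.log(isolatesAsReported(click), isolatesAsExpected(click));

// Same click with all mouse click settings on "Never" (the suggested workaround).
const workaround = { ...click, clickRuleMatches: false };
console.log(isolatesAsReported(workaround), isolatesAsExpected(workaround));
```

In this model the first case is isolated under the reported behavior but not under the expected one, and the workaround makes both agree.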

I couldn't however reproduce the issue with manually entering the domain yet. If the issue persists after changing the mouse click isolation settings, then I'd need another debug log.

For a test, I switched that to "Different from tab domain and subdomains", and finally the problem with the ebay site was solved!

Yeah, with that setting configured you can freely navigate on subdomains of the same domain.
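A rough sketch of the difference between the two global settings, under loudly stated assumptions: the helper names are invented, and the "last two host labels" rule is a naive stand-in for a proper registrable-domain lookup (it is wrong for suffixes such as co.uk). It is not taken from the extension's source.

```javascript
// Naive assumption: the base domain is the last two labels of the hostname.
// Real code would need the Public Suffix List for this.
function baseDomain(url) {
  return new URL(url).hostname.split(".").slice(-2).join(".");
}

// Rough reading of "Different from tab domain": any hostname change counts.
function differsFromTabDomain(tabUrl, targetUrl) {
  return new URL(tabUrl).hostname !== new URL(targetUrl).hostname;
}

// Rough reading of "Different from tab domain and subdomains":
// only a change of the base domain counts.
function differsFromTabDomainAndSubdomains(tabUrl, targetUrl) {
  return baseDomain(tabUrl) !== baseDomain(targetUrl);
}

const tab = "https://www.ebay.de/mye/myebay/summary";
const target = "https://mesg.ebay.de/mesgweb/ViewMessages/0";
console.log(differsFromTabDomain(tab, target));              // hostnames differ
console.log(differsFromTabDomainAndSubdomains(tab, target)); // same base domain
```

Under these assumptions, a move from ebay.de to ebay.com would still count as different with either setting, so a per-domain exclude would remain useful for that case.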

stoically commented 1 year ago

Global Different from Tab Domain & Subdomains navigation isolation and no mouse click isolation is what I personally use as well.

Configuring mouse clicks additionally can be helpful if a different behavior explicitly for mouse clicks is desired.

Binarus commented 1 year ago

Thank you very much!

I confirm that changing the mouse click settings to "never" makes the per-domain exclusions work as expected; I have just tested it, again using the ebay site as test object.

I left the navigation setting at "Different from tab domain" and changed the mouse settings according to your advice. As you have explained, that still didn't keep TC from working as expected in general even when navigating with the mouse.

Then I deleted the per-domain isolation and went back to ebay. As expected, it now opened each page with a different subdomain in a new container. Then I added the per-domain isolation again, using my second rule from above. Afterwards, I could use ebay to my heart's content without TC opening new containers. Entering a new URL with a different subdomain also didn't make TC open a new container any more.

Since I don't need a different behavior for normal navigation and mouse clicks, the problem is solved for me. However, I am looking forward to the fix of course :-) I don't have much experience with that, hence the dumb question: Since you have assigned the "bug" label, it makes sense to leave this issue open, correct?

A last remark about the regex you suggested:

/^https?://(.+\.)?ebay\.(de|com)(/.*)?$/ probably would match e.g. https://my.example.com/a.ebay.de and the like, which probably is not what is intended :-) I admit that this is paranoid, though.

Finally, a big, big thanks for TC! I am considering it the most important FF extension I have ever used.

Best regards,

Binarus

stoically commented 1 year ago

Glad it works for you.

I don't have much experience with that, hence the dumb question: Since you have assigned the "bug" label, it makes sense to leave this issue open, correct?

Yeah, makes sense in terms of knowing which unresolved bug issues exist.

/^https?://(.+\.)?ebay\.(de|com)(/.*)?$/ probably would match e.g. https://my.example.com/a.ebay.de and the like, which probably is not what is intended :-) I admit that this is paranoid, though.

Dots in URL paths are totally valid, e.g. think about downloading files. Also I don't see a difference between my (/.*)?$ and your /.*$ in terms of matching URL paths.

Finally, a big, big thanks for TC! I am considering it the most important FF extension I have ever used.

Thanks!

Binarus commented 1 year ago

/^https?://(.+\.)?ebay\.(de|com)(/.*)?$/ probably would match e.g. https://my.example.com/a.ebay.de and the like, which probably is not what is intended :-) I admit that this is paranoid, though.

Dots in URL paths are totally valid, e.g. think about downloading files. Also I don't see a difference between my (/.*)?$ and your /.*$ in terms of matching URL paths.

Dots in URL paths are valid; that's clear. However, we're having a problem here. We want to exclude ebay (and only ebay) from isolation. If we used that regex, we would also exclude every URL that has something like .ebay.de in the file path, even if the domain part is some evil domain which we really didn't intend to exclude.

Regarding your second remark: The key point is not the difference at the end of the regex. It is the difference at the beginning. /^https?://(.+\.)?ebay\.(de|com)(/.*)?$/ matches every URL which has .ebay.de somewhere in its path (including the file part), because the (.+\.)? slurps every arbitrary domain name + path combination as long as it ends with a dot; that is, e.g. myevildomain.evil/superevil/. matches that part.

/^https?://([^./]+\.)*ebay\.(de|com)(/.*)*$/ doesn't have that problem because no slash (/) is allowed before ebay.(de|com) (please note the [^./]+ instead of the .+); that means that it forces ebay.(de|com) to be the last part of the domain name before the first slash which separates the domain from the file path.

As for the trailing part, you are right. In this context, (/.*)* (from my second, improved RE) should behave exactly as (/.*)?. But the /.*$ you refer to was from the RE in my first post and was wrong anyway because URLs may be domain-only without a trailing slash; you have brought this to my attention in your first post.
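The difference between the two patterns can be demonstrated directly. A small sketch, assuming full-URL matching; the "evil" URLs are made-up examples in the spirit of the ones mentioned in this thread.

```javascript
// The loose pattern allows any characters (including slashes) before "ebay";
// the strict one only allows dot-separated host labels.
const loose = new RegExp(String.raw`^https?://(.+\.)?ebay\.(de|com)(/.*)?$`);
const strict = new RegExp(String.raw`^https?://([^./]+\.)*ebay\.(de|com)(/.*)*$`);

const samples = [
  "https://www.ebay.de/mye/myebay/summary",       // genuine ebay URL
  "https://ebay.com",                             // no subdomain, no trailing slash
  "https://my.example.com/a.ebay.de",             // ebay name only in the path
  "https://myevildomain.evil/superevil/.ebay.de", // ebay name only in the path
];

for (const url of samples) {
  console.log(url, "loose:", loose.test(url), "strict:", strict.test(url));
}
```

The loose pattern accepts all four samples, while the strict pattern accepts only the first two, because [^./]+ cannot cross the slash that ends the host name.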

Best regards, thanks a lot, and have a nice Sunday!

Binarus

stoically commented 1 year ago

Oh, I see, you're right. Thanks for pointing that out – totally missed it. Nice Sunday for you too.