Automatic proxy/tor/noscript/adblocker black- and whitelisting

Ambrevar commented 5 years ago

It would be nice if, when we disable/enable a mode for a URL, Next would remember it next time it loads that URL.

For instance, say you have noscript-mode enabled by default but you disable it for example.org. In a new buffer or new Next instance, it'd be nice to have Next automatically disable noscript-mode when loading example.org again.

(Note: to make it explicit, we need a separate status buffer / bar to show the list of active modes.)

There are a couple of ways to do this:

With a set-url-hook handler, we can set the modes for the current buffer loading a given URL. The modes are looked up in a (URL . modes) hash-table (which can be persisted to disk). This hash-table can be filled from the enable/disable hook of each mode. Doing this in hooks gives the user the flexibility to choose which mode status matters for the hash table (e.g. vi-insert-mode should probably not end up there).
Add "toggle-noscript" commands that black/whitelist the URL, toggle Javascript on the platform port but leave the mode active in the buffer. The mode would add a handler to set-url-hook. When disabled, the mode removes the hook. This way the user on has to add the mode to the list of default modes and then "teach" Next which website to black/whitelist with "toggle-noscript". This logic should be reflected on all modes then.

Thoughts?

Ambrevar commented 5 years ago

What about unnamed modes, for instance (proxy-mode :proxy (make-instance ...))? This can only be evaluate, it cannot be stored as a symbol in the buffer "default-modes" for instance.

Technically we could serialize the anonymous mode and restore it with cl-prevalence, but is it worth it?

vindarel commented 5 years ago

For instance, say you have noscript-mode enabled by default but you disable it for example.org. In a new buffer or new Next instance, it'd be nice to have Next automatically disable noscript-mode when loading example.org again.

No, this is dangerous for the user. Specially if you replace "noscript" by "tor/proxy".

Add "toggle-noscript" commands that black/whitelist the URL

toggle-noscript should toggle noscript temporarily and no more, we'd have another noscript-whitelist command that explicitely saves the state. Again, specially important with proxy mode.

The ability to call whitelist after toggle will be doable with a user hook, but it is the user's responsibility to add it. I don't expect it to be the default.

aartaka commented 4 years ago

I think this can be implemented with url-dispatchers once it finds its way into upstream. We can append new url-dispathing-handlers to url-dispatchers after asking the user whether they want to permanently enable the mode for this domain/host.

The only problem I see with this is that these url-dispatchers are black boxes -- it's hard to get their activation condition from them, and we can't check whether the handler for this domain and mode is already there.

Ambrevar commented 4 years ago

@vindarel highlights the importance of being explicit, lest the user privacy or security be compromised.

I suggest the following then: by default, before loading a page that has some automatic mode setup, Nyxt would prompt the user with the list of modes to enable for this buffer. This way the user controls at all times which modes are enabled.

We could add a per-match option to skip the prompt (if the user is confident that a given match is safe).

Ambrevar commented 4 years ago

@aartaka This needs to leverage the request-resource hook indeed, one way or another.

We need to focus on 2 aspects:

Storage: I suggest we have a new file, say auto-mode.lisp (to mirror Emacs' auto-mode-alist) that maps URLs to modes to include and modes to exclude. Exclusion happens first: this way it's possible to set the exact list of modes by first excluding all modes, then including only the desired modes.
User interface: I guess this can be implemented as an auto-mode-mode (poor name :p). This mode would hook onto mode deactivations and activations: when a mode is deactivated, it's added to the exclusion list, when it's activated, it's added to the inclusion list. The mode would also hook onto resource-query: Before a page load, we search the auto-mode.lisp file and on match, we query the user for the exact list of modes to enable for this page. On mismatch, we use the default list of modes without prompting.

Thoughts?

aartaka commented 4 years ago

I suggest the following then: by default, before loading a page that has some automatic mode setup, Nyxt would prompt the user with the list of modes to enable for this buffer. This way the user controls at all times which modes are enabled.

What do you mean by "automatic" here? Is it this auto-mode-enabled setup or the default setup from init.lisp?

In any case, prompting the user in the moment of mode activation/deactivation sounds better to me than promping them on every page load. Their default-modes list is set up in a way they want it to be on any resource load, except for the auto-mode enabled ones, so there's no need in asking whether they want to use what they've chosen themselves.

Asking right after mode toggling is a different case -- it's a conscious decision to alter the configuration. And the moment of mode switching is the moment to prompt the user about the persistence of change.

Ambrevar commented 4 years ago

What do you mean by "automatic" here? Is it this auto-mode-enabled setup or the default setup from init.lisp?

I meant when auto-mode-mode matches the loading page, it would set the appropriate modes.

In any case, prompting the user in the moment of mode activation/deactivation sounds better to me than promping them on every page load. Their default-modes list is set up in a way they want it to be on any resource load, except for the auto-mode enabled ones, so there's no need in asking whether they want to use what they've chosen themselves.

This is a good point.

So let's not hook onto mode (de)activation to remember the modes of a particular page. Instead, let's add a command for which the user can save the mode list for the current page. This command would show the list of modes in the minibuffer, and the users can add/remove those they want.

On page load, if there is a match, we would not prompt the user since the

We can also add a command to customize the current auto-mode-list, which makes the behaviour completely transparent to the user from the Nyxt interface.

Asking right after mode toggling is a different case -- it's a conscious decision to alter the configuration. And the moment of mode switching is the moment to prompt the user about the persistence of change.

This could also be an option. Could be annoying to some users though, not sure it would be very practical.

aartaka commented 4 years ago

This is a good point. So let's not hook onto mode (de)activation to remember the modes of a particular page. Instead, let's add a command for which the user can save the mode list for the current page. This command would show the list of modes in the minibuffer, and the users can add/remove those they want. On page load, if there is a match, we would not prompt the user since the We can also add a command to customize the current auto-mode-list, which makes the behaviour completely transparent to the user from the Nyxt interface.

This could also be an option. Could be annoying to some users though, not sure it would be very practical.

It seems we're different kinds of users in this regard: I prefer to have a detailed (and possibly annoying) dialogues with my machine, while you prefer to save things once. And this is a cool split of behaviors, because users are possibly split in the same way.

Maybe having some switch between behaviors can help there? I'm thinking of adding *prompt-auto-mode-on-mode-change* (not a good name, I'm open to discussion about it) special variable there.

Ambrevar commented 4 years ago

Artyom Bologov notifications@github.com writes:

Maybe having some switch between behaviors can help there? I'm thinking of adding *prompt-auto-mode-on-mode-change* (not a good name, I'm open to discussion about it) special variable there.

Absolutely.

aartaka commented 4 years ago

Storage: I suggest we have a new file, say auto-mode.lisp (to mirror Emacs' auto-mode-alist) that maps URLs to modes to include and modes to exclude. Exclusion happens first: this way it's possible to set the exact list of modes by first excluding all modes, then including only the desired modes.

Enabling modes for specific domains (i.e. "some.website") is useful, but it can be the case that user'd want to save modes only for some subdomain (e.g., auto-enabling blocker-mode on "ad-full-subdomain.some.website") and leave it as it is for all the other subdomains. That's why I see two options:

Store URLs as strings and match them to quri:render-uri-ed URLs with search, as it was with the old-reddit-handler in the manual before quri:uri migration (i.e. (search "reddit.com" url)). This string comparison is simple and intuitive, but it may fail in some exceptional yes important cases (like redirection links of the form "redirect.me/?to=reddit.com").
Store URLs with some keyword, denoting whether these are domains or hosts of URL that auto-mode needs to compare URLs to. So the format is
```
(((:domain "some.website") (...)) 
((:host "ad-full-subdomain.some.website") (blocker-mode ...)
...)`
```

Ambrevar commented 4 years ago

This is a very good point. Instead of using search, we could use regular expressions, it's the same code complexity and offers extra flexibility to the user.

I suggest to be as flexible as possible and leave it to the user how they want to match the auto-mode entry. So why not doing it like the URL dispatching handlers and add a predicate?

To extend on your proposal, I suggest we make the first element either a lambda or a string:

((match-domain "some.website") (...)
  (match-regex "reddit.*bla[hH]") (...)
 "http://full.uri/foo/bar" (...))

Thoughts?

aartaka commented 4 years ago

Yes, this sounds even better, because it can be passed right to url-dispatching-handler almost without additional processing!

aartaka commented 4 years ago

disable-hook and enable-hook are slots in the mode definition, so every mode has them isolated inside it. I guess this makes it quite hard to alter all the enable- or disable- hooks of all modes to the apptopriate auto-mode-connected saving handler.

Is there a way to iterate over all the default classes of modes and alter them? I see this change in default class-based hook values as the only way to implement this saving of enabled/disabled modes. If you have ideas about that, I'll be glad to see them!

Alternatively, we can forget about these enable/disable hooks. Having the auto-mode to be controlled by the user only, with commands like save-modes-for-{domain|host|url|regex}, and exclude-mode-for{domain|host|url|regex} can be a good MVP to build things on top of.

Ambrevar commented 4 years ago

Indeed, previously you convinced me not to use mode (de)activation automatically. https://github.com/atlas-engineer/nyxt/issues/376#issuecomment-654809991

Still, to answer your question, I think what we miss here is an all-mode activation / deactivation hook. It would be easy to add, should we ever need it.

Note: What is "MVP"? (I think it's good practice to avoid acronyms when possible :p)

jmercouris commented 4 years ago

MVP is minimum viable product :-)

aartaka commented 4 years ago

Indeed, previously you convinced me not to use mode (de)activation automatically. #376 (comment)

Yes, that's right. But in my use scenario (i.e. exhaustive dialogues with the machine) it can happen with the proper confirmation prompting as a precaution, so I planned to implement it for this use scenario.

Ambrevar commented 4 years ago

So do you think we should add a global mode (de)activation hook?

Ambrevar commented 4 years ago

If so, then you should add the global hooks to the buffer class and edit the define-mode macro to run them.

aartaka commented 4 years ago

So do you think we should add a global mode (de)activation hook?

Yes. At least this will make working with modes more universal. But it sounds like quite a breaking change.

Will adding the mode hook type and typing enable-mode-for-current-buffer-(after|before)-hook as mode hook make it easier and less breaking?

Ambrevar commented 4 years ago

There might be a misunderstanding.

Why would it be a breaking change? Adding a hook never breaks things.

Regarding your example hook, after|before what would the mode be enabled?

aartaka commented 4 years ago

Regarding your example hook, after|before what would the mode be enabled?

It'll be enabled after|before the enable-mode-for-current-buffer command that's already there :) These names of the hooks are generated by (define-command enable-mode-for-current-buffer ....

But I doubt retyping necessity now, because then it'll be an exception to the hooks generated by define-command. Making exception to otherwise good macro is not reasonable there. It's better to make a new hook. And it's not as bad as I thought it to be when I said it's a breaking change :)

aartaka commented 4 years ago

I came up with several functions to store the modes to auto-mode-list:

save-current-buffer-modes-for-url
save-current-buffer-modes-for-domain
save-current-buffer-modes-for-host
save-current-buffer-modes-for-regex
save-buffer-modes-for-url
...
save-current-buffer-modes-dwim
save-buffer-modes-dwim

And the last two basically make the previous long list of functions unnecessary, because they infer the match condition based on user input:

If it's a plain domain URL (i.e. "http://domain.com") -- then use match-domain
If it's host with subdomains ("http://subdomain.domain.com") -- use match-host
If it's a regexp??? (I'll explain this later) -- use match-regex
Use match-url otherwise

Same for the exclusion ones.

I have two questions here:

Do you see any need in this long and possibly confusing list of functions, when there's a couple of relatively intelligent function that Do What You Mean?
Is there a conventional way to know if a string is a regex? I know that everything is regex, but, I mean, is there a way to know if a string contains regex syntax? Do I need to write a possibly incomplete and buggy regex parser function for that? :)

aartaka commented 4 years ago

Do we even need match-url clause if match-regex includes all the cases of URL-matching and is much more powerful?

Ambrevar commented 4 years ago

Sorry for being a bit late to the party ;)

match-url is not superseeded by match-regex because regexes don't always match what you want. For instance, "foo?bar" is a legit URL path, but the regex will also match "fobar" which might not be what the user intended.

Of course you could escape the string, but that's brittle and it's hard to guess what the user intends.

I think it's safer to match the URI exactly from the command, and leave the regexp matches to the user hand-written rules.

About the do-what-I-mean-functions: I fully agree! In case of ambiguity, we can always raise a new minibuffer asking the user which methods they want to use.

aartaka commented 4 years ago

@Ambrevar, Should it be closed now, when #846 is closed and #868 is a place for more focused discussion?

Ambrevar commented 4 years ago

Indeed, thanks for the reminder!

atlas-engineer / nyxt

Automatic proxy/tor/noscript/adblocker black- and whitelisting #376