greasyfork-org / greasyfork

An online repository of user scripts.
https://greasyfork.org
GNU General Public License v3.0
1.47k stars, 441 forks

Wrong domain listed in "Applies to" #936

Closed. Procyon-b closed this issue 3 years ago

Procyon-b commented 3 years ago

The userscript is https://greasyfork.org/scripts/425053. As you can see in the screenshot, co.a is listed as a valid domain:

[Screenshot: greasyfork-tld]

But the @include headers are:

// @include      https://www.google.tld/
// @include      https://www.google.tld/?*
// @include      https://www.google.tld/webhp*
// @include      /^https:\/\/www\.google\.co\.[^.]+\//
// @include      /^https:\/\/www\.google\.co\.[^.]+\/\?.*/
// @include      /^https:\/\/www\.google\.co\.[^.]+\/webhp.*/

A suggestion related to the "Applies to" list: why not let the author provide a list of valid URLs/domains? GF could then match them against the @match and @include rules in the script/style, and list the ones that pass.
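The suggestion above can be sketched roughly as follows. This is a minimal illustration, not Greasy Fork's code: `include_to_regex` and `urls_passing` are hypothetical names, and the handling of the magic `.tld` suffix (any one or two lowercase labels) is an assumption made for brevity. Real script managers resolve `.tld` against a list of actual TLDs.

```python
import re

def include_to_regex(rule):
    """Compile an @include rule: /.../ is a regexp, anything else is a glob."""
    if len(rule) > 1 and rule.startswith("/") and rule.endswith("/"):
        return re.compile(rule[1:-1])
    # Glob: escape everything, then let "*" match any run of characters.
    body = re.escape(rule).replace(r"\*", ".*")
    # Assumption: ".tld" stands for one or two dot-separated lowercase labels.
    body = body.replace(r"\.tld/", r"\.[a-z]+(\.[a-z]+)?/")
    return re.compile("^" + body + "$")

def urls_passing(candidates, rules):
    """Return the author-supplied URLs that match at least one rule."""
    compiled = [include_to_regex(rule) for rule in rules]
    return [url for url in candidates if any(p.search(url) for p in compiled)]

rules = [
    "https://www.google.tld/",
    r"/^https:\/\/www\.google\.co\.[^.]+\//",
]
candidates = [
    "https://www.google.com/",
    "https://www.google.co.uk/",
    "https://example.com/",
]
print(urls_passing(candidates, rules))
```

With these inputs only the two Google URLs pass, so only their domains would be listed.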

JasonBarnabe commented 3 years ago

When calculating the domains to display, any regexps will be reversed using some fairly rudimentary logic. Reversing is essentially trying to come up with a string that would match the regexp.

[^.]+ is reversed to a. Why a? The reversing logic doesn't understand the context here, and a is as good as any other character.

So the regexp gets reversed to something like https://www.google.co.a/. It then determines the eTLD+1 of this URL, which is co.a. (It does understand that co.uk is an eTLD, but co.a doesn't exist.)
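To make the two steps concrete, here is a small sketch of the behavior described above. It is not Greasy Fork's implementation: `reverse_regexp`, `etld_plus_one`, and the three-entry `KNOWN_SUFFIXES` set are hypothetical stand-ins, and the reversal handles only the constructs that appear in this script's rules.

```python
import re
from urllib.parse import urlsplit

def reverse_regexp(pattern):
    """Produce one string that a (very restricted) pattern would match."""
    s = pattern.lstrip("^")
    s = s.replace("[^.]+", "a")        # a run of non-dot characters becomes "a"
    s = s.replace(".*", "")            # "anything" becomes the empty string
    return re.sub(r"\\(.)", r"\1", s)  # drop escapes: \/ -> /, \. -> .

# Assumption: a tiny illustrative subset of the Public Suffix List.
KNOWN_SUFFIXES = {"com", "uk", "co.uk"}

def etld_plus_one(host):
    """Return the registrable domain: the longest known suffix plus one label."""
    labels = host.split(".")
    suffix_len = 1  # unknown suffix: treat only the last label as the TLD
    for i in range(len(labels)):
        if ".".join(labels[i:]) in KNOWN_SUFFIXES:
            suffix_len = len(labels) - i
            break
    return ".".join(labels[-(suffix_len + 1):])

url = reverse_regexp(r"^https:\/\/www\.google\.co\.[^.]+\/")
print(url)                                     # https://www.google.co.a/
print(etld_plus_one(urlsplit(url).hostname))   # co.a
print(etld_plus_one("www.google.co.uk"))       # google.co.uk
```

Because co.a is not a known suffix, only the final label counts as the TLD, and the listed domain becomes co.a instead of a google.* domain.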

Why do you include the regexps in addition to the .tld entries?

Procyon-b commented 3 years ago

> Why do you include the regexps in addition to the .tld entries?

There is a bug in Tampermonkey that makes it forget some TLDs of the co.?? form. This is not the first time: I have filed a bug report twice in the past year or two. I noticed it recently with this particular script, while testing it on doodles from https://www.google.com/doodles.

This is an attempt to fix this for the future, or for any other script manager that could potentially have the same problem.

JasonBarnabe commented 3 years ago

As you can see, Greasy Fork turns .tld into a short list of TLDs, as it would not be terribly helpful to list every TLD in existence. If there are particular TLDs you want listed, you can adjust your regexp to list them explicitly (/^https:\/\/www\.google\.co\.(xx|yy|zzz)\//).
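As an illustration of why an explicit alternation gives a reverser something concrete to work with, the sketch below expands each alternative into its own example URL. It assumes that every alternative is expanded, which the thread does not confirm. `expand_alternations` and `to_example_url` are hypothetical helpers, and xx, yy and zzz are the placeholders from the regexp above.

```python
import itertools
import re

def expand_alternations(pattern):
    """Expand every flat (a|b|c) group into one pattern per combination.

    Limitation: nested groups and escaped parentheses are not handled.
    """
    # With one capture group, re.split alternates literal text and group bodies.
    parts = re.split(r"\(([^()]*)\)", pattern)
    choices = [part.split("|") if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]

def to_example_url(pattern):
    """Strip the leading anchor and the backslash escapes from a literal pattern."""
    return re.sub(r"\\(.)", r"\1", pattern.lstrip("^"))

pattern = r"^https:\/\/www\.google\.co\.(xx|yy|zzz)\/"
for expanded in expand_alternations(pattern):
    print(to_example_url(expanded))
```

This prints one URL per listed TLD (https://www.google.co.xx/ and so on), each of which can then be reduced to a domain.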

This bug is solvable in this case but doing so would break other cases, so I'm not sure there's anything I can do.

Procyon-b commented 3 years ago

The previous bugs were not about uncommon TLDs: https://github.com/issues?q=is%3Aissue+author%3AProcyon-b+archived%3Afalse+sort%3Aupdated-desc+is%3Aclosed+tampermonkey+tld+.co.

It's not really a problem for me. I'm not fixating on it. ;)