Smile4ever / firefoxaddons

Extend the functionality of Firefox with cool addons

[Neat URL] Yet another Google parameter... #25

Closed nicolaasjan closed 6 years ago

nicolaasjan commented 6 years ago

Today I noticed that Google uses a new(?) parameter: gs_l

Example: https://www.google.nl/search?source=hp&q=Mozilla+Archive+Format&oq=Mozilla+Archive+Format&gs_l=psy-ab.3..0l2j0i22i30k1l2.1148.1148.0.4681.1.1.0.0.0.0.83.83.1.1.0.foo%2Cnso-ehuqi%3D1%2Cnso-ehuui%3D1%2Cewh%3D0%2Cnso-mplt%3D2%2Cnso-enksa%3D0%2Cnso-enfk%3D1%2Cnso-usnt%3D1%2Cnso-qnt-npqp%3D0-1701%2Cnso-qnt-npdq%3D0-54%2Cnso-qnt-npt%3D0-1%2Cnso-qnt-ndc%3D300%2Ccspa-dspm-nm-mnp%3D0-05%2Ccspa-dspm-nm-mxp%3D0-125%2Cnso-unt-npqp%3D0-17%2Cnso-unt-npdq%3D0-54%2Cnso-unt-npt%3D0-0602%2Cnso-unt-ndc%3D300%2Ccspa-uipm-nm-mnp%3D0-007525%2Ccspa-uipm-nm-mxp%3D0-052675...0...1..64.psy-ab..0.1.83.jV4WLrHrkAI

Perhaps you can add it to your default filter?

Thanks for your add-on! Pure URL does not work anymore...

Smile4ever commented 6 years ago

Hi,

The gs_l parameter is not very new (2016), but I had never seen it before :) https://productforums.google.com/forum/#!topic/webmasters/UJMhujdXgbE

I've added gs_l to Neat URL 1.2.0. You will automatically get this new parameter if you've not already added it manually.
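For readers who want to see what removing a parameter such as gs_l amounts to, here is a minimal Python sketch. It is not Neat URL's code (the add-on is a browser extension); the names `BLOCKED_PARAMS` and `strip_params` are made up for illustration, and the block list contains only the parameter discussed here.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical block list for illustration; not Neat URL's actual default list.
BLOCKED_PARAMS = {"gs_l"}


def strip_params(url, blocked=BLOCKED_PARAMS):
    """Return url with every query parameter whose name is in `blocked` removed."""
    parts = urlsplit(url)
    # keep_blank_values=True so parameters like "as_epq=" are not silently dropped.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    # urlencode re-encodes the remaining pairs (spaces become "+").
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # The gs_l value is shortened here for readability.
    print(strip_params(
        "https://www.google.nl/search?source=hp&q=Mozilla+Archive+Format&gs_l=psy-ab.3"
    ))
```

Note that the query string is decoded and re-encoded, so unusual escaping in the remaining parameters may come out normalized.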

The commit that implements Neat URL 1.2.0 can be found here: https://github.com/Smile4ever/firefoxaddons/commit/9a665747e2fc25d35a9c666af58918d424890449

Neat URL 1.2.0 will soon be available on addons.mozilla.org.

Thanks for your add-on!

You're welcome.

EC-O-DE commented 6 years ago

Yeah, I just noticed something. I don't know if it's this or something else, but Google Images doesn't give direct links to images anymore... :(

Smile4ever commented 6 years ago

Try disabling Neat URL and test again. If it works after disabling Neat URL, I will remove gs_l from the parameter list. If it still doesn't work, gs_l can stay.

By the way, which version of Neat URL are you using, @ZenFi? Does it even have the gs_l parameter?

nicolaasjan commented 6 years ago

Google Images works OK here. Clicking on an image result gives you the black frame where you can click on "view image", and that works well here (I manually added the gs_l parameter).

nicolaasjan commented 6 years ago

By the way, when using Google Images and searching for, let's say, "mozilla", I get: https://www.google.nl/search?as_st=y&tbm=isch&hl=nl&as_q=mozilla&as_epq=&as_oq=&as_eq=&imgsz=&imgar=&imgc=&imgcolor=&imgtype=&cr=&as_sitesearch=&safe=images&as_filetype=&as_rights=

But when I then specify the size "large" within the results, I get this URL: https://www.google.nl/search?q=mozilla&as_st=y&hl=nl&tbm=isch&source=lnt&tbs=isz:l&sa=X&ved=0ahUKEwiAhrj54-fVAhVFCcAKHWZOAFYQpwUIHQ&biw=1444&bih=905&dpr=1

Notice the ved parameter! See also: https://moz.com/blog/inside-googles-ved-parameter (more Google tracking...)

Adding ved to the filter list does not give me any nasty side effects. So maybe add this to the default as well?

eXqusic commented 6 years ago

What about all of Amazon's parameters?

https://www.amazon.com/Spigen-RA200-Earhooks-Earphones-Headphones/dp/B01NAM69IJ/ref=pd_sim_107_3?_encoding=UTF8&pd_rd_r=T9EY6TZZ0KF4V86SSGGD&pd_rd_w=3FbbK&pd_rd_wg=qDgGV&psc=1

Everything past "/ref=" isn't needed.

So, just to list: pd_sim_, pd_rd_, psc

That's just from that one link, there are more haha

Smile4ever commented 6 years ago

I intend to implement parameters that are scoped to a domain pattern rather than a full domain, like "ved@google.*". This will allow specific parameters to be removed on multiple domains (but not all).
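A rough Python sketch of how a rule such as "ved@google.*" could be interpreted. This is only an illustration of the idea, not the add-on's implementation: the helper names (`parse_rule`, `domain_matches`, `strip_scoped`) are hypothetical, and the matching rule (shell-style wildcard against the host or any of its parent domains) is an assumption.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def parse_rule(rule):
    """Split 'ved@google.*' into ('ved', 'google.*'); a bare 'ved' applies to every host."""
    name, _, pattern = rule.partition("@")
    return name, pattern or "*"


def domain_matches(pattern, host):
    """Assumed semantics: the pattern may match the host or any parent domain of it."""
    labels = host.lower().split(".")
    # For "www.google.nl" this tries "www.google.nl", "google.nl" and "nl".
    return any(fnmatchcase(".".join(labels[i:]), pattern) for i in range(len(labels)))


def strip_scoped(url, rules):
    """Remove query parameters whose rule applies to the URL's host."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    blocked = {
        name
        for name, pattern in map(parse_rule, rules)
        if domain_matches(pattern, host)
    }
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    rules = ["ved@google.*"]
    print(strip_scoped("https://www.google.nl/search?q=mozilla&ved=abc", rules))
    print(strip_scoped("https://example.org/?ved=abc", rules))
```

One consequence of using a plain wildcard: "google.*" would also match a host like google.example.com, so a real implementation may want a stricter notion of "top-level domain".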

I will implement these parameters:

ref=pd_sim_* is harder to implement, but I might find a way to do it.

eXqusic commented 6 years ago

pd_rd_* would be better; there are more than just the few you mentioned.

Geobert commented 6 years ago

Not a Google parameter, but a tracking parameter nonetheless: http://www.futura-sciences.com/planete/actualites/paleontologie-vie-dodo-retrouvee-os-68360/#xtor=RSS-8

xtor=RSS-8 should be removed (I tried adding xtor, #xtor and even /#xtor in the options, with no luck).

Thanks for this extension!
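The xtor case differs from the earlier ones because the parameter sits in the URL fragment (after "#") rather than in the query string, which may be why a query-only filter does not catch it. A small Python sketch of fragment filtering, with a hypothetical helper name and under the assumption that fragment parameters are "&"-separated name=value pairs:

```python
from urllib.parse import urlsplit, urlunsplit


def strip_fragment_params(url, blocked):
    """Remove name=value pieces from the fragment when the name is in `blocked`."""
    parts = urlsplit(url)
    pieces = parts.fragment.split("&")
    # Compare only the part before "=", so "xtor=RSS-8" is matched by "xtor".
    kept = [piece for piece in pieces if piece.split("=", 1)[0] not in blocked]
    # An ordinary anchor such as "#section" has no blocked name and is left as-is.
    return urlunsplit(parts._replace(fragment="&".join(kept)))


if __name__ == "__main__":
    print(strip_fragment_params(
        "http://www.futura-sciences.com/planete/actualites/"
        "paleontologie-vie-dodo-retrouvee-os-68360/#xtor=RSS-8",
        {"xtor"},
    ))
```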

nicolaasjan commented 6 years ago

Two other Google parameters: ei and sei.

I found them as follows: in Google advanced search, search for, let's say, "Remove garbage from URLs". I get: https://www.google.com/search?lr=&hl=nl&as_qdr=all&q=%22Remove+garbage+from+URLs%22&oq=%22Remove+garbage+from+URLs%22

Then click on the coloured "Go to the Google homepage" link in the upper left corner. There I get: https://www.google.nl/webhp?hl=nl&sa=X&gws_rd=cr&ei=nvSiWfaROYiNUbyTlpAG

The part ei=nvSiWfaROYiNUbyTlpAG contains a Unix timestamp and is often used in digital forensics... See: https://cheeky4n6monkey.blogspot.nl/2014/10/google-eid.html

While it doesn't seem to occur for every search, when it does, that "ei" parameter contains an encoded Unix UTC timestamp (and other things Google only knows). Interpreting this artifact can thus allow forensic analysts to date a particular search session.

When running his Python script in my Linux terminal I get:

python google-ei-time.py -u "https://www.google.nl/webhp?hl=nl&sa=X&gws_rd=cr&ei=nvSiWfaROYiNUbyTlpAG"
Running google-ei-time.py v2014-10-10

URL's ei term = nvSiWfaROYiNUbyTlpAG
Padded base64 string = nvSiWfaROYiNUbyTlpAG
Extracted timestamp = 1503851678
Human readable timestamp (UTC) = 2017-08-27T16:34:38

See also: http://kb.digital-detective.net/display/NetAnalysisV2/URL+Analysis#URLAnalysis-GoogleEI/SEIParameterDecoding
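The decoding step can be reproduced in a few lines of Python. This is a sketch written for this thread, not the google-ei-time.py script itself; it assumes the value is URL-safe base64 and that the first four decoded bytes hold a little-endian Unix timestamp, which does reproduce the value shown in the output above. The function names are made up.

```python
import base64
import struct
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs


def decode_ei_timestamp(url):
    """Extract the 'ei' parameter and return the Unix timestamp it appears to carry."""
    ei = parse_qs(urlsplit(url).query)["ei"][0]
    # Base64 input must be a multiple of 4 characters; pad with "=" if needed.
    padded = ei + "=" * (-len(ei) % 4)
    raw = base64.urlsafe_b64decode(padded)
    # Assumption: bytes 0..3 are an unsigned 32-bit little-endian timestamp.
    (timestamp,) = struct.unpack("<I", raw[:4])
    return timestamp


def format_utc(timestamp):
    """Render a Unix timestamp as an ISO-like UTC string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


if __name__ == "__main__":
    url = "https://www.google.nl/webhp?hl=nl&sa=X&gws_rd=cr&ei=nvSiWfaROYiNUbyTlpAG"
    ts = decode_ei_timestamp(url)
    print(ts, format_utc(ts))
```

For the URL above this yields 1503851678 and 2017-08-27T16:34:38, matching the script's output. The remaining bytes of the decoded value are not interpreted here.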

I can't remember where I saw the sei parameter, but it appears to be something similar.

At first I added only ei and sei to the add-on settings, but for some reason unknown to me, YouTube broke (videos did not play). :anguished: As I only encountered the issue at google.nl, I had to add:

Smile4ever commented 6 years ago

I worked on this. This is a status update to keep you all informed.

Done:

(amazon.* is a wildcard for amazon.de / amazon.com / amazon.fr ...)

Still TODO:

Please note that the above parameters won't work in Neat URL 1.2.0. An update will be provided shortly with support for wildcard domains. These new parameters will be added by default when upgrading users to the updated version.

Smile4ever commented 6 years ago

I have implemented everything from above, except wildcard support for parameters. I added that to the TODO list.

It will be available in Neat URL 2.0.0: https://github.com/Smile4ever/firefoxaddons/commit/ff5fd890ed790a0e8d05081f2b1cbeef8f8358ce

(please ignore 1.5.0 in the CHANGELOG, it became 2.0.0)

Neat URL 2.0.0 has been submitted to addons.mozilla.org for approval. It will soon be available to end users. I will inform you when that happens.

Geobert commented 6 years ago

Lean URL? or Neat URL?

Thanks for your work!

Smile4ever commented 6 years ago

I was sleepy. Neat URL of course.

GitCurious commented 6 years ago

Hello - thanks for the addon, I'm testing it now.

**Everything after /ref on amazon.** ($/ref@amazon.)

This actually breaks certain links on Amazon, for example:

"Track Package" and "Cancelled Items"

There may be more, but I noticed those two immediately.

An example link segment: amazon.co.uk/gp/your-account/ship-track/ref=xxx?ie=UTF8&itemId=xxx&orderId=xxx&shipmentId=xxx

The 'ref' rule strips away the customer-specific item information (itemId, orderId, shipmentId) that follows it.
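To make the difference concrete, here is a Python sketch contrasting "drop everything from /ref= onward" with "drop only the /ref=... path segment and keep the query string". It only illustrates the problem described above: the function name is hypothetical, the "https://www." prefix is added to the example segment as an assumption, and it does not describe how Neat URL actually handles the rule.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_ref_segment(url, keep_query=True):
    """Cut the path at '/ref='; optionally keep the query string that follows it."""
    parts = urlsplit(url)
    path, found, _ = parts.path.partition("/ref=")
    if not found:
        return url  # no /ref= segment in the path, nothing to do
    query = parts.query if keep_query else ""
    return urlunsplit(parts._replace(path=path or "/", query=query))


if __name__ == "__main__":
    url = ("https://www.amazon.co.uk/gp/your-account/ship-track/ref=xxx"
           "?ie=UTF8&itemId=xxx&orderId=xxx&shipmentId=xxx")
    # Aggressive variant: itemId, orderId and shipmentId are lost.
    print(strip_ref_segment(url, keep_query=False))
    # Conservative variant: only the ref=xxx path segment is removed.
    print(strip_ref_segment(url, keep_query=True))
```

Whether Amazon still serves the page correctly once the ref segment is gone is not something this sketch can show; it only demonstrates that the order-specific parameters survive in the second variant.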

Smile4ever commented 6 years ago

@GitCurious This bug is fixed in Neat URL 2.0.1. It will soon be available on addons.mozilla.org.

Everyone: By the way, go grab Neat URL 2.0.0 (or up) on addons.mozilla.org! 😃