searx / searx

Privacy-respecting metasearch engine
https://searx.github.io/searx/
GNU Affero General Public License v3.0

tracking/affiliate link remover plugin #316

Open pointhi opened 9 years ago

pointhi commented 9 years ago

I would like to propose a plugin that removes known tracking and affiliate parameters from result URLs.

For example, Amazon URLs sometimes contain the URL argument tag, which is an affiliate ID. We could extend that list with other known URL arguments that sites use only for tracking and affiliate purposes, and create a plugin which deletes those parts of the URL.
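A minimal sketch of the idea described above, not the plugin itself. The `TRACKING_ARGS` table and the `strip_affiliate_args` helper are hypothetical names made up for illustration; the only entry taken from this thread is Amazon's `tag` argument, and the host matching rule is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical table: host suffix -> query arguments assumed to be
# used only for tracking/affiliation. Only amazon's "tag" comes from
# the proposal above; a real list would have to be collected.
TRACKING_ARGS = {
    "amazon.com": {"tag"},
}


def strip_affiliate_args(url):
    """Return url without the arguments listed for its host."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # Collect the blocked argument names for every matching host suffix.
    blocked = set()
    for suffix, args in TRACKING_ARGS.items():
        if host == suffix or host.endswith("." + suffix):
            blocked |= args

    # Unknown host: leave the URL exactly as it was.
    if not blocked:
        return url

    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Called on `https://www.amazon.com/dp/B000?tag=foo-20&ref=sr`, this drops only the `tag` argument, while a URL on a host that is not in the table is returned unchanged. Note that `urlencode` re-encodes the remaining arguments, so their escaping may be normalised.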

dalf commented 9 years ago

+1

Cqoicebordel commented 9 years ago

Awesome idea !

privacytoolsIO commented 9 years ago

I can help with that. I'm a big affiliate marketer. A website of mine is currently participating in over 3000 affiliate programs.

dalf commented 9 years ago

@privacytoolsIO : great :-)

If you can give a list of websites and affiliate parameters / patterns, it would be awesome.

The implementation will be something that looks like the https_rewrite plugin, line 225

Cqoicebordel commented 9 years ago

@privacytoolsIO If you can provide me a list of tracker arguments in URLs, or anything similar, it would be welcome to enhance what I already committed in #365 :o)

privacytoolsIO commented 9 years ago

@Cqoicebordel Sorry, I'm late. Just found this: https://s3.amazonaws.com/lists.disconnect.me/simple_ad.txt

This is used by the plugin "Disconnect" and should cover most ads / affiliate links.

Cqoicebordel commented 9 years ago

Nice ! This list could be very useful to remove unwanted links !

But I was talking more about the arguments in the URLs, like ?utm_origin=XXX etc. If you know of any others besides 'utm', it would be very useful. If you don't, don't worry :)
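To make the kind of argument meant here concrete, a rough sketch that drops every argument whose name starts with a known prefix, on any host. This is not the code from #365; `TRACKING_PREFIXES` and `strip_tracking_prefixes` are hypothetical names, and `utm_` is the only prefix taken from this thread.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: argument-name prefixes treated as tracking-only.
# "utm_" is the one mentioned in this thread; others would go here.
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_prefixes(url):
    """Drop query arguments whose name starts with a tracking prefix."""
    parts = urlsplit(url)

    # Split the raw query by hand so the remaining arguments keep
    # their original escaping (nothing is decoded and re-encoded).
    kept = [
        arg
        for arg in parts.query.split("&")
        if arg and not arg.split("=", 1)[0].startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query="&".join(kept)))
```

For `https://example.org/page?id=7&utm_origin=XXX&utm_medium=email#top` this keeps `id=7` and the fragment and removes both `utm_` arguments.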