ushahidi / SwiftRiver-Core

SwiftRiver Core Applications

Fix the media extractor to ignore bookmark images and URLs linking to ads #3

Open ekala opened 12 years ago

ekala commented 12 years ago

The link extractor needs to filter out links to ads (e.g. http://feedads.g.doubleclick.net) a la AdBlock. Perhaps also extend this so that the links to be ignored can be specified using wildcards.
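A rough illustration of the wildcard idea, not the project's actual extractor: the sketch below matches each link's host name against a list of shell-style patterns. The names `IGNORED_HOSTS`, `is_ignored`, and `filter_links` are hypothetical, and the pattern list is only an example.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Hypothetical ignore list; each pattern is matched against the URL's host name.
IGNORED_HOSTS = ["feedads.g.doubleclick.net", "*.doubleclick.net"]


def is_ignored(url, patterns=IGNORED_HOSTS):
    """Return True if the URL's host matches any wildcard pattern."""
    host = (urlparse(url).hostname or "").lower()
    return any(fnmatch(host, pattern) for pattern in patterns)


def filter_links(urls, patterns=IGNORED_HOSTS):
    """Keep only the links whose host is not on the ignore list."""
    return [url for url in urls if not is_ignored(url, patterns)]
```

Matching on the host rather than the full URL keeps the patterns short, at the cost of not being able to ignore a single path on an otherwise acceptable site.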

69mb commented 12 years ago

I think we need a new service, 'spamassassin', to filter out ads from droplet content. It should be a pre-processor that runs on droplets before they are added to the db.
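A minimal sketch of what such a pre-processing step might look like, assuming a droplet is a dict with `links` and `media` lists. That shape and the function name `preprocess_droplet` are assumptions for illustration, not SwiftRiver's real droplet format. The ad test is passed in as a function so the filtering rules can be swapped out.

```python
def preprocess_droplet(droplet, is_ad):
    """Return a copy of the droplet with ad links and ad images removed.

    `droplet` is assumed to be a dict with optional "links" and "media"
    lists of URLs; `is_ad` is any callable that takes a URL and returns bool.
    """
    cleaned = dict(droplet)
    cleaned["links"] = [u for u in droplet.get("links", []) if not is_ad(u)]
    cleaned["media"] = [u for u in droplet.get("media", []) if not is_ad(u)]
    return cleaned
```

The caller would run this on each droplet before the database write, so stored droplets never contain the filtered URLs.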

ekala commented 12 years ago

We also need to strip out these kinds of images - http://i.imgur.com/dpaey.png

69mb commented 12 years ago

We can use Adblock Plus's EasyList to clean up the drop.
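As a hedged sketch of how EasyList could be applied, the code below handles only the simplest rule form, `||domain^`, which blocks a domain and its subdomains. Comments (`!`), exception rules (`@@`), element-hiding rules (`##`), and rules with `$` options are skipped, so this covers only a subset of the filter syntax. The function names are hypothetical.

```python
from urllib.parse import urlparse


def parse_domain_rules(lines):
    """Collect the domains from plain `||domain^` rules, skipping everything else."""
    domains = set()
    for line in lines:
        rule = line.strip()
        # Skip blanks, comments, the list header, exceptions, and element hiding.
        if not rule or rule.startswith(("!", "[", "@@")) or "##" in rule or "#@#" in rule:
            continue
        if rule.startswith("||") and rule.endswith("^"):
            domain = rule[2:-1].lower()
            # Accept only bare host names (no paths, wildcards, or options).
            if domain and all(c.isalnum() or c in ".-" for c in domain):
                domains.add(domain)
    return domains


def is_blocked(url, domains):
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    return any(".".join(parts[i:]) in domains for i in range(len(parts)))
```

For example, a list containing `||doubleclick.net^` would cause `http://feedads.g.doubleclick.net/...` to be dropped, while image hosts that are not on the list pass through untouched.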