nareddyt / discover-rewards-notifier

A Chrome Extension that shows a notification when visiting sites that qualify for Discover® Deals or Cashback Rewards.
https://www.tejunareddy.com/discover-rewards-notifier/
GNU General Public License v3.0

Improve offer matching algorithm #66

Closed nareddyt closed 6 years ago

nareddyt commented 6 years ago

Closes #61

Problem:

Some sites, like https://www.underarmour.com/en-us/, do not display offers because the hostname in the deals data does not match the site's actual hostname. For example, notice that this item from the deals data has a different URL:

  {
    "title": "Earn 10% Cashback Bonus online, Under Armour",
    "site_name": "Under Armour",
    "site_url": "www.uabiz.com",
    "deal_url": "https://card.discover.com/cardmembersvcs/deals/app/home#/deal/10446",
    "img_src_url": "https://www.discovercard.com/extras/logo/large/offer_3637_big_06092014.jpg",
    "expiry_date": "Ongoing"
  }

Root Cause Analysis:

Our original algorithm, which googles each site to determine its hostname, works well in most cases but not all. In this case, Google gave us the Under Armour business site instead of the retail site. We needed to rethink our matching algorithm, as outlined in #61.

Solution:

This change allows the extension to match on the tab title as well as the tab URL. It tries to match on the tab URL first, as that's faster and more accurate. If that doesn't work, it tries to match on the tab title (ignoring case).
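The matching order described above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the function name `findOffer` is hypothetical, deals are assumed to have the `site_url` and `site_name` fields shown in the deals data, and the title check is assumed to be a case-insensitive substring test.

```javascript
// Hypothetical sketch of URL-first, title-second matching.
// `findOffer` is an illustrative name, not a function from the extension.
function findOffer(tab, deals) {
  // 1. Try the tab URL first: compare its hostname with each deal's site_url.
  let hostname = null;
  try {
    hostname = new URL(tab.url).hostname.toLowerCase();
  } catch (e) {
    hostname = null; // tab.url is missing or not a valid URL
  }
  if (hostname) {
    const byUrl = deals.find((d) => d.site_url.toLowerCase() === hostname);
    if (byUrl) return byUrl;
  }

  // 2. Fall back to the tab title, ignoring case.
  //    Assumption: a deal matches if its site_name appears in the title.
  const title = (tab.title || "").toLowerCase();
  if (!title) return null;
  return deals.find((d) => title.includes(d.site_name.toLowerCase())) || null;
}
```

With the Under Armour item above, a tab on www.underarmour.com fails the hostname comparison against www.uabiz.com but can still match once its title contains "Under Armour".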

Note: Title matching doesn't work in all cases, which is why we try to match on the URL first. For example, the J. Crew website actually has a title of J.Crew, but our deal data has J. Crew (with a space). We need a better string matching algorithm that ignores punctuation and whitespace.
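One possible shape for such a comparison, offered as a sketch rather than something this change implements: strip everything except letters and digits from both strings before comparing. The names `normalize` and `titleMatches` are hypothetical.

```javascript
// Hypothetical sketch: compare names after dropping case, punctuation, and whitespace.
function normalize(text) {
  // Keep only Unicode letters and digits, lowercased.
  return text.toLowerCase().replace(/[^\p{L}\p{N}]+/gu, "");
}

function titleMatches(tabTitle, siteName) {
  const name = normalize(siteName);
  // Guard against an empty name, which would be a substring of every title.
  return name.length > 0 && normalize(tabTitle).includes(name);
}
```

Under this scheme both "J.Crew" and "J. Crew" reduce to the same string. The trade-off is that very short site names become more likely to match unrelated titles, so a minimum length or word-boundary check may be worth considering.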

Note: Title matching actually takes longer than URL matching. It turns out the tab title is not loaded until much later. When the tab first loads, the tab title is undefined or just the URL of the tab. A few cycles later, the tab title gets populated. To get around this issue, the core matching logic is now called repeatedly while a tab is in an "undefined" state and no offers have been found yet.
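The retry behavior can be sketched as below. This is an illustration under stated assumptions, not the extension's code: `isTitleReady` and `matchWithRetry` are hypothetical names, `getTab` and `findOffer` are supplied by the caller, the attempt limit and delay are arbitrary, and treating a title equal to the tab URL as "not loaded yet" is an assumption about the placeholder's exact format.

```javascript
// Hypothetical sketch of retrying the match until the tab title is populated.
function isTitleReady(tab) {
  // Assumption: an unloaded title is undefined, empty, or identical to the URL.
  return typeof tab.title === "string" && tab.title !== "" && tab.title !== tab.url;
}

async function matchWithRetry(getTab, findOffer, { maxAttempts = 10, delayMs = 200 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const tab = await getTab();         // re-read the tab; the title may have loaded
    const offer = findOffer(tab);
    if (offer) return offer;            // matched on URL or title
    if (isTitleReady(tab)) return null; // title is loaded and still no match: stop
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return null;                          // title never loaded within the attempt limit
}
```

Capping the number of attempts keeps a tab whose title never loads from being polled indefinitely.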

Testing Done:

Extensive manual testing while looking at the logs.

nareddyt commented 6 years ago

Ignoring Codacy issues to keep style consistent :/