mst / whatsupcoming

Looking up normalized venue data on save breaks the current duplicate check #13

Closed mst closed 12 years ago

mst commented 12 years ago

The web parser uses Django's get_or_create to check whether the exact object is already in the database. If the get fails, we save the object. However, because we modify the object before saving, we only get a match if the location we try to store already matches the Google Places data. That never happens, since we would need the geo information (latitude, longitude) for that, which is why we modify the object in the first place.
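
A minimal sketch of the failure mode, assuming a simplified in-memory stand-in for the model manager rather than the project's real models. The `VenueStore` class, the `normalize` helper, the field names, and the coordinates are all hypothetical; only the `(object, created)` return shape mirrors Django's get_or_create.

```python
# Hypothetical in-memory stand-in for a Django model manager.
# It is NOT the project's code; it only mimics exact-match lookup.
class VenueStore:
    def __init__(self):
        self.rows = []

    def get_or_create(self, **fields):
        """Return (row, created); a row matches only if every field is equal."""
        for row in self.rows:
            if all(row.get(key) == value for key, value in fields.items()):
                return row, False
        row = dict(fields)
        self.rows.append(row)
        return row, True


def normalize(row):
    """Assumed save-time step: overwrite scraped data with normalized data."""
    row["name"] = row["name"].title()  # stand-in for the Places lookup result
    row["latitude"], row["longitude"] = 52.52, 13.40  # made-up coordinates


store = VenueStore()

# First scrape: the venue is new, so it is created and then normalized.
first, created_first = store.get_or_create(name="some club")
normalize(first)

# Second scrape of the same page: the raw name no longer matches the
# normalized row, so the lookup misses and a duplicate is created.
second, created_second = store.get_or_create(name="some club")

print(created_first, created_second, len(store.rows))
```

The second call reports the venue as newly created even though it was scraped before, because the stored row no longer equals the raw scraped values.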

My suggestion is to move the geo information lookup into the data scraping step. We should have two additional classes:

  1. general duplicate check
  2. geo information lookup

Both classes are called by every parser.
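
One way the proposal could look, as a rough sketch only. The class names `DuplicateCheck` and `GeoLookup`, the `run_parser` function, the item fields, and the injected resolver are assumptions made for illustration; the real lookup would call Google Places, which is replaced here by a plain dictionary so the example runs offline.

```python
class DuplicateCheck:
    """Hypothetical general duplicate check, keyed on the raw scraped values."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, raw_item):
        # Compare on data as scraped, before any geo information is added.
        key = (raw_item["title"], raw_item["venue"], raw_item["date"])
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


class GeoLookup:
    """Hypothetical geo information lookup, run during scraping, not on save."""

    def __init__(self, resolver):
        # resolver: callable mapping a venue name to (lat, lon) or None.
        self._resolver = resolver

    def enrich(self, raw_item):
        enriched = dict(raw_item)  # leave the scraped item untouched
        coords = self._resolver(raw_item["venue"])
        if coords is not None:
            enriched["latitude"], enriched["longitude"] = coords
        return enriched


def run_parser(scraped_items, duplicate_check, geo_lookup):
    """What any parser would do: skip duplicates first, then enrich."""
    saved = []
    for item in scraped_items:
        if duplicate_check.is_duplicate(item):
            continue
        saved.append(geo_lookup.enrich(item))
    return saved


# Offline stand-in for the Places lookup (made-up coordinates).
fake_resolver = {"some club": (52.52, 13.40)}.get
items = [
    {"title": "Concert", "venue": "some club", "date": "2012-05-01"},
    {"title": "Concert", "venue": "some club", "date": "2012-05-01"},
]
saved = run_parser(items, DuplicateCheck(), GeoLookup(fake_resolver))
print(len(saved), saved[0]["latitude"], saved[0]["longitude"])
```

Because the duplicate check runs on the raw scraped values, normalizing the venue afterwards can no longer cause the comparison to miss.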

mst commented 12 years ago

Moved the geo location lookup out of the models module. Added a base class for the parser that includes a simple duplicate check.