john-hu / untitled


Peeler parsed website revisit policy #53

Open john-hu opened 2 years ago

john-hu commented 2 years ago

We have already visited some websites, and the visited URLs are all stored in a small sqlite3 database. We should build a revisit policy so we can tell whether the data has been updated, for example:

  1. Save the last-visited time in the sqlite3 database.
  2. List the URLs that have not been parsed yet or whose last visit was more than 3 months ago.
  3. Lock those URLs.
  4. Parse the websites.
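
The steps above could be sketched roughly as follows. This is only a minimal illustration, not the actual Peeler code: the table name `visited_urls`, its columns, the helper names `init_db`, `claim_due_urls`, and `mark_visited`, and the choice to treat "3 months" as 90 days are all assumptions made for the example.

```python
import sqlite3

# Assumption: "3 months" is approximated as 90 days, in seconds.
REVISIT_AFTER_SECONDS = 90 * 24 * 60 * 60


def init_db(conn):
    # Hypothetical schema; the real Peeler table and columns may differ.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS visited_urls (
               url TEXT PRIMARY KEY,
               last_visited INTEGER,            -- Unix time, NULL if never parsed
               locked INTEGER NOT NULL DEFAULT 0
           )"""
    )


def claim_due_urls(conn, now, limit=100):
    """Steps 2 and 3: list the URLs that are due and lock them."""
    cutoff = now - REVISIT_AFTER_SECONDS
    with conn:  # commits on success, rolls back on error
        rows = conn.execute(
            "SELECT url FROM visited_urls "
            "WHERE locked = 0 AND (last_visited IS NULL OR last_visited < ?) "
            "ORDER BY url LIMIT ?",
            (cutoff, limit),
        ).fetchall()
        urls = [row[0] for row in rows]
        conn.executemany(
            "UPDATE visited_urls SET locked = 1 WHERE url = ?",
            [(url,) for url in urls],
        )
    return urls


def mark_visited(conn, url, now):
    """Step 1, applied after step 4: store the visit time and unlock."""
    with conn:
        conn.execute(
            "UPDATE visited_urls SET last_visited = ?, locked = 0 WHERE url = ?",
            (now, url),
        )
```

A worker would call `claim_due_urls(conn, int(time.time()))`, parse each returned URL, and then call `mark_visited` for it. The sketch assumes a single worker process; if several workers share the database, the select and the lock update would need to run inside one write transaction (for example `BEGIN IMMEDIATE`) so that two workers cannot claim the same URL.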