iplanwebsites / newtab-bookmarks

Chrome extension replacing the default startpage with your bookmarks.

New data fetching strategy #5

Open SBoudrias opened 11 years ago

SBoudrias commented 11 years ago

I've been looking at some solutions for data fetching, and I think our best option would be to use an Event page in the Chrome extension to launch the fetching of the bookmarks in the background.

This would be good, as we could run the background script at different moments:

We'll then only need to work out when the best time would be to arbitrarily fetch data in order to keep the bookmarks collection up to date (deleting bookmarks no longer present in an account, adding new ones, etc.). I'm not sure yet what the strategy at this level should be.
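As a rough sketch of the event page idea, assuming a simple periodic refresh turns out to be an acceptable strategy: the script below registers a repeating alarm through `chrome.alarms` and runs every fetcher when it fires. The function name, the alarm name, and the shape of the `fetchers` array are made up for illustration; the `chrome` object is passed in as a parameter only so the logic can be exercised outside the browser.

```javascript
// Sketch of an event page script (all names here are hypothetical).
// `chromeApi` is normally the global `chrome` object.
// `fetchers` is an array of objects exposing a `fetch()` method.
function scheduleBookmarkFetch(chromeApi, fetchers, periodInMinutes) {
  const ALARM_NAME = 'fetch-bookmarks'; // assumed alarm name

  function runAll() {
    fetchers.forEach((fetcher) => fetcher.fetch());
  }

  // On install/update: register the repeating alarm and fetch once right away.
  chromeApi.runtime.onInstalled.addListener(() => {
    chromeApi.alarms.create(ALARM_NAME, { periodInMinutes: periodInMinutes });
    runAll();
  });

  // Chrome wakes the event page when the alarm fires, then unloads it again.
  chromeApi.alarms.onAlarm.addListener((alarm) => {
    if (alarm.name === ALARM_NAME) {
      runAll();
    }
  });
}
```

In the extension this would be called once at the top level of the event page script, e.g. `scheduleBookmarkFetch(chrome, [chromeFetcher, deliciousFetcher], 60)`, with the `alarms` permission declared in the manifest. Whether an hourly refresh is the right interval is still an open question.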


The architecture for this page should be another HTML page loading requirejs with a special config/build. I'm pretty confident we could reuse most of the model/collection modules we already have (once they're separated from their instances; see issue #6).

And we'll need to split the 3rd-party site data fetching into separate modules (right now it mostly lives in the Bookmark collection).

iplanwebsites commented 11 years ago

Totally agree. Should we use the background page and simply trigger fetching at given moments (scheduled or on demand)? In any case, I'd heavily throttle all network activity to minimize the bandwidth impact of site indexing.
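To make the throttling idea concrete, here is a minimal sketch of a queue that starts one task at a time and waits a fixed gap before starting the next. It only spaces out when requests start; it does not wait for them to finish. The name `createThrottledQueue` and the injectable `schedule` parameter are assumptions made for this example, not existing code.

```javascript
// Starts queued tasks one at a time, waiting `gapMs` between starts.
// `schedule` defaults to setTimeout; it is injectable to ease testing.
function createThrottledQueue(gapMs, schedule = setTimeout) {
  const pending = [];
  let running = false;

  function runNext() {
    const task = pending.shift();
    if (!task) {
      running = false; // queue drained; the next push restarts it
      return;
    }
    task();
    schedule(runNext, gapMs);
  }

  return {
    // `task` is a function that kicks off one network request.
    push(task) {
      pending.push(task);
      if (!running) {
        running = true;
        runNext();
      }
    }
  };
}
```

Each site to index would then be pushed as its own task, so a large bookmark collection gets crawled slowly instead of in one burst.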

SBoudrias commented 11 years ago

I just extracted the Chrome bookmarks fetcher from the collection here: https://github.com/iplanwebsites/newtab-bookmarks/blob/master/app/modules/bookmarks.chrome.js

The public interface for a fetcher will be pretty simple: it just needs a fetch method that returns a jQuery Deferred object. The fetcher object takes care of updating the collection by adding, removing, and updating models depending on the action to take.
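A minimal sketch of that interface, under a few loud assumptions: a native Promise stands in for the jQuery Deferred, a plain `Map` keyed by URL stands in for the Bookmark collection, and `loadRemote` is a hypothetical function that resolves with the source's bookmarks. This is not the code in bookmarks.chrome.js, just an illustration of the add/update/remove flow.

```javascript
// Applies remote bookmarks to a local collection (a Map keyed by URL here).
// Returns a summary of what changed.
function syncCollection(collection, remoteItems) {
  const summary = { added: 0, updated: 0, removed: 0 };
  const seen = new Set();

  for (const item of remoteItems) {
    seen.add(item.url);
    const existing = collection.get(item.url);
    if (!existing) {
      collection.set(item.url, item);
      summary.added++;
    } else if (existing.title !== item.title) {
      collection.set(item.url, item);
      summary.updated++;
    }
  }

  // Drop bookmarks that are no longer present at the source.
  for (const url of Array.from(collection.keys())) {
    if (!seen.has(url)) {
      collection.delete(url);
      summary.removed++;
    }
  }
  return summary;
}

// A fetcher exposes nothing but `fetch()`.
// `loadRemote` is a hypothetical function returning a promise of bookmarks.
function createFetcher(collection, loadRemote) {
  return {
    fetch() {
      return loadRemote().then((items) => syncCollection(collection, items));
    }
  };
}
```

Because the caller only ever sees `fetch()`, it doesn't matter whether it is invoked from the main app or from an event page.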

Right now the Chrome fetcher is still called from the main app, but with it separated like this, we just have to move the call to an event page and it will work right out of the box.

SBoudrias commented 11 years ago

Added delicious fetcher in commit 93017bfa0dfb46797d55f6e19bf3e0be7bba731e

I'm wondering how we should manage "duplicate" bookmarks (what if a bookmark is present in both twitter and delicious). Which title attribute should we keep?

And right now only one type is allowed, but in reality a bookmark could have multiple types. Right now the last one fetched wins (so there's no priority management ATM).

Maybe we should clearly separate each incoming data source:

{
  twitter_title: "Something",
  twitter_tags: "tag1 tag2",
  delicious_title: "Something else",
  delicious_tags: "bla bleug"
}

Then we just set a priority order: "Chrome", "twitter", "delicious", etc...
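Resolving an attribute against that priority order could look roughly like this. It assumes every source follows the same `<source>_<attribute>` key pattern as the object above (so Chrome data would live under a `chrome_title` key, which is a guess), and `pickAttribute` is a made-up helper name.

```javascript
// Assumed priority order, highest first; keys are lowercase source names.
const SOURCE_PRIORITY = ['chrome', 'twitter', 'delicious'];

// Returns the value from the highest-priority source that has one,
// or undefined if no source provides the attribute.
function pickAttribute(bookmark, attribute, priority = SOURCE_PRIORITY) {
  for (const source of priority) {
    const value = bookmark[source + '_' + attribute];
    if (value !== undefined && value !== '') {
      return value;
    }
  }
  return undefined;
}
```

With the example object, `pickAttribute(bookmark, 'title')` would skip the missing Chrome entry and return the twitter title.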

Let me know what you think

iplanwebsites commented 11 years ago

Great!

It'll be interesting to view bookmarks on a per-source basis, so we should definitely keep all instances. It's redundant, however, to see the same bookmark multiple times in the same view. (Why would stuff I post on twitter remove bookmarks from my chrome view?)

However, I'd only show the latest-added instance when the same URL is there multiple times. For instance, if I tweeted a link that I also bookmarked a year ago, it shouldn't be buried among all my old bookmarks. Maybe our unique key should be a composite of Source + URL and not strictly the URL?
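Here's a quick sketch of both ideas, assuming each stored instance carries `source`, `url`, and a numeric `addedAt` timestamp (these field names are invented for the example). The composite key keeps one record per source, and the combined view collapses them to the most recent instance per URL.

```javascript
// Unique key for storage: one record per (source, URL) pair.
function compositeKey(bookmark) {
  return bookmark.source + '|' + bookmark.url;
}

// For the combined view: keep only the most recently added instance of
// each URL, sorted newest first.
function latestPerUrl(bookmarks) {
  const latest = new Map();
  for (const bookmark of bookmarks) {
    const current = latest.get(bookmark.url);
    if (!current || bookmark.addedAt > current.addedAt) {
      latest.set(bookmark.url, bookmark);
    }
  }
  return Array.from(latest.values()).sort((a, b) => b.addedAt - a.addedAt);
}
```

A per-source view would simply filter on `source` before display, so nothing posted on twitter ever hides a Chrome bookmark there.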

Having a consolidated format as you suggested (with all the possible sources in the object) is the cleanest approach, but it might complicate the search a little, as we'll need to parse all the vendors' attributes when searching or comparing. Maybe we can generate our own tag array based on all of the vendor ones? That'd give us the best of both worlds.
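Generating that tag array could be as simple as the sketch below. It assumes every vendor stores its tags as a space-separated string under a key ending in `_tags`, as in the proposed object; lowercasing the tags is an extra choice made here so that search doesn't have to care about case.

```javascript
// Builds one de-duplicated tag array from every `<source>_tags` attribute.
function mergedTags(bookmark) {
  const tags = new Set();
  for (const key of Object.keys(bookmark)) {
    if (!key.endsWith('_tags')) {
      continue;
    }
    for (const tag of String(bookmark[key]).split(/\s+/)) {
      if (tag) {
        tags.add(tag.toLowerCase());
      }
    }
  }
  return Array.from(tags);
}
```

Search would then only look at this one array, while the vendor-specific attributes stay untouched for the per-source views.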

F

SBoudrias commented 11 years ago

So, if I understand correctly, you're suggesting we set the data priority (the data displayed in the bookmark view) to the most recently added source?