WebMemex / webmemex-extension

📇 Your digital memory extension, as a browser extension
https://webmemex.org
Other
208 stars 45 forks source link

Added : import-bookmarks.js #47

Closed wimpykid26 closed 7 years ago

wimpykid26 commented 7 years ago

Will import all browser bookmarks and store them in db.

Fixes #29

wimpykid26 commented 7 years ago

@Treora @oliversauter wrote a new file which is somewhat similar to import-history.js. I am having a great deal of trouble in testing/debugging it. Could you please guide me on how do it. The console is cluttered up when the overview.html opens therefore its of no use to create a button there and call it. Thanks in advance :)

Treora commented 7 years ago

Great you started on this.

By the way, if a lot of code is the same as in import-history.js, we could consider factoring those parts out or exporting them from either module so they can be shared.

As for debugging, add import './browser-bookmarks/import-bookmarks' to src/background.js, so that the file is included in the background. You should get a completely blank console if you click 'inspect views: background page' in the list of extensions. There you could test your functions if you add them to window as described before, e.g. window.importBookmarks = importBookmarks.

wimpykid26 commented 7 years ago

@Treora . Made some changes and debugged the code. Its working perfectly fine with custom inputs as bookmarks. browser.bookmarks.getTree() and browser.bookmarks.getRecent() return null object arrays when called. Even after bookmarking various sites in the browser that opens while running the extension. Cannot seem to get my head around that. Please review. Thanks :)

blackforestboi commented 7 years ago

Hey @wimpykid26

Don't take my word for it, but when I implemented 'browser.bookmarks.getRecent()' I had to add a few parameters. Here you see the old code, where I imported the bookmarks: https://github.com/WorldBrain/Research-Engine/blob/master/src/js/import_bookmarks.js#L11

Looking into the API documentation it says:

var gettingRecent = browser.bookmarks.getRecent( numberOfItems // integer )

numberOfItems A number representing the maximum number of items to return. The returned list will contain up to this many of the most recently added items. The minimum allowed value here is 1. If you pass 0 or less, the function will throw an error.

Hope that helps! Oliver

wimpykid26 commented 7 years ago

Really appreciate the review @oliversauter.

I did actually pass an integer inside the function call. Still it didn't return anything. There is nothing wrong with the function call. It's just not returning anything. Probably it's not keeping track of the bookmarks in the browser its opening after fx-run

wimpykid26 commented 7 years ago

I'll look into it some more and keep you updated . Thanks :)

wimpykid26 commented 7 years ago

screenshot from 2017-03-07 19-09-12 Checked if url is valid via isWorthRemembering converted valid bookmarks to allDocs (pageDocs + visitDocs) stored them in db.

Please review @Treora @oliversauter

Treora commented 7 years ago

@wimpykid26: Either I have not understood what you are doing, or there is still a load of unnecessary complexity in there which would take much more time to point out than to rewrite (sorry to be harsh), so it does not seem ready for review. As said before, I think the tree should not be necessary at all, you could just getting the last zillion recent bookmarks with browser.bookmarks.getRecent and then do nearly exactly the same things the history import does, probably sharing some code with it too.

wimpykid26 commented 7 years ago

@Treora Apologies for the complicated code. Removed the logtree function (which I had used for browser.bookmarks.getTree() but later forgot to remove) and imported reusable functions from import-history.

wimpykid26 commented 7 years ago

@Treora how do I set the nonce differently. getVisitItemsForHistoryItems eventually finds the last visit of a bookmarkItem and returns a historyItem therefore only historyItems are getting passed into generatePageDoc. For retrieving the visitItems the bookmarkItem has to be passed through getVisit...... . Instead should I bypass for bookmarks and get the dateAdded property for its last visit time?.

Treora commented 7 years ago

I'm sorry I have not yet got around to follow up on this; other priorities.. It would be nice to continue with the last refactoring one of the next weeks, though as said it needs a bit of thought what the desired result actually is. It should become a bit easier to refactor after PR #81, which converts the promise chains to async/await syntax and in the process also refactors history import code a bit.

poltak commented 7 years ago

I've been working on getting up-do-date with all these import-history/bookmarks + fetch/analyse of pages issues for the WorldBrain fork (the ones all mentioned here). import-bookmarks seems like a good place to start, as it shares a lot with the existing import-history code, and some progress has already been made here.

At the moment, looking over the @wimpykid26's code, the import-bookmarks stuff creates visit and page docs same as was done in import-history. If my understanding is correct, after reading through #29, the main difference here should be that we'll want to save some kind of bookmark doc instead of a visit doc. Is my understanding correct?

So:

If so, let's decide some structure for bookmark docs? Visit doc structure currently looks like:

{
  _id, 
  url, // URL of the visited page
  visitStart, // timestamp of when the site was visited
  importedFromBrowserHistory, // timestamp of when the import happened
  referringVisit: { _id }, // optional ID of other visit doc deemed to be referrer for this visit
  page: { _id }, // ID of the page doc which this visit is associated with
}

Bookmark doc structure could possibly look like:

{
  _id, 
  url, // URL of the bookmarked page 
  bookmarkStart, // timestamp of when the site was bookmarked
  importedFromBrowserBookmarks, // timestamp of when the import happened
  page: { _id }, // ID of the page doc which this bookmark is associated with
}

Do we need any additional information about bookmarks? At the moment, according to discussion in #29, we're leaving out things like folder structure. Only difference now is there's no optional referrer visit reference, because it's not applicable here, and renaming of certain atts.

An additional unrelated question: When will the url in a visit doc and the url in that visit's associated page doc differ? All the URLs in the associated docs i'm looking at in my local pouchDB seems to be the same.

Treora commented 7 years ago

As most of the discussion is not specific to this PR but referring to the issue discussion, I replied in issue #29 to keep things together.

Treora commented 7 years ago

Closing due to inactivity. A more likely implementation for this feature is being made by @poltak over at WorldBrain (see https://github.com/WorldBrain/WebMemex/pull/25).