Store in IndexedDB/LevelDB what is indexed

eklem commented 6 years ago

Check out this example. We should store the ID of every item that is indexed. Store the ID after the add step.

eklem commented 6 years ago

search-index-housekeeper will take care of this.

eklem commented 5 years ago

Rename search-index-housekeeper to browsercrawler-housekeeper. It should be set up to:

add new URLs from the fetched data to a "what to crawl" list. This list
do something with the fetched data (e.g. add it to search-index)
register that the URL has been processed (e.g. added to search-index)
register that the URL has been crawled, with a timestamp, in the "what to crawl" list

This way, there won't be any gaps in the housekeeping of what has been crawled, even if the process is interrupted by clicks to new pages. You may get some overlapping crawls now and then, but that's not a problem.
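
A rough sketch of how those four steps could fit together, in TypeScript. Everything here is an assumption for illustration: `Housekeeper`, `CrawlEntry`, `FetchedPage`, `enqueue`, `process` and `pending` are hypothetical names, not the browsercrawler or search-index API, and a plain `Map` stands in for IndexedDB/LevelDB (a real store would be asynchronous). The point it tries to show is the ordering: a URL is only marked as indexed and crawled after the add step has returned, so an interrupted run leaves it pending and it simply gets crawled again.

```typescript
// Hypothetical sketch; none of these names come from browsercrawler or search-index.
// A Map stands in for IndexedDB/LevelDB, which would be asynchronous in practice.

interface CrawlEntry {
  indexed: boolean;         // the URL's data has been added to the index
  crawledAt: number | null; // timestamp of the last completed crawl, null if none
}

interface FetchedPage {
  url: string;
  links: string[]; // URLs found in the fetched data
  body: string;
}

class Housekeeper {
  // The "what to crawl" list, keyed by URL.
  readonly crawlList = new Map<string, CrawlEntry>();
  private addToIndex: (page: FetchedPage) => void;

  constructor(addToIndex: (page: FetchedPage) => void) {
    this.addToIndex = addToIndex;
  }

  // Put a URL on the list if it is not already there, and return its entry.
  enqueue(url: string): CrawlEntry {
    let entry = this.crawlList.get(url);
    if (!entry) {
      entry = { indexed: false, crawledAt: null };
      this.crawlList.set(url, entry);
    }
    return entry;
  }

  process(page: FetchedPage, now: number = Date.now()): void {
    // 1. Add new URLs from the fetched data to the "what to crawl" list.
    for (const link of page.links) this.enqueue(link);
    // 2. Do something with the fetched data (here: hand it to the indexer).
    this.addToIndex(page);
    // 3. Register that the URL has been processed, only after the add step returned.
    const entry = this.enqueue(page.url);
    entry.indexed = true;
    // 4. Register that the URL has been crawled, with a timestamp.
    entry.crawledAt = now;
  }

  // URLs on the list that have no completed crawl yet.
  pending(): string[] {
    const urls: string[] = [];
    this.crawlList.forEach((entry, url) => {
      if (entry.crawledAt === null) urls.push(url);
    });
    return urls;
  }
}

// Usage: one successful crawl, and one that is interrupted during the add step.
const indexedUrls: string[] = [];
const hk = new Housekeeper((page) => { indexedUrls.push(page.url); });
hk.process({ url: "https://example.com/", links: ["https://example.com/a"], body: "" }, 1000);

const interrupted = new Housekeeper(() => { throw new Error("interrupted"); });
interrupted.enqueue("https://example.com/x");
try {
  interrupted.process({ url: "https://example.com/x", links: [], body: "" });
} catch (err) {
  // The entry was never marked, so the URL stays pending and is crawled again later.
}
```

Because the bookkeeping happens last, the worst case after an interruption is a repeated crawl of the same URL, which matches the "overlapping crawling" trade-off described above.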

eklem / browsercrawler

Store in IndexedDB/LevelDB what is indexed #35