I added an extra command a while ago because I wanted to scrape the metadata for all galleries so I could hammer my own local db without getting banned. Therefore I added a fullsync command which iterates through all gallery pages and filters on id to check whether a gallery has already been indexed.
For (my own) convenience, I have added the cookiefile package as a dependency so I could easily export cookies from Chrome in a Netscape-formatted file and use that (via a cookies.netscape file).
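For reference, here is a minimal sketch of loading such a Netscape-format export. This assumes Python; the PR itself uses the cookiefile package, but the stdlib MozillaCookieJar parses the same format, so the idea is equivalent (the path cookies.netscape matches the file mentioned above):

```python
import http.cookiejar

def load_netscape_cookies(path="cookies.netscape"):
    """Load cookies exported from Chrome in the Netscape cookie-file format."""
    jar = http.cookiejar.MozillaCookieJar(path)
    # ignore_discard/ignore_expires keep session cookies the browser exported
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar
```

The returned jar can then be attached to whatever HTTP client does the scraping, so every call carries the logged-in session.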
The new fullsync command also uses the newly introduced uriCallInterval and startPage config keys. uriCallInterval is the sleep timer in seconds between each HTTP call, and startPage is the page offset from which scraping should start, in case the script crashes on page 700 or something.
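The loop those two keys drive can be sketched roughly like this. This is an illustrative sketch, not the actual implementation: fetch_page, index_gallery, and indexed_ids are hypothetical stand-ins for the real scraper pieces, while start_page and uri_call_interval mirror the config keys described above:

```python
import time

def full_sync(fetch_page, index_gallery, indexed_ids,
              start_page=1, uri_call_interval=2.0):
    """Walk gallery pages from start_page, skipping already-indexed ids,
    sleeping uri_call_interval seconds between HTTP calls."""
    page = start_page
    while True:
        galleries = fetch_page(page)   # one HTTP call per page
        if not galleries:
            break                      # ran past the last page
        for g in galleries:
            if g["id"] in indexed_ids:
                continue               # already indexed, skip it
            index_gallery(g)
            indexed_ids.add(g["id"])
        page += 1
        time.sleep(uri_call_interval)  # throttle so the site doesn't ban us
```

Setting start_page past the crash point lets a restart resume without re-walking the first few hundred pages.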
Split common sync logic into a basesync class for reuse.
Added fullsync command (with configurable sleep interval and page offset)