nexcess / magento-turpentine

A Varnish extension for Magento.
GNU General Public License v2.0
519 stars, 253 forks

Better crawler status visibility #44

Open aheadley opened 11 years ago

aheadley commented 11 years ago

Currently the only way to see what the crawler is doing is to turn on debugging and watch the system log. Need something better. Also need a way to re-crawl URLs after they expire from Varnish, and a way to force a full re-crawl without flushing the cache.

janiscaunecm commented 11 years ago

"Also need a way to re-crawl URLs after they expire from Varnish, and a way to force a full re-crawl without flushing the cache."

Currently I'm getting reasonable results by using a custom sitemap instead of the Turpentine helper methods, together with a cron job that flushes the URL queue cache daily. Generating the sitemap daily and setting the crawler to run every hour or so keeps most URLs cached. Separating the URL queue source from Turpentine also makes it possible to include 'non-native' URLs, such as blog pages.
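For readers who want to try the sitemap-driven approach outside of Turpentine, here is a minimal sketch of the idea: read the URLs from a sitemap and request each one so that Varnish caches it. This is not the setup described in the comment above and it does not use any Turpentine API; the function names (`parse_sitemap_urls`, `http_get`, `warm_urls`) are hypothetical, and it assumes the sitemap follows the standard sitemaps.org `urlset` format.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org <urlset> documents.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap_urls(xml_text):
    """Return the <loc> values found in a sitemaps.org urlset document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def http_get(url, timeout=10):
    """Fetch one URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def warm_urls(urls, fetch=http_get):
    """Request each URL once so the cache in front of the site stores it.

    Returns a dict mapping each URL to its status code, or to the repr of
    the exception if the request failed (hypothetical reporting format).
    """
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except Exception as exc:  # keep going; one bad URL should not stop the run
            results[url] = repr(exc)
    return results
```

A script like this could be run from cron every hour against a sitemap that is regenerated daily, matching the schedule described above. Because the URL list comes from the sitemap rather than from Magento, it can also contain pages such as blog posts.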

donnie-darko commented 11 years ago

Can you provide some detail about the 'custom sitemap' you use with the Turpentine crawler? I'd also appreciate some info about the flush cron job. I am very interested :)