eosnetworkfoundation / docsgen

Scripts for creating docs.eosnetwork.com: pulls from various EOS Network Foundation repos and builds static HTML to power the website

Update Crawler to use sitemap #42

Closed ericpassmore closed 2 years ago

ericpassmore commented 2 years ago

NOTE: An alternative is to use the Algolia crawler.

Edit crawler/getLinks.js to crawl https://docs.eosnetwork.com/sitemap.xml
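A rough sketch of what reading the sitemap could look like. The function names `extractSitemapUrls` and `fetchSitemapUrls` are hypothetical and not taken from the existing crawler/getLinks.js. It assumes Node 18+ (for the global `fetch`) and a flat sitemap whose entries use plain `<loc>` tags.

```javascript
// Hypothetical helper: pull every <loc> URL out of a sitemap XML string.
// Uses a simple regex rather than a full XML parser, so it does not decode
// XML entities (such as &amp;) or follow nested sitemap index files.
function extractSitemapUrls(xml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  for (const match of xml.matchAll(locPattern)) {
    urls.push(match[1]);
  }
  return urls;
}

// Hypothetical wrapper: download the sitemap and return its URLs.
// Assumes the global fetch available in Node 18 and later.
async function fetchSitemapUrls(
  sitemapUrl = "https://docs.eosnetwork.com/sitemap.xml"
) {
  const response = await fetch(sitemapUrl);
  if (!response.ok) {
    throw new Error(`Sitemap request failed with status ${response.status}`);
  }
  return extractSitemapUrls(await response.text());
}
```

Keeping the parsing separate from the download means the parsing step can be checked against a saved copy of the sitemap without hitting the live site.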

Most likely we want to crawl in groups by first-level directory. For example, a param of group=cdt would crawl

Filter out all /blog URLs; do not crawl those. Most likely we want to crawl the latest stable release, so you may need a version param as well.
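The group, /blog, and version rules above could be sketched roughly as follows. `selectUrls` is a hypothetical name, and treating the version as the second path segment (for example `/cdt/<version>/...`) is an assumption about the site layout that would need checking against the real sitemap.

```javascript
// Hypothetical filter: decide which sitemap URLs to crawl.
// - always skips anything whose first path segment is "blog"
// - if `group` is given, keeps only URLs under that first-level directory
// - if `version` is given, keeps only URLs whose second path segment
//   matches it (assumed layout, not confirmed by the issue)
function selectUrls(urls, { group, version } = {}) {
  return urls.filter((url) => {
    const segments = new URL(url).pathname.split("/").filter(Boolean);
    if (segments[0] === "blog") return false;
    if (group && segments[0] !== group) return false;
    if (version && segments[1] !== version) return false;
    return true;
  });
}
```

With this shape, a call such as `selectUrls(urls, { group: "cdt" })` would keep only pages under the `cdt` directory, and calling it with no options would drop only the /blog pages.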

Having our own crawler is faster and cheaper, but fragile.

ericpassmore commented 2 years ago

completed