Princeton-CDH / derrida-django

Derrida's Margins - Python/Django web application
https://derridas-margins.princeton.edu
Apache License 2.0

revise crawl script to create index and add to git #293

Closed by rlskoeser 3 years ago

rlskoeser commented 3 years ago

implemented in https://github.com/Princeton-CDH/cdh-ansible/pull/105

confirmed that the most recent crawl includes the base sitemap and the 404 and 500 pages

browsertrix does not seem to parse and crawl sitemaps
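
A minimal sketch of one possible workaround, not part of the change in the linked PR: since browsertrix does not seem to follow sitemaps, the URLs could be pulled out of the sitemap XML and supplied to the crawl as explicit seeds. The function name `sitemap_locs` and the example.org URLs are hypothetical; the only things assumed are the sitemaps.org namespace and the Python standard library.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locs(xml_text):
    """Return (kind, urls) for a sitemap document.

    kind is "index" when the root element is <sitemapindex> (the urls
    point at further sitemaps) and "urlset" otherwise (the urls are pages).
    """
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    urls = [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text  # skip empty <loc/> elements
    ]
    return kind, urls


if __name__ == "__main__":
    # Hypothetical sitemap content, for illustration only
    demo = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.org/page-one/</loc></url>"
        "<url><loc>https://example.org/page-two/</loc></url>"
        "</urlset>"
    )
    kind, urls = sitemap_locs(demo)
    print(kind)
    for url in urls:
        print(url)
```

For a sitemap index, each returned URL is itself a sitemap, so it would need to be fetched and passed through the same function to collect the page URLs.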