netarchivesuite / so-me

Social Media harvests
Apache License 2.0

Bundle webarchive_discovery config #3

Open tokee opened 6 years ago

tokee commented 6 years ago

Currently, the default config reference.conf from webarchive_discovery is used for indexing. Some of its settings lead to quite a heavy indexing workflow (facial detection, PDF validation). Instead of making multiple tweaks to the default config file, we should bundle our own.
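
A minimal sketch of what a bundled config could override, assuming the key names below match the reference.conf of the webarchive_discovery version in use. They are recalled from that file rather than verified here, so check them before relying on this. Because the format is HOCON, a bundled file only needs to list the settings that differ from the defaults:

```hocon
# Hypothetical bundled config for so-me.
# Key names are assumptions; verify against webarchive_discovery's reference.conf.
warc {
    index {
        extract {
            content {
                # Assumed key: skip PDF validation (Apache Preflight)
                extractApachePreflightErrors: false

                images {
                    # Assumed key: skip facial detection
                    detectFaces: false
                }
            }
        }
    }
}
```

Keeping the overrides in one bundled file would document which heavy features are switched off, instead of spreading the tweaks across edits to the default config.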