gwu-libraries / sfm-ui

Social Feed Manager user interface application.
http://gwu-libraries.github.io/sfm-ui
MIT License

Document how to do image extraction in processing container #753

Open lwrubel opened 7 years ago

lwrubel commented 7 years ago

JWAT tools and warctools are provided in the processing container. Investigate which of them can be used to extract images from WARC files, and document the steps for doing this.
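For reference while evaluating those tools, here is a rough sketch of what image extraction from a WARC involves. It does not use JWAT or warctools; it is a standard-library Python stand-in, and the function names (`open_warc`, `iter_warc_records`, `extract_images`) are hypothetical. It assumes per-record `Content-Length` headers and un-chunked, un-compressed HTTP bodies, so it is an illustration rather than a replacement for a real WARC library.

```python
import gzip


def open_warc(path):
    """Open a WARC file as a binary stream; .gz files are decompressed on the fly."""
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")


def iter_warc_records(stream):
    """Yield (headers, block) for each record in a decompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank lines separate consecutive records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line, got %r" % line)
        headers = {}
        for raw in iter(stream.readline, b""):
            if raw in (b"\r\n", b"\n"):
                break  # blank line ends the WARC header section
            name, _, value = raw.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        block = stream.read(int(headers["content-length"]))
        yield headers, block


def extract_images(stream):
    """Return (target_uri, mime_type, image_bytes) for each image response."""
    images = []
    for headers, block in iter_warc_records(stream):
        if headers.get("warc-type") != "response":
            continue
        # The record block is a raw HTTP response: status line, headers, body.
        http_head, sep, body = block.partition(b"\r\n\r\n")
        if not sep:
            continue
        mime = ""
        for raw in http_head.split(b"\r\n")[1:]:
            name, _, value = raw.decode("latin-1").partition(":")
            if name.strip().lower() == "content-type":
                mime = value.split(";")[0].strip().lower()
        if mime.startswith("image/"):
            images.append((headers.get("warc-target-uri", ""), mime, body))
    return images
```

Something like `extract_images(open_warc("example.warc.gz"))` (file name made up) would return the image payloads, which could then be written to disk under names derived from the target URI.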

justinlittman commented 7 years ago

Also look at https://github.com/chfoo/warcat. We may want to add it as an available tool in the processing container.

kerchner commented 7 years ago

This can be addressed after the issue of Heritrix web harvesting failing is resolved.

justinlittman commented 7 years ago

See also https://github.com/jaygattuso/WARC_dumper.