nasa-jpl-memex / memex-explorer

Viewers for statistics and dashboarding of Domain Search Engine data
BSD 2-Clause "Simplified" License

Getting images from the crawl #714

Closed: saloneerege closed this issue 8 years ago

saloneerege commented 8 years ago

I am trying to crawl weapons images. The Nutch build that Memex uses has a mimetypes.txt that only accepts text/html. Will I have to add image/* there so that it accepts all images as Memex crawls through the seed list?

ahmadia commented 8 years ago

Hi @saloneerege. That sounds right to me. @chrismattmann - do you have some thoughts on this?

chrismattmann commented 8 years ago

correct
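
For readers following along, here is a minimal sketch of the idea confirmed above: an allow-list that accepts text/html plus any image/* type. It is only an illustration in Python, not Nutch's actual filter code, and it assumes the list is matched with glob-style patterns, which the thread does not confirm. The names `ALLOWED_PATTERNS` and `is_accepted` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical allow-list mirroring the change discussed in the thread:
# keep text/html and add a wildcard entry for every image subtype.
ALLOWED_PATTERNS = ["text/html", "image/*"]


def is_accepted(mime_type, patterns=ALLOWED_PATTERNS):
    """Return True if mime_type matches any pattern in the allow-list."""
    # Strip parameters such as "; charset=utf-8" and normalise case,
    # since MIME type names are case-insensitive.
    base = mime_type.split(";", 1)[0].strip().lower()
    return any(fnmatchcase(base, pattern) for pattern in patterns)
```

With this list, `is_accepted("image/jpeg")` and `is_accepted("text/html; charset=utf-8")` are both true, while `is_accepted("application/pdf")` is false. Whether the real mimetypes.txt accepts a wildcard entry or needs each image type listed separately should be checked against the Nutch version in use.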