hellock / icrawler

A multi-thread crawler framework with many builtin image crawlers provided.
http://icrawler.readthedocs.io/en/latest/
MIT License
857 stars 174 forks

Adding support for images extensions #6

Closed Lightjohn closed 8 years ago

Lightjohn commented 8 years ago

Hi, we now guess the extension from the URL. I took care of removing the query parameters and left a default value, because Baidu had some cases with no extension.
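
To illustrate the idea, here is a minimal sketch of this kind of extension guessing. It is not the code from this pull request: the function name `guess_extension`, the `KNOWN_EXTENSIONS` whitelist, and the `jpg` default are all assumptions made for the example.

```python
import os
from urllib.parse import urlsplit

# Hypothetical whitelist and default value; neither is taken from the pull request.
KNOWN_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp", "webp"}
DEFAULT_EXTENSION = "jpg"


def guess_extension(url, default=DEFAULT_EXTENSION):
    """Guess an image file extension from a URL, ignoring query and fragment."""
    # urlsplit separates the path from the '?query' and '#fragment' parts,
    # so parameters such as '?w=100' never reach the extension check.
    path = urlsplit(url).path
    # splitext returns (root, '.ext'); drop the leading dot and normalise case.
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    # Fall back to the default when the extension is missing or not an image type.
    return ext if ext in KNOWN_EXTENSIONS else default
```

With these assumptions, a URL such as `http://example.com/a/b.PNG?w=100` would yield `png`, while a URL with no extension at all would fall back to the default.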

Lightjohn commented 8 years ago

I found some strange cases, so here is an update with more robust detection and your modification.