fimad / scalpel

A high level web scraping library for Haskell.
Apache License 2.0
323 stars 43 forks

(closes #11) Add scrapeUrlWithConfig #15

Closed by fimad 8 years ago

fimad commented 8 years ago

Adds a new, more extensible function for scraping URLs. It takes a configuration record, so new fields can be added in the future to further modify how URLs are scraped.
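To illustrate why a record makes this extensible, here is a minimal sketch. Every name in it (`Response`, `Decoder`, `Config`, `defaultConfig`, `verboseConfig`, and the field names) is a hypothetical stand-in and is not taken from the scalpel source; the curl options are simplified to plain strings and the response body to a `String`.

```haskell
-- Hypothetical stand-ins; none of these names come from the scalpel source.
data Response = Response
  { responseContentType :: Maybe String  -- raw Content-Type header, if present
  , responseBody        :: String        -- body, simplified to a String here
  }

-- A decoder turns a response into the text that will be scraped.
type Decoder = Response -> String

data Config = Config
  { configOptions :: [String]  -- placeholder for the list of curl options
  , configDecoder :: Decoder   -- how to turn a response into text
  }

-- Defaults for every field. Callers start from this value.
defaultConfig :: Config
defaultConfig = Config
  { configOptions = []
  , configDecoder = responseBody
  }

-- Record update syntax overrides only the named field, so a caller written
-- this way keeps compiling when Config later gains more fields.
verboseConfig :: Config
verboseConfig = defaultConfig { configOptions = ["--verbose"] }
```

The design point is that callers build their configuration from a default value rather than from the constructor, so adding a field only requires giving it a default.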

Besides the list of curl options, the record has a new decoder option. The decoder is a function that takes a curl response and turns it into a string-like type. The default decoder attempts to infer the correct encoding from the charset parameter of the Content-Type header. If no character set is given, it assumes ISO-8859-1.

There are also decoders that force UTF-8 or ISO-8859-1, for use with websites that declare their character sets incorrectly or not at all.
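The decoders described above could look roughly like the following sketch. It assumes the `bytestring` and `text` packages and uses a simplified, hypothetical `Response` type instead of a real curl response; the names `charsetOf`, `utf8Decoder`, `iso88591Decoder`, and `defaultDecoder` are illustrative only. Treating any charset other than UTF-8 as ISO-8859-1 is a simplification made for this sketch, not something the description above specifies.

```haskell
import qualified Data.ByteString as B
import Data.Char (isSpace, toLower)
import Data.List (dropWhileEnd, isPrefixOf)
import Data.Text (Text)
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical, simplified response: only the parts a decoder needs.
data Response = Response
  { contentType :: Maybe String  -- raw Content-Type header, if present
  , body        :: B.ByteString  -- undecoded response bytes
  }

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Extract the lower-cased charset parameter from a Content-Type value,
-- e.g. "text/html; charset=UTF-8" yields Just "utf-8".
charsetOf :: String -> Maybe String
charsetOf header =
  case [drop 8 p | p <- params, "charset=" `isPrefixOf` p] of
    (c : _) -> Just (filter (/= '"') c)
    []      -> Nothing
  where
    params = map (map toLower . trim) (splitOn ';' header)
    trim   = dropWhileEnd isSpace . dropWhile isSpace

-- Always decode as UTF-8, replacing invalid byte sequences.
utf8Decoder :: Response -> Text
utf8Decoder = TE.decodeUtf8With lenientDecode . body

-- Always decode as ISO-8859-1 (each byte maps to one code point).
iso88591Decoder :: Response -> Text
iso88591Decoder = TE.decodeLatin1 . body

-- Use the declared charset when it is UTF-8. Fall back to ISO-8859-1 when
-- no charset is given (and, as a simplification here, for any other charset).
defaultDecoder :: Response -> Text
defaultDecoder r = case contentType r >>= charsetOf of
  Just "utf-8" -> utf8Decoder r
  _            -> iso88591Decoder r
```

For a site that serves UTF-8 bytes without declaring a charset, `defaultDecoder` in this sketch would fall back to ISO-8859-1 and produce mojibake, which is the case where passing `utf8Decoder` explicitly helps.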