cpp-netlib / cpp-netlib

The C++ Network Library Project -- cross-platform, standards compliant networking library.
http://cpp-netlib.org/
Boost Software License 1.0

Develop an example using the HTTP client as a web crawler #108

Open glynos opened 12 years ago

glynos commented 12 years ago

A good example of the HTTP client would be as a web crawler. This could also be a good demonstration of the flexibility of the URI class.

deanberris commented 12 years ago

This sounds like a good idea. However, I'm afraid of the effort that would need to go into something like this. Parsing HTML is scary stuff, and finding the URLs and interpreting rel="nofollow" in a DOM, along with the myriad other link elements in a document, is a tall order.

That said, I'd be willing to accept contributions to this effect.
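To give a rough sense of the link-finding step, here is a minimal sketch that uses only `std::regex` from the standard library and no real HTML parser. The helper name `extract_links` is hypothetical and not part of cpp-netlib. It assumes the page body has already been fetched into a string, and it only looks at `<a>` tags, so the other link elements mentioned above are out of scope.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of cpp-netlib): collect the href values of
// <a> tags found in `html`, skipping anchors whose rel attribute contains
// "nofollow". Regex-based, so it is an approximation, not an HTML parser.
std::vector<std::string> extract_links(const std::string& html) {
    // Opening <a ...> tag; group 1 holds the raw attribute text.
    static const std::regex anchor_re(R"re(<a\s([^>]*)>)re", std::regex::icase);
    // href="..." or href='...'; group 1 or group 2 holds the value.
    static const std::regex href_re(
        R"re(href\s*=\s*(?:"([^"]*)"|'([^']*)'))re", std::regex::icase);
    // rel attribute whose quoted value mentions nofollow.
    static const std::regex nofollow_re(
        R"re(rel\s*=\s*["'][^"']*nofollow)re", std::regex::icase);

    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), anchor_re), end;
         it != end; ++it) {
        const std::string attrs = (*it)[1].str();
        if (std::regex_search(attrs, nofollow_re)) {
            continue;  // honour rel="nofollow"
        }
        std::smatch m;
        if (std::regex_search(attrs, m, href_re)) {
            links.push_back(m[1].matched ? m[1].str() : m[2].str());
        }
    }
    return links;
}
```

A crawler would still need to resolve each returned value against the page's base URI before fetching it, and a regex like this will miss unquoted attributes and be fooled by markup inside comments or scripts, which is a fair illustration of why proper parsing is the hard part.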

glynos commented 12 years ago

My motivation for opening this issue is to start generating more complex and interesting examples for the HTTP client. At this stage I am not worried about the complexity of the HTML parsing.

tex commented 6 years ago

Perhaps I'm raising a zombie, but nonetheless, I created this example (https://github.com/tex/cpp-netlib-example). Would you consider it a good example of using cpp-netlib as a simple web crawler?