hmol / LinkCrawler

Find broken links in webpages
MIT License

In Console output, when finished -> write elapsed time #19

Open · timstarbuck opened this issue 8 years ago

timstarbuck commented 8 years ago

Nice project! I added a counter with Interlocked.Increment/Decrement and a loop to wait for completion. Then I added an IOutput method that writes a string (currently only implemented in the console output).

Seems like you could also raise an event so the driving program knows it's finished.

Let me know what you think!

[Screenshot: linkcrawlerelapsed]
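
For reference, a minimal sketch of the approach described above. This is only an illustration, not the actual PR code: the class, field, and method names are invented, and a plain `Action<string>` stands in for the repository's IOutput abstraction.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public class CrawlCompletionSketch
{
    private readonly Stopwatch _stopwatch = Stopwatch.StartNew();
    private int _pendingRequests;

    // Called when a request is sent.
    public void OnRequestStarted() => Interlocked.Increment(ref _pendingRequests);

    // Called when a response (or failure) comes back.
    public void OnRequestFinished() => Interlocked.Decrement(ref _pendingRequests);

    // Blocks until every outstanding request has completed,
    // then writes the total elapsed time through the supplied writer.
    public void WaitForCompletionAndReport(Action<string> writeInfo)
    {
        while (Interlocked.CompareExchange(ref _pendingRequests, 0, 0) > 0)
        {
            Thread.Sleep(100);
        }
        writeInfo($"Crawl finished. Elapsed time: {_stopwatch.Elapsed}");
    }
}
```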

loneshark99 commented 8 years ago

@timstarbuck

How about this way of doing it? Do you see any differences?

https://github.com/hmol/LinkCrawler/pull/18/commits/f4f3b35822d3abe3bf9c8bc47a95f19f207a82f7

You can pull from here.

https://github.com/loneshark99/LinkCrawler

timstarbuck commented 8 years ago

I guess it depends on what we are trying to actually time :)

Showing the elapsed time of each request seems beneficial, but I read "when finished" differently: I took it to mean when the crawl has finished checking all the links.
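
For the per-request interpretation, something along these lines would do. This is an illustrative sketch only, not LinkCrawler's code; the HttpClient usage and names are assumptions. Each request gets its own Stopwatch and the duration is reported alongside the status code.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

public static class RequestTimingSketch
{
    private static readonly HttpClient Client = new HttpClient();

    // Checks a single link and reports how long the request itself took.
    public static async Task CheckLinkAsync(string url)
    {
        var watch = Stopwatch.StartNew();
        using (var response = await Client.GetAsync(url))
        {
            watch.Stop();
            Console.WriteLine($"{(int)response.StatusCode} {url} ({watch.ElapsedMilliseconds} ms)");
        }
    }
}
```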

loneshark99 commented 8 years ago

@timstarbuck true

hmol commented 8 years ago

Hey, and thanks for contributing to my repository :smile: I have tested your code and it works. It may just be that I'm not used to this type of code, but I get the feeling that this part is a bit "hacky":

```csharp
while (counter > 0)
{
    Thread.Sleep(100);
}
```

And the problem is that I don't really know (right now) how to do it another way, but I had something like this in mind: http://stackoverflow.com/a/25010220. I don't think we can use that here, because then we would need to have all the URLs to crawl at the start. Without having done any research, I have also thought about solving this by implementing some sort of queue that you add URLs to, while at the same time the program takes URLs from the queue and crawls them. What do you think?
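
Very roughly, the queue idea could look something like the sketch below. It is only an illustration of the concept, not code from the repository, and every name in it is made up. Completion is signalled with an event when the outstanding-work counter drops back to zero, so the driver can block on the event instead of sleeping in a loop, which also ties in with the earlier suggestion about raising an event.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class CrawlQueueSketch
{
    private readonly ConcurrentQueue<string> _urls = new ConcurrentQueue<string>();
    private readonly ManualResetEventSlim _done = new ManualResetEventSlim(false);
    private int _outstanding;

    // Discovered URLs are counted before they are queued, so the counter
    // can only reach zero when nothing is queued and nothing is in flight.
    public void Enqueue(string url)
    {
        Interlocked.Increment(ref _outstanding);
        _urls.Enqueue(url);
    }

    // The crawl delegate receives a URL and a callback for enqueueing any links it finds.
    public Task RunWorker(Action<string, Action<string>> crawl)
    {
        return Task.Run(() =>
        {
            string url;
            while (!_done.IsSet)
            {
                if (_urls.TryDequeue(out url))
                {
                    crawl(url, Enqueue);
                    if (Interlocked.Decrement(ref _outstanding) == 0)
                        _done.Set();   // all discovered URLs have been crawled
                }
                else
                {
                    Thread.Sleep(10);  // queue momentarily empty, but work may still be in flight
                }
            }
        });
    }

    // The driver blocks on the event instead of polling a counter.
    public void WaitForCompletion()
    {
        _done.Wait();
    }
}
```

The driver would enqueue the start URL, call RunWorker, and then call WaitForCompletion; because each discovered URL is counted before it is queued, the counter can only reach zero once every queued and in-flight URL has been processed.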

timstarbuck commented 8 years ago

Ha, yes. I felt it was a bit "hacky" as well, but as you mentioned, it works ;). It's a bit of a chicken-and-egg problem: you don't know how many links there are to crawl until they've all been crawled. I'll ponder your notes and see if I can think of another way.