golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.31k stars 17.7k forks source link

tour: Add tests for web crawler task #14480

Open amorgun opened 8 years ago

amorgun commented 8 years ago

Context: https://tour.golang.org/concurrency/10

I think web crawler task is harder than it looks. I saw a lot of solutions and all of them are wrong, even the official one. In order to prove that I have written some tests and script to run them on multiple solutions. You can see results here. I suggest adding these tests to the task the same way as in the word count task.

adg commented 8 years ago

We should probably rework the example to be something that is more achievable in the context of a web browser.

We should also fix the official solution if it is broken.

adg commented 8 years ago

@amorgun have you tried running your solution against Go 1.6? On my machine your test cases trigger a concurrent map write panic:

My solution: OK

Test case #3
found: a "node a"
found: b "node b"
found: c "node c"
My solution: OK

Test case #4
found: 0 "node #0"
found: 3 "node #3"
found: 1 "node #1 with very slow fetching"
found: 2 "node #2"
found: 4 "node #4"
found: 6 "node #6"
found: 5 "node #5"
My solution: OK

Test case #5
found: 0 "node #0"
found: 3 "node #3"
found: 1 "node #1"
found: 4 "node #4"
found: 5 "node #5"
found: 2 "node #2"
fatal error: concurrent map read and map writefound: 13 "node #13"

Your test harness has a bug:

type warningFetcher struct {
        fetcher        crawler.Fetcher
        alreadyFetched map[string]bool
        lock           sync.Mutex
}

func (f warningFetcher) Fetch(url string) (string, []string, error) {
        f.lock.Lock()
        defer f.lock.Unlock()
        if f.alreadyFetched[url] {
                fmt.Printf("WARNING: Url %s has been fetched multiple times\n", url)
        }
...

The Fetch method is declared on warningFetcher which has an embedded sync.Mutex. So when you're passing around a warningFetcher you are copying the mutex (defeating its purpose).

The fix is to declare the Fetch method on *warningFetcher and pass the warningFetcher around by pointer, not by value.

-func (f warningFetcher) Fetch(url string) (string, []string, error) {
+func (f *warningFetcher) Fetch(url string) (string, []string, error) {
        f.lock.Lock()
        defer f.lock.Unlock()
        if f.alreadyFetched[url] {
@@ -40,5 +41,5 @@ func (f warningFetcher) Fetch(url string) (string, []string, error) {
 }

 func newWarningFetcher(fetcher crawler.Fetcher) crawler.Fetcher {
-       return warningFetcher{fetcher: fetcher, alreadyFetched: make(map[string]bool)}
+       return &warningFetcher{fetcher: fetcher, alreadyFetched: make(map[string]bool)}
amorgun commented 8 years ago

@adg Thank you for pointing out the bug. I have fixed it as you suggested. I also have enabled build against Go 1.6 in travis so at least it works there. Let me know if you still have problems running it.

Windsooon commented 5 years ago

I will try to work on it from tomorrow.

Windsooon commented 5 years ago

I'm not sure we should add more tests for the example (since all the examples don't have tests). But I do think we should update the example to make it correct.