gocolly / colly

Elegant Scraper and Crawler Framework for Golang
https://go-colly.org/
Apache License 2.0
23.2k stars 1.76k forks

Program does not exit after crawling finishes: c.Wait() never returns and blocks forever (possible bug) #96

Closed: tx991020 closed this issue 6 years ago

tx991020 commented 6 years ago

package main

import (
    "time"

    "github.com/gocolly/colly"
    "github.com/gocolly/colly/debug"
)

func main() {

urls := []string{"https://weibo.cn/repost/FBrYpiw8h?uid=1153760245&rl=1", "https://weibo.cn/repost/FBrXSqrIl?uid=2137005731&rl=1", "https://weibo.cn/repost/FBrXOlMmQ?uid=5131689041&rl=1", "https://weibo.cn/repost/FBrXJBCQs?uid=1701023441&rl=1", "https://weibo.cn/repost/FBrXg4ZuX?uid=5999431007&rl=1", "https://weibo.cn/repost/FBrXcuadg?uid=5819066338&rl=1", "https://weibo.cn/repost/FBrWEgEor?uid=3517902151&rl=1","https://weibo.cn/repost/FBrWmuTYh?uid=2974402113&rl=1", "https://weibo.cn/repost/FBrVZtT1p?uid=5533885122&rl=1",
"https://weibo.cn/repost/FBrVrqA5T?uid=1613781965&rl=1", "https://weibo.cn/repost/FBrYpiw8h?uid=1153760245&rl=1", "https://weibo.cn/repost/FBrXSqrIl?uid=2137005731&rl=1", "https://weibo.cn/repost/FBrXOlMmQ?uid=5131689041&rl=1", "https://weibo.cn/repost/FBrXJBCQs?uid=1701023441&rl=1", "https://weibo.cn/repost/FBrXg4ZuX?uid=5999431007&rl=1", "https://weibo.cn/repost/FBrXcuadg?uid=5819066338&rl=1", "https://weibo.cn/repost/FBrWEgEor?uid=3517902151&rl=1", "https://weibo.cn/repost/FBrWmuTYh?uid=2974402113&rl=1",
"https://weibo.cn/repost/FBrVZtT1p?uid=5533885122&rl=1", "https://weibo.cn/repost/FBrVrqA5T?uid=1613781965&rl=1", "https://weibo.cn/repost/FBrUXncEG?uid=5046939400&rl=1"}

// Instantiate default collector
c := colly.NewCollector(
    // Turn on asynchronous requests
    colly.Async(true),
    // Attach a debugger to the collector
    colly.Debugger(&debug.LogDebugger{}),
)
// Give up on any request that takes longer than two seconds
c.SetRequestTimeout(2 * time.Second)

// Queue every URL; with Async(true) the requests run in the background
for _, v := range urls {
    c.Visit(v)
}

// Block until all queued requests have finished.
// Reported behavior: this call never returns.
c.Wait()
}

asciimoo commented 6 years ago

@tx991020 thanks for the report. Could you verify that 15a3cff fixes the issue?