gocolly / colly

Elegant Scraper and Crawler Framework for Golang
https://go-colly.org/
Apache License 2.0
23.4k stars 1.77k forks

Log in with post does not return response #532

Closed chunfytseng closed 4 years ago

chunfytseng commented 4 years ago

Logging in with POST does not return a response; it returns an error instead. I need the response, because whether the login succeeded is determined from the body, but the program never executes c.OnResponse(), c.OnError(), or c.OnRequest(), and I don't know why. While debugging I traced the request into colly.go. What should I do to get at the response data that is available in func (c *Collector) fetch(...)?

    c := colly.NewCollector(
        colly.UserAgent(uA),
        colly.AllowedDomains("www.xxx.net", "xxx.net"),
        colly.MaxDepth(1),
    )
    err := c.Post("http://www.xxx.net/cmsLogin.jsp", map[string]string{"login_name": "admin", "login_pass": "pass"})
    // never called
    if err != nil {
        loger.Fatalf("login fail. err:%s \n", err)
    }

    // OnRequest is never called
    c.OnRequest(func(rq *colly.Request) {
        fmt.Println("OnRequest")
    })

    // OnError is never called
    c.OnError(func(rp *colly.Response, e error) {
        loger.Fatalf("err rp: %s %s \n", string(rp.Body), e)
    })

    // OnResponse is never called
    c.OnResponse(func(rp *colly.Response) {
        if bodyStr := strings.TrimSpace(string(rp.Body)); bodyStr == "[{id:1}]" {
            loger.Fatalln("login fail.")
        }
        loger.Println("login success.")
    })

    //c.Visit("http://www.xxx.net/index.jsp")
    c.Wait()

asciimoo commented 4 years ago

Log in with post does not return response

You perform the POST request before attaching the OnResponse callback, so the callback won't be called for that request. Does this answer your question?

chunfytseng commented 4 years ago

yeah.

chunfytseng commented 4 years ago

Can multiple accounts log in and crawl at the same time?

asciimoo commented 4 years ago

Can multiple accounts log in and crawl at the same time?

Yes, but you have to use multiple collectors in this case, because one collector uses one cookie jar, so it can handle only one session per site.

chunfytseng commented 4 years ago

Instead of using the collector's Clone()? You must create a new collector?

asciimoo commented 4 years ago

You must create a new collector?

Create a new collector; Clone() shares the cookie jar between the collectors.

chunfytseng commented 4 years ago

thx.

IgorDePaula commented 11 months ago

You must create a new collector?

Create a new collector; Clone() shares the cookie jar between the collectors.

How?