scrapy-plugins / scrapy-playwright

🎭 Playwright integration for Scrapy
BSD 3-Clause "New" or "Revised" License

Cookie Seemed to be destroyed in between #196

Closed by Binit-Dhakal 1 year ago

Binit-Dhakal commented 1 year ago

Description

I am trying to log in to https://countyfusion5.kofiletech.us/countyweb/loginDisplay.action?town=&countyname=Franklin by clicking the "Login as a guest" button using scrapy-playwright, but I am redirected to https://countyfusion5.kofiletech.us/countyweb/loginDisplay.action?countyname=Base&errormsg=error.invalid.struts2.token. With Playwright alone there is no problem; the issue only appears with scrapy-playwright.

Code to reproduce

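import scrapy
from scrapy_playwright.page import PageMethod


class SfranklinohioSpider(scrapy.Spider):
    name = 'sFranklinOhio'

    def start_requests(self):
        yield scrapy.Request(
            "https://countyfusion5.kofiletech.us/countyweb/loginDisplay.action?countyname=Franklin",
            meta={
                'playwright': True,
                "playwright_include_page": True,
                "playwright_page_methods": [
                    # Click the "Login as a guest" button
                    PageMethod("click", "input.basebold1"),
                ],
                # Context defined in PLAYWRIGHT_CONTEXTS; records a HAR file
                "playwright_context": "har_saver"
            },
            callback=self.login_as_guest
        )

    async def login_as_guest(self, response):
        # The page object is available because playwright_include_page is True
        page = response.meta["playwright_page"]
        await page.wait_for_timeout(10000)
        await page.close()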

My takeaway

I used har_saver as the Playwright context, configured with this code

PLAYWRIGHT_CONTEXTS = {
  "har_saver": {
    'record_har_path': 'check_franklin.har'
  },
}

to inspect the HAR file, and I can see that the cookies are not sent with the login POST request. I don't know the reason for this, but I would appreciate it if anyone could help or point me in the right direction.
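The HAR check described above can also be scripted rather than done by eye. The following is a minimal sketch, assuming the standard HAR layout (log.entries[].request with method, url, headers and cookies fields); the function name post_requests_without_cookies is hypothetical and not part of scrapy-playwright or Playwright.

```python
import json
from pathlib import Path


def post_requests_without_cookies(har):
    """Return the URLs of POST requests in a HAR log that carry no cookies.

    Hypothetical helper for illustration; assumes the standard HAR layout.
    """
    missing = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        if request.get("method", "").upper() != "POST":
            continue
        header_names = {h.get("name", "").lower() for h in request.get("headers", [])}
        # HAR stores cookies both as a raw header and as a parsed "cookies" list;
        # either one counts as evidence that cookies were sent.
        if "cookie" not in header_names and not request.get("cookies"):
            missing.append(request.get("url", ""))
    return missing


if __name__ == "__main__":
    # Path taken from the record_har_path value in PLAYWRIGHT_CONTEXTS.
    har_path = Path("check_franklin.har")
    if har_path.exists():
        har = json.loads(har_path.read_text(encoding="utf-8"))
        for url in post_requests_without_cookies(har):
            print("POST without cookies:", url)
```

If the login POST URL shows up in the output, the request left the browser without its session cookie, which matches the redirect to the invalid-token error page.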

Please let me know if I need to provide anything more to help you reproduce this bug.

Thank you, Binit

Binit-Dhakal commented 1 year ago

The same issue was raised in #149, and the answer in https://github.com/scrapy-plugins/scrapy-playwright/issues/149#issuecomment-1352446175 (setting PLAYWRIGHT_PROCESS_REQUEST_HEADERS=None) did the trick for me. Thank you
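For anyone landing here, the fix is a one-line change in the project settings. The sketch below assumes a typical scrapy-playwright settings.py; the DOWNLOAD_HANDLERS and TWISTED_REACTOR lines are the usual setup from the project README and were not shown in this issue. With the setting at None, header handling is left entirely to Playwright, so headers and cookies set on the Scrapy request are ignored and the browser sends its own.

```python
# settings.py (sketch; handler and reactor lines are assumed, not from this issue)
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Do not override the browser's request headers with Scrapy's headers.
# The browser then sends its own session cookies with the login POST.
PLAYWRIGHT_PROCESS_REQUEST_HEADERS = None

# Context used in the reproduction code; records traffic to a HAR file.
PLAYWRIGHT_CONTEXTS = {
    "har_saver": {
        "record_har_path": "check_franklin.har",
    },
}
```

The trade-off is that anything passed through Request.headers or Request.cookies no longer reaches the browser, so this suits sites where the browser session itself must carry the state.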