scrapy-plugins / scrapy-playwright

🎭 Playwright integration for Scrapy
BSD 3-Clause "New" or "Revised" License

Can the 'context_launch_lock' in the '_create_page' function be safely removed? #238

Closed: Yoshiya1997 closed this issue 7 months ago

Yoshiya1997 commented 8 months ago
    async def _create_page(self, request: Request, spider: Spider) -> Page:
        """Create a new page in a context, also creating a new context if necessary."""
        context_name = request.meta.setdefault("playwright_context", DEFAULT_CONTEXT_NAME)
        # this block needs to be locked because several attempts to launch a context
        # with the same name could happen at the same time from different requests
        local_time = time()
        print("spider_start", request.url, context_name)
        # async with self.context_launch_lock:
        ctx_wrapper = self.context_wrappers.get(context_name)
        if ctx_wrapper is None:
            ctx_wrapper = await self._create_browser_context(
                name=context_name,
                context_kwargs=request.meta.get("playwright_context_kwargs"),
                spider=spider,
            )
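
For illustration, the race that the code comment above describes (several requests trying to launch a context with the same name at the same time) can be reproduced with a toy sketch. This is not scrapy-playwright code: `ContextRegistry`, `count_launches` and `demo` are hypothetical names, and `asyncio.sleep` stands in for the awaitable work of actually launching a browser context.

```python
import asyncio
from typing import Dict, List


class ContextRegistry:
    """Toy stand-in for the handler's context bookkeeping (hypothetical)."""

    def __init__(self, use_lock: bool) -> None:
        self.contexts = {}
        self.launch_count = 0
        self.use_lock = use_lock
        self.lock = asyncio.Lock()

    async def _launch(self, name: str) -> object:
        # Simulates the slow, awaitable launch of a browser context.
        self.launch_count += 1
        await asyncio.sleep(0.01)
        ctx = object()
        self.contexts[name] = ctx
        return ctx

    async def _check_then_create(self, name: str) -> object:
        # Check-then-create: other tasks can run while _launch() is awaiting.
        ctx = self.contexts.get(name)
        if ctx is None:
            ctx = await self._launch(name)
        return ctx

    async def get_or_create(self, name: str) -> object:
        if self.use_lock:
            async with self.lock:
                return await self._check_then_create(name)
        return await self._check_then_create(name)


async def count_launches(use_lock: bool, names: List[str]) -> int:
    registry = ContextRegistry(use_lock)
    await asyncio.gather(*(registry.get_or_create(n) for n in names))
    return registry.launch_count


def demo() -> Dict[str, int]:
    same = ["default"] * 5
    unique = ["ctx-%d" % i for i in range(5)]
    return {
        "same_name_no_lock": asyncio.run(count_launches(False, same)),
        "same_name_with_lock": asyncio.run(count_launches(True, same)),
        "unique_names_no_lock": asyncio.run(count_launches(False, unique)),
    }


if __name__ == "__main__":
    print(demo())
```

In this toy model, five concurrent requests for the same name launch five contexts without the lock and one with it, while five distinct names launch five contexts either way. It only models the name collision; it says nothing about other state the real handler may protect with the same lock.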

Is there any risk in removing the lock if I create every context under a unique UUID name and launch only one page in each context?
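
A minimal sketch of the setup described in the question, assuming the context name is passed through the `playwright_context` request meta key that `_create_page` reads. The helper `unique_context_meta` and the `ctx-` prefix are hypothetical, introduced only for this example.

```python
import uuid
from typing import Any, Dict, Optional


def unique_context_meta(context_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build request meta that asks for a fresh, uniquely named context."""
    meta: Dict[str, Any] = {
        "playwright": True,  # route the request through the Playwright handler
        "playwright_context": "ctx-%s" % uuid.uuid4(),  # unique name per request
    }
    if context_kwargs is not None:
        # Forwarded as keyword arguments when the context is created.
        meta["playwright_context_kwargs"] = context_kwargs
    return meta
```

It would be used as `scrapy.Request(url, meta=unique_context_meta())`. Unique names mean no two requests ask for the same context, but that alone does not establish that the lock is safe to remove.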

(edited for syntax highlighting)

elacuesta commented 8 months ago

Please elaborate on the motivation for this.