scrapy-plugin Search Results

microsoft/playwright-python #1170

[Feature] A way to prevent SIGINT (cmd+c) being passed to th…

I have been using Playwright with the Scrapy web scraping framework (plugin: https://github.com/scrapy-plugins/scrapy-playwright). Scrapy is designed to shut down cleanly on SIGINT, saving…

samwillis updated 3 months ago

scrapy/scrapy #5510

Support per-request download handler override

It would be great if a plugin like https://github.com/scrapy-plugins/scrapy-playwright did not have to force you to drive all requests through its download handlers, and instead you could drive certain…

Gallaecio updated 2 months ago

scrapy-plugins/scrapy-playwright #243

Cannot download binary file (PDF) with Chromium headless=new…

I am facing an issue when using Chromium to download a PDF file: the response.body is the viewer plugin HTML, not the bytes. There's already a related fix here: https://github.com/s…

tommylge updated 8 months ago

scrapinghub/scrapinghub-entrypoint-scrapy #78

Missing `Parent Request #`, `Duration`, and `Response Size` …

Currently, the requests coming from `scrapy_zyte_api.providers.ZyteApiProvider` don't create the **Parent Request #** field in Scrapy Cloud. In the example above, Request 1 should have a **Pa…

BurnzZ updated 5 months ago

scrapy/scrapy #5111

A doubt about "Sharing the root directory between projects"

I was going to ask this question on StackOverflow, but I could not because of the Chinese internet, so I have to ask it here. If this is not in compliance, I am sorry about it. I'm learning…

nowari updated 1 year ago

scrapy-plugins/scrapy-splash #270

Review url being optional for SplashRequest

As part of https://github.com/scrapy-plugins/scrapy-splash/pull/269, the `url` parameter to `SplashRequest` is no longer optional. @elacuesta noticed that this is a backward-incompatible change. Mo…

Gallaecio updated 3 years ago

github/codeql #2455

LGTM.com - false positive when mixin init calls super().…

In this case:
```python
class A:
    def __init__(self):
        pass

class B:
    def __init__(self):
        super(B, self)

class C(B, A):
    pass
```
LGTM reports that `A.__init…

Gallaecio updated 2 years ago

scrapy-plugins/scrapy-zyte-smartproxy #94

Blacklist domains

I was setting up autoextract in Scrapy Cloud on a project with the crawlera addon. Autoextract queries were routed through crawlera. The idea is to blacklist the autoextract domain by default. It may make sense for…

whalebot-helmsman updated 3 years ago

scrapinghub/splash #1086

splash cookie login not open ?

![image](https://user-images.githubusercontent.com/43572770/100686233-d7a75980-33b8-11eb-8c62-8484a15881eb.png) Clone issue: https://github.com/scrapy-plugins/scrapy-splash/issues/272

hanwei996 updated 3 years ago

kagxin/blog #36

Scrapy notes

* Onion network: Tor * Pick a proxy at random: [scrapy-proxies](https://github.com/aivarsk/scrapy-proxies) * Some free proxies: [xiaoer](http://www.xiaoerdaili.com/) [西刺](https://www.xicidaili.com/) * Paid integrated proxy: [scrapy-crawlera](https://…

kagxin updated 4 years ago

953 results for scrapy-plugin
