-
I'm currently trying to use snscrape to download Tweets from Twitter. According to my calculations, I should be getting around 2,200,000 Tweets in total by the time it finishes. I'm concerned about th…
-
## Environment
Version: 0.7.4
High number of applications to scrape.
For updates, see also #170.
## Summary / Problem Statement
We need rate limiting on CFCC metadata calls.
## Observ…
-
*Description*
It would be very helpful to expose a histogram (at least average, P95, P99, MAX) for the number of SSTables read in the context of a single CQL read.
This is very helpful when we wa…
-
Assume the crawler has set `allowed_domains` to the list below:
`self.allowed_domains = ['albert.zgora.pl']`
Scrapy shouldn't go beyond the 'albert.zgora.pl' domain.
But it goes to:
https://www.tumblr.com/wi…
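
To make the expected behaviour concrete, here is a minimal sketch of the host check I would expect for each outgoing request. It does not use Scrapy itself; `is_allowed` is a hypothetical helper written only for illustration, and treating subdomains as allowed is an assumption about how the filter is meant to behave.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Return True if the URL's host is an allowed domain or a subdomain of one.

    Hypothetical helper for illustration only; not part of Scrapy's API.
    """
    host = (urlparse(url).hostname or "").lower()
    for domain in allowed_domains:
        domain = domain.lower()
        # Exact match, or a subdomain such as "sub.albert.zgora.pl".
        if host == domain or host.endswith("." + domain):
            return True
    return False


allowed = ["albert.zgora.pl"]
print(is_allowed("https://albert.zgora.pl/page", allowed))
print(is_allowed("https://www.tumblr.com/", allowed))
```

Under this check a request to any tumblr.com URL would be rejected, which is the behaviour I expected from the crawler.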
-
Hi,
I've set it up in a Docker container and it's working fine, but some statistics are missing. For example, Libraries doesn't show up, while other things seem to be working properly. Do I need Tdar…
-
### Problem description
I use nova with an SMB server that auto re-scans on every app launch. This used to work flawlessly in the past, but not anymore.
Now, whenever I browse Episodes by date, …
-
This post follows an issue I posted in truckblocks-docker regarding rpc connections to erigon breaking, and further discussion in the Erigon discord. In an effort to overcome the issues I explored bat…
-
I've found real-world editing traces to have very different performance profiles than simulated (random) editing traces. For diamond types, I've ended up writing [a script](https://github.com/josephg/…
-
Hi Matt, I'm trying to scrape subreddit posts within a time period of six months, with a limit set to none. However, after irregular periods of time, the connection apparently breaks. Following is…
-
# Feature Proposal
This request is inspired by the recent security feature that was added by NPM.
In May of 2018, NPM added automatic dependency auditing support. When you run `npm install` the to…