-
Why not have a GitHub Action scrape all the sources, then update a single file or a series of YAML files in the repo? Since we are all scraping these sites, it might be easier on the websites, and l…
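A minimal sketch of the script such a scheduled job might run, assuming each source has its own scraper function that returns a list of strings. The names `render_yaml`, `update_file`, and the `Scraper` type are hypothetical and not from this repo. To stay within the standard library it quotes scalars with `json.dumps`, relying on JSON strings being valid YAML double-quoted scalars; a real setup would more likely use a YAML library.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical scraper type: returns the list of items found on one source.
Scraper = Callable[[], List[str]]


def render_yaml(results: Dict[str, List[str]]) -> str:
    """Render {source: [items]} as a simple YAML mapping of lists."""
    lines = []
    for source in sorted(results):  # sorted keys keep diffs stable between runs
        items = results[source]
        if not items:
            lines.append(f"{json.dumps(source)}: []")
            continue
        lines.append(f"{json.dumps(source)}:")
        for item in items:
            lines.append(f"  - {json.dumps(item)}")
    return "\n".join(lines) + "\n"


def update_file(path: Path, scrapers: Dict[str, Scraper]) -> bool:
    """Run every scraper once and rewrite `path` only if the output changed."""
    new_text = render_yaml({name: fn() for name, fn in scrapers.items()})
    old_text = path.read_text(encoding="utf-8") if path.exists() else None
    if new_text == old_text:
        return False  # nothing to commit
    path.write_text(new_text, encoding="utf-8")
    return True
```

The boolean return value lets the calling job skip the commit step when nothing changed, so the repo history only records real updates.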
-
As a user,
when I create and configure a scraper,
then I would like a task created to actually scrape the data.
So now that Flux has the `prometheus.scrape` endpoint, we can replace the existing scr…
-
Please add if possible:
https://takejav.net/
https://extreme-fetish.org/
https://japanfemdom.net/
-
Would it be possible to have better documentation for running with the example below?
I would like to process and transact from scraped data to your servers on my main server (better resources CPU …
-
I'm a pretty new developer, and I have read through all the getting-started guides, but I'm really lost. There are a lot of files, and I can't seem to find the one where I input a URL and it returns wi…
-
Make logging message templates consistent for all steps of all scrapers, e.g.
- After scraping a post and returning a ScraperResult
- After downloading an attachment
- After uploading an attachment…
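One way to keep the messages for the steps listed above consistent is to route them all through a single shared template. This is only a sketch using Python's standard `logging` module; the template format, the step names, and the helpers `format_step` and `log_step` are assumptions, not existing code in this project.

```python
import logging

# One template shared by every step of every scraper (hypothetical format).
STEP_TEMPLATE = "scraper=%s step=%s target=%s"

# Hypothetical step names, mirroring the steps listed above.
STEP_SCRAPED = "scraped_post"
STEP_DOWNLOADED = "downloaded_attachment"
STEP_UPLOADED = "uploaded_attachment"


def format_step(scraper: str, step: str, target: str) -> str:
    """Return the message exactly as it would appear in the log."""
    return STEP_TEMPLATE % (scraper, step, target)


def log_step(logger: logging.Logger, scraper: str, step: str, target: str) -> None:
    """Log one step at INFO level using the shared template."""
    # Arguments are passed separately so formatting is deferred until
    # the record is actually emitted.
    logger.info(STEP_TEMPLATE, scraper, step, target)
```

A scraper would then call, for example, `log_step(logger, "example_scraper", STEP_DOWNLOADED, "image.png")` instead of writing its own message string.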
-
- [x] https://timesofindia.indiatimes.com/home/headlines
- [x] https://www.ndtv.com/top-stories
- [x] https://www.hindustantimes.com/latest-news
- [ ] https://economictimes.indiatimes.com/news
- […
-
Some provinces and territories aggregate data (see the disabled `*_municipalities` scrapers in [scrapers-ca](https://github.com/opencivicdata/scrapers-ca/tree/master/disabled)). Depends on https://git…
-
I thought it might be handy to have a single issue to track requested scrapers in one place. Here are all the ones requested so far, both here and on https://feedback.xbvr.app/
- [x] VRConk
- [x] V…
-
### Motivation
In the past few months, we have seen AI scraping bots become more and more prevalent, especially ClaudeBot. Personally, I've seen it do as much as 10K requests in 24 hours, with some…