-
An error occurs when rendering PNG output for a large web page.
Server: AWS EC2, Ubuntu 14.04 LTS
The exception is raised at the following location:
```
splash/qtrender_image.py:514:qimage_to_pil_image
buf = qimag…
-
Blogs that require login will 302 redirect to https://www.tumblr.com/login_required/
We need a list of these, because currently these items don't fail.
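
One way to make such items fail is to inspect the redirect target instead of following it. The following is a minimal sketch, assuming the check can see the status code and `Location` header of the first response; the function name `is_login_required` is hypothetical and not taken from any existing code:

```python
from urllib.parse import urlsplit

# Path prefix taken from the redirect target described above.
LOGIN_REQUIRED_PREFIX = "/login_required"


def is_login_required(status, location):
    """Return True if a response is a 302 redirect to Tumblr's login wall.

    `status` is the HTTP status code and `location` is the Location header
    value (or None). Both parameter names are illustrative assumptions.
    """
    if status != 302 or not location:
        return False
    parts = urlsplit(location)
    return (
        parts.hostname == "www.tumblr.com"
        and parts.path.startswith(LOGIN_REQUIRED_PREFIX)
    )
```

Blogs for which this returns `True` could be collected into the list mentioned above and marked as failed.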
-
**Is your feature request related to a problem? Please describe.**
I'm scanning a modern web app, which uses jQuery to create the page. Since the web page uses a lot of external resources, I've set u…
-
SEO4Ajax stopped crawling after 14 posts. It crawled 40 results in total and then stopped (the final 15 were profile links). When I talked to SEO4Ajax support, they indicated that getting the crawler to compl…
-
Not sure how this would work yet, but if you have multiple sites in a single Umbraco instance, you might need to have different robots.txt files served up and handled. Particularly for the different s…
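
As a rough, language-agnostic illustration of the idea (written in Python, not Umbraco code), the robots.txt body could be selected by the request's `Host` header. The hostnames, rules, and the name `robots_for_host` below are all made up for the sketch:

```python
# Hypothetical per-host robots.txt bodies; hostnames and rules are made up.
ROBOTS_BY_HOST = {
    "site-a.example": "User-agent: *\nDisallow: /private/\n",
    "site-b.example": "User-agent: *\nDisallow:\n",
}

# Fallback served when the host is not configured.
DEFAULT_ROBOTS = "User-agent: *\nDisallow: /\n"


def robots_for_host(host_header):
    """Pick the robots.txt body for the value of an HTTP Host header."""
    # Drop an optional port and normalise case before the lookup.
    # (This simple split does not handle IPv6 literals.)
    host = host_header.split(":", 1)[0].strip().lower()
    return ROBOTS_BY_HOST.get(host, DEFAULT_ROBOTS)
```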
-
Thanks for making this available.
Can you explain a little how the system is meant to be used?
1) Is it correct that the system first fetches the data and, once the list of sites has been handled,…
-
### Description
I am reaching out as a representative of the ANTI ONLINE GAMBLING team from Pelita Bangsa University, working in collaboration with the Ministry of Communication and Information Tec…
-
### katana version: v1.0.4
### Current Behavior: Using the `-jsonl` and `-headless` options for a katana crawl results in an error: `[hybrid:RUNTIME] context deadline exceeded`
###…
-
Is it possible to redirect based on the `Accept-Language` header?
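
To make the request concrete, here is a minimal sketch of how a redirect target could be chosen from that header. It is not tied to any particular server or framework; `parse_accept_language`, `redirect_target`, and the language-to-path mapping are hypothetical names used only for illustration:

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    ranked = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # default weight when no q parameter is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        ranked.append((quality, tag.strip().lower()))
    # list.sort is stable, so equal weights keep their header order.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for quality, tag in ranked if quality > 0]


def redirect_target(header, supported, default):
    """Map the best supported primary language to a redirect path."""
    for tag in parse_accept_language(header):
        primary = tag.split("-", 1)[0]  # "en-gb" -> "en"
        if primary in supported:
            return supported[primary]
    return default
```

With `supported = {"en": "/en/", "de": "/de/"}`, a browser sending `de;q=0.7, en;q=0.8` would be sent to `/en/`, and a missing or unmatched header falls back to `default`.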
-
Define a robots.txt file to help search crawlers better understand our site and prevent them from crawling unnecessary links.
More information: https://moz.com/learn/seo/robotstxt
https://develo…
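
Before publishing the file, a draft can be checked with Python's standard `urllib.robotparser` module. This is a minimal sketch; the rules and paths in `SAMPLE_ROBOTS` are illustrative assumptions, not this site's actual layout:

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules only; the paths are assumptions, not the real site layout.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""


def build_parser(robots_text):
    """Parse robots.txt text into a RobotFileParser without any network I/O."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS)
```

Calling `parser.can_fetch("*", url)` then reports whether a crawler honouring these rules may request `url`.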