-
It refuses to accept the links from the archive.org sitemaps:
20:36:12,023 WARN [crawlercommons.sitemaps.SiteMapParser] (IdxTask) URL: https://archive.org/details/ARCHIVEIT-3490-TWELVE_HOURS-FHIFE…
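In case it helps triage: crawler-commons' SiteMapParser applies strict location checking by default, rejecting URLs that don't live under the sitemap's own location. A rough sketch of that kind of check follows (illustrative Python, not the actual Java implementation; the `example.com` URLs and the helper name are made up):

```python
from urllib.parse import urlparse
import posixpath

def url_allowed_by_sitemap(sitemap_url: str, candidate_url: str) -> bool:
    """Accept candidate_url only if it sits under the sitemap's directory.

    This mimics the strict cross-submission rule many sitemap parsers
    enforce; it is a simplified illustration, not crawler-commons' code.
    """
    sm = urlparse(sitemap_url)
    cand = urlparse(candidate_url)
    # Scheme and host must match exactly.
    if (sm.scheme, sm.netloc) != (cand.scheme, cand.netloc):
        return False
    # The URL's path must start with the directory containing the sitemap.
    base_dir = posixpath.dirname(sm.path)
    return cand.path.startswith(base_dir)

print(url_allowed_by_sitemap("https://example.com/a/sitemap.xml",
                             "https://example.com/a/page.html"))  # True
print(url_allowed_by_sitemap("https://example.com/a/sitemap.xml",
                             "https://example.com/b/page.html"))  # False
```

If the archive.org sitemap lists URLs outside its own path prefix, a check like this would explain the warnings.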
-
Hello, I'm using the dockerized version of goaccess. Static file generation mostly works, but I can't get the Docker image up and running.
My config file contains both the SSL certificate and the SSL key, l…
-
Using OSX 10.10.5, Python 2.7.11:
I ran `make localbuild` as specified in https://github.com/linkcheck/linkchecker/blob/master/doc/development.mdwn, and got the following build progress and warning…
-
My code:
```
import robots from 'robots';
const robotsParser = new robots.RobotsParser();
const url = 'https://google.com/robots.txt';
return new Promise((resolve, reject) =…
-
### Steps to reproduce
After running `rails new rails_playground --webpack` I'm getting this error:
```
Using --database=postgresql from /Users/yurii/.railsrc
create
create …
-
Navigate to: https://web.archive.org/web/http://bugs.chromium.org/p/project-zero/issues/detail?id=1139
See that the Wayback Machine says it's blocked by robots.txt:
![image](https://cloud.githubusercontent.…
-
I apologize if this is a generator-angular issue, but I've gone back and forth with so many fixes it's hard to keep all the wires straight.
```
yo --version && echo $PATH $NODE_PATH && node -e 'conso…
-
The first time we process a robots.txt file, or when we re-process it, we should see if there's a sitemap (or sitemaps). If so, then we could output the sitemap URL(s) as well as the URL being checked…
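That directive scan can be sketched as follows (hypothetical helper name; `Sitemap:` lines are global, case-insensitive, and may appear anywhere in the file, which is also what Python's own `urllib.robotparser` surfaces via `site_maps()` on 3.8+):

```python
def extract_sitemaps(robots_txt: str) -> list[str]:
    """Return every Sitemap URL declared in a robots.txt body."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Split on the first colon: "Sitemap: <url>".
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps

sample = """User-agent: *
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
sitemap: https://example.com/news-sitemap.xml
"""
print(extract_sitemaps(sample))
# ['https://example.com/sitemap.xml', 'https://example.com/news-sitemap.xml']
```

Note that `value.strip()` only recovers the URL up to the first colon split, which works because `partition` splits once and leaves the `https://…` colon intact in the value.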
-
Steps to reproduce:
1. `python setup.py sdist`
2. `virtualenv /tmp/env && /tmp/env/bin/pip install sdist/LinkChecker*`
Expected behavior:
- linkchecker is installed successfully
Actual beha…
-
Hello everyone,
thanks for looking into my issue!
- [x] This is a question about using the theme.
- [ ] This is a feature request.
- [ ] I believe this to be a bug with the theme.
- [x] I hav…