-
In an effort to subdue crawlers, add a `robots.txt` to `publicAPI/static`
http://stackoverflow.com/questions/14048779/with-flask-how-can-i-serve-robots-txt-and-sitemap-xml-as-static-files
Notes:…
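
For reference, a minimal sketch of what that file could contain and how to sanity-check it with the Python standard library. The disallow-all policy and the helper names `write_robots_txt` and `is_allowed` are assumptions for illustration; the issue does not say which crawlers or paths should be blocked.

```python
from pathlib import Path
from urllib.robotparser import RobotFileParser

# Assumption: "subdue crawlers" means asking every crawler to stay away.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def write_robots_txt(static_dir: str) -> Path:
    """Write robots.txt into static_dir (created if missing) and return its path."""
    target = Path(static_dir) / "robots.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(ROBOTS_TXT, encoding="utf-8")
    return target


def is_allowed(robots_body: str, user_agent: str, url: str) -> bool:
    """Ask the stdlib parser whether user_agent may fetch url under robots_body."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)
```

Crawlers request `/robots.txt` at the site root, so a file that lives under a static directory usually still needs a route or server rule that exposes it there, which is what the linked Stack Overflow question covers.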
-
The first site I want to crawl is "https://www.reddit.com/". Below are the CUJs to consider when designing our crawler:
* I want to store the crawled result in DB (Support [postgresql](https://www.postg…
-
It would be really, really neat if we could choose to inherit hiding from external search engines. If you can find a sub page through a search engine, it's really easy for users to find the parent page. …
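
A rough sketch of how such inheritance could be resolved, purely to illustrate the request. The `Page` class, its `hidden_from_search` field, and `is_hidden_from_search` are hypothetical names, not part of any existing code here; the assumption is that a page with no explicit setting takes the value of its nearest ancestor that has one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    title: str
    # None means "no explicit setting, inherit from the parent" (assumption).
    hidden_from_search: Optional[bool] = None
    parent: Optional["Page"] = None


def is_hidden_from_search(page: Page) -> bool:
    """Walk up the page tree and return the first explicit setting found."""
    node: Optional[Page] = page
    while node is not None:
        if node.hidden_from_search is not None:
            return node.hidden_from_search
        node = node.parent
    # No page in the chain set a value: default to visible.
    return False
```

With this shape, hiding a parent page hides every sub page that has not opted out explicitly, so a search engine cannot surface a child that leads back to a hidden parent.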
-
linkchecker ships a vendored copy of `linkcheck/robotparser2.py` from the Python standard library. I only found out about it after auditing the Makefile as part of #398. it seems it has significant di…
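
One way to audit a vendored copy like this is to diff it against the module shipped with the running interpreter. This is only a sketch: it assumes `urllib.robotparser` is the relevant upstream module (the vendored file may derive from an older version), and `diff_against_stdlib` is a hypothetical helper name.

```python
import difflib
import inspect
import urllib.robotparser
from typing import List


def diff_against_stdlib(vendored_source: str) -> List[str]:
    """Return a unified diff between the stdlib robotparser and vendored_source."""
    stdlib_source = inspect.getsource(urllib.robotparser)
    return list(
        difflib.unified_diff(
            stdlib_source.splitlines(),
            vendored_source.splitlines(),
            fromfile="urllib/robotparser.py",
            tofile="linkcheck/robotparser2.py",
            lineterm="",
        )
    )
```

An empty list means the two sources are identical; otherwise each entry is one line of the diff, which can be printed or counted to gauge how far the copy has drifted.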
-
last line in apache log file:
```
66.249.65.212 - - [20/Sep/2018:12:29:58 -0400] "GET /crcns/.git/objects/29/f8a0ae8c2ad4e7534b12f3cb68b9e8247b1933 HTTP/1.1" 200 1745 "-" "Mozilla/5.0 (Linux; Androi…
-
BPO | [35457](https://bugs.python.org/issue35457)
--- | :---
Nosy | @terryjreedy, @berkerpeksag, @tirkarthi, @andreburgaud
*Note: these values reflect the state of the issue at the time it was migrat…
-
### Description of the bug / feature
Since switching to the default Push transport (#10931) we always get an “Exception in push connection” when logging out.
Note: the exception is not thrown when…
-
robotspy==0.8.0
```python
import robots
content = """
User-agent: mozilla/5
Disallow: /
"""
check_url = "https://example.com"
user_agent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWe…
-
```
--- IN PROGRESS ---
```
Original issue reported on code.google.com by `m.zakrze...@gmail.com` on 25 Feb 2012 at 3:38
-
### Version
4.0.5
### Reproduction link
[https://www.npmjs.com/package/@vue/cli-plugin-pwa](https://www.npmjs.com/package/@vue/cli-plugin-pwa)
localhost:8080/service-worker.js is returning som…