-
run-spider execution
``` clj
(run-spider "http://yang-wei.github.io")
;; :parent http://yang-wei.github.io/archives, :link (http://yang-wei.github.io/blog/2016/02/08/time-travel-back-to-understand-…
-
I want to scrape a bunch of quests off the project1999 wiki, which is based on the old game EverQuest. The original questing structure for the game is actually Lua scripts, but instead of a strictly s…
-
Steps to Repro:
Use any URL with a port number, e.g. http://selfridges.cloudopsguys.com:81/
Result:
Error message "The input appears to be a DNS hostname but cannot match TLD against known list, Th…
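
For reference, a minimal sketch of how a validator could split the port off before checking the hostname. It uses Python's standard `urllib.parse`; the helper name `split_host_port` is hypothetical and is not taken from the project this report was filed against.

```python
from urllib.parse import urlsplit


def split_host_port(url):
    """Return (hostname, port) so hostname checks never see the ':81' suffix."""
    parts = urlsplit(url)
    # .hostname drops the port and lowercases the host; .port is an int or None.
    return parts.hostname, parts.port


host, port = split_host_port("http://selfridges.cloudopsguys.com:81/")
print(host, port)
```

With the port separated like this, only the bare hostname would be matched against the TLD list.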
-
WARNING: preg_match(): Compilation failed: invalid range in character class at offset 4 in /home/ymserver/vhost/ios_wall_ads/spider/protected/simplehtmldom/simple_html_dom.php on line 1372
It reports this error; how…
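
An "invalid range in character class" failure usually means a hyphen sits between a shorthand class such as `\w` and another character. The pattern on line 1372 is not shown in this report, so the sketch below uses a hypothetical pattern, and it uses Python's `re` module rather than PHP's PCRE, purely to illustrate the class of error and the usual fix (move the hyphen to the end of the class or escape it).

```python
import re

# Hypothetical patterns, not copied from simple_html_dom.php.
BROKEN = r"[\w-:]+"  # hyphen between \w and ':' is read as a range
FIXED = r"[\w:-]+"   # hyphen moved to the end, so it is a literal '-'


def compiles(pattern):
    """Return True if the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(BROKEN), compiles(FIXED))
```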
-
repro:
```
name: Unit
on:
  push:
jobs:
  smoke-test:
    name: Smoke
    runs-on: ubuntu-latest
    container:
      image: ghcr.io/mvorisek/image-php:${{ matrix.php }}
    strategy:
      …
-
Node is great and V8 is great.
But why not take Mozilla's Spider/Ion/Odin/Monkey and create a real alternative to Node? Namely something that would be async/callback Node-compatible but have the alt…
-
I have found authentication plugin behaviour to be broken.
According to [the documentation](http://docs.w3af.org/en/latest/authentication.html#form-authentication): "Authentication plugins are a spec…
-
```
Clone of the 1981 Night Stalker video game by Mattel Electronics
You're on the run. Your attackers are relentless robots. Destroy one and it's
replaced by an even smarter, faster robot. It's a n…
-
Phorum does not set the "Last-Modified" HTTP header on any pages (index or messages). This effectively prevents Google (Yahoo, etc.) from re-spidering the Phorum, since after the first access they in…
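
As a rough illustration of the behaviour being asked for, here is a minimal sketch of conditional-request handling, assuming the page's last change time is known. It is written in Python rather than Phorum's PHP, and the function name `conditional_response` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified`.

    `last_modified` must be a datetime with tzinfo=timezone.utc.
    """
    # HTTP dates have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall back to a full response
        if since is not None and since.tzinfo is not None and last_modified <= since:
            return 304, headers  # crawler's copy is still current
    return 200, headers


changed = datetime(2016, 2, 8, 12, 0, tzinfo=timezone.utc)
status, headers = conditional_response(changed)
print(status, headers["Last-Modified"])
```

A crawler that sends the returned `Last-Modified` value back as `If-Modified-Since` would get a 304 until the page changes.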
-
We should compare the list of user-agents we match vs [isbot](https://github.com/BryanMorgan/isbot/blob/main/src/bot_regex_patterns.txt) to see if we are missing any.
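
A minimal sketch of that comparison, assuming both lists can be loaded as one pattern per line. The function name `missing_patterns` and the sample entries are hypothetical; matching here is a case-insensitive exact comparison of pattern strings, which will not catch patterns that are equivalent but written differently.

```python
def missing_patterns(ours, reference):
    """Return reference patterns that do not appear in our list."""
    seen = {p.strip().lower() for p in ours if p.strip()}
    missing = {
        p.strip()
        for p in reference
        if p.strip() and p.strip().lower() not in seen
    }
    return sorted(missing, key=str.lower)


# Hypothetical sample data, for illustration only.
ours = ["Googlebot", "bingbot", ""]
reference = ["googlebot", "Bingbot", "DuckDuckBot", "Slurp"]
print(missing_patterns(ours, reference))
```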