robertkrimen / otto

A JavaScript interpreter in Go (golang)
http://godoc.org/github.com/robertkrimen/otto
MIT License

How to obtain the values of jump-related variables such as window.location.href? #510

Open chushuai opened 10 months ago

chushuai commented 10 months ago

I wrote a crawler. To follow the jumps (redirects) triggered from JS, I use regular expressions to extract the jump targets from the JS source. However, this matching approach cannot handle complex JS jumps, so I want to improve its accuracy by analyzing the JS semantically.

Below are my regexes:

location\s*?=\s*?["'](.*?)["']
window.location\s*?=\s*?["'](.*?)["']
window.location.href\s*?=\s*?["'](.*?)["']
self\.location.*?=\s*?["'](.*?)["\']
top\.location.*?=\s*?["'](.*?)["\']
location\.replace\(["'](.*)["']\)
location\.assign\(["'](.*)["']\)
window.open\(["'](.*?)["']
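For reference, here is a minimal sketch of how the patterns above might be applied from Go with the standard `regexp` package. The helper name `firstJump` is hypothetical and only for illustration; the patterns are copied as-is from the list above.

```go
package main

import (
	"fmt"
	"regexp"
)

// jumpPatterns holds the regexes from the list above, unchanged.
var jumpPatterns = []*regexp.Regexp{
	regexp.MustCompile(`location\s*?=\s*?["'](.*?)["']`),
	regexp.MustCompile(`window.location\s*?=\s*?["'](.*?)["']`),
	regexp.MustCompile(`window.location.href\s*?=\s*?["'](.*?)["']`),
	regexp.MustCompile(`self\.location.*?=\s*?["'](.*?)["\']`),
	regexp.MustCompile(`top\.location.*?=\s*?["'](.*?)["\']`),
	regexp.MustCompile(`location\.replace\(["'](.*)["']\)`),
	regexp.MustCompile(`location\.assign\(["'](.*)["']\)`),
	regexp.MustCompile(`window.open\(["'](.*?)["']`),
}

// firstJump (hypothetical helper) returns the first captured jump target
// found by any pattern, or false if no pattern matches.
func firstJump(js string) (string, bool) {
	for _, re := range jumpPatterns {
		if m := re.FindStringSubmatch(js); m != nil {
			return m[1], true
		}
	}
	return "", false
}

func main() {
	// A string literal assigned directly is matched.
	fmt.Println(firstJump(`window.location.href = "/a"`))
	// A target passed through a variable is not.
	fmt.Println(firstJump(`url = '/webui'; window.location.href = url;`))
}
```

Every pattern requires a quoted literal right after the assignment or call, so any target that goes through a variable or an expression is missed.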

For the following piece of JS code, the jump clearly cannot be extracted with these regular expressions, because the target is first stored in a variable:

<script>window.onload=function(){ url ='/webui';window.location.href=url;}</script>