003random / 003Recon

Some tools to automate recon - 003random
https://poc-server.com/

Change regex for more results in javascript_file_extractor.py #8

Closed karelorigin closed 6 years ago

karelorigin commented 6 years ago

Hi @003random,

I noticed that you use this regex to quickly grab JavaScript files from script tags: regex = r'script src="(.*?)"'. Unfortunately, it misses some cases, for example:

Attributes in script tags (using the current regex):

>>> import re
>>> string = "<script async src=\"/test.js\">"
>>> regex = r'script src="(.*?)"'
>>> re.findall(regex, string, re.MULTILINE)
[]

Attributes in script tags (using my regex):

>>> string = "<script async src=\"/test.js\">" 
>>> regex = r'<script.*src=[\'|"](.*)[\'|"]'
>>> re.findall(regex, string, re.MULTILINE)
['/test.js']

Single quotes (using the current regex):

>>> string = "<script src='/test.js'>" 
>>> regex = r'script src="(.*?)"'
>>> re.findall(regex, string, re.MULTILINE)
[]

Single quotes (using my regex):

>>> string = "<script src='/test.js'>" 
>>> regex = r'<script.*src=[\'|"](.*)[\'|"]'
>>> re.findall(regex, string, re.MULTILINE)
['/test.js']

EDIT: I noticed that you fall back to a second regex for JS files enclosed in single quotes when the first regex finds nothing. I suggest using this one instead, since it handles both quote styles and also skips over other attributes in the script tag.
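
One caveat on the suggested pattern: both `.*` groups are greedy, so when several script tags share a line only the last src is captured, and the `|` inside `[\'|"]` is a literal pipe rather than an "or". Below is a minimal sketch of a stricter variant. The name `extract_script_sources` and the pattern `SCRIPT_SRC` are hypothetical and not taken from the repository; regex-based HTML parsing remains approximate either way.

```python
import re

# The pattern suggested in this issue, kept here only for comparison.
GREEDY = r'<script.*src=[\'|"](.*)[\'|"]'

# Hypothetical stricter pattern (an assumption, not the merged code):
# [^>]*? stays inside a single <script ...> tag, and the backreference \1
# requires the closing quote to be the same character as the opening one.
SCRIPT_SRC = re.compile(r'<script\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1', re.IGNORECASE)


def extract_script_sources(html):
    """Return the src value of every <script> tag found in html."""
    return [match.group(2) for match in SCRIPT_SRC.finditer(html)]


# Two script tags on one line, one per quote style.
html = '<script async src="/a.js"></script><script src=\'/b.js\'></script>'

print(re.findall(GREEDY, html))      # ['/b.js']
print(extract_script_sources(html))  # ['/a.js', '/b.js']
```

The lazy quantifiers and the `[^>]` class are what stop a match from running past the end of its own tag, which matters for minified pages where many tags sit on a single line.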

Karel.

003random commented 6 years ago

Hey Karel, thanks, you are indeed right. I will merge it now. :)