krmaxwell / maltrieve

A tool to retrieve malware directly from the source for security researchers.
GNU General Public License v3.0

vxvault URL #161

Open marcocova opened 9 years ago

marcocova commented 9 years ago

The vxvault source's URL responds with a 301 redirect:

$ curl -v http://vxvault.siri-urz.net/URL_List.php 
* Hostname was NOT found in DNS cache
*   Trying 213.186.33.5...
* Connected to vxvault.siri-urz.net (213.186.33.5) port 80 (#0)
> GET /URL_List.php HTTP/1.1
> User-Agent: curl/7.37.1
> Host: vxvault.siri-urz.net
> Accept: */*
> 
< HTTP/1.1 301 Moved Permanently
< Set-Cookie: rd=R3047010670; path=/; expires=Fri, 24-Apr-2015 00:22:55 GMT
* Server nginx is not blacklisted
< Server: nginx
< Date: Tue, 21 Apr 2015 12:16:09 GMT
< Content-Type: text/html
< Content-Length: 178
< Connection: close
< Location: http://vxvault.net//URL_List.php
< 
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>
* Closing connection 0

In this situation, the response object from requests contains the target URL, not the original one:

>>> import requests
>>> r = requests.get("http://vxvault.siri-urz.net/URL_List.php")
>>> r.url
u'http://vxvault.net//URL_List.php'
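
The originally requested URL is not lost after a redirect: requests keeps the intermediate responses in `response.history`, oldest first. A minimal sketch of recovering it follows; the helper name `original_url` is hypothetical and not part of maltrieve, and the stand-in objects only mimic the two attributes the helper reads so that the example runs without network access.

```python
from types import SimpleNamespace


def original_url(response):
    """Return the URL that was first requested, before any redirects."""
    # requests stores each intermediate redirect response in response.history,
    # oldest first; the list is empty when no redirect happened.
    if response.history:
        return response.history[0].url
    return response.url


# Stand-in objects (assumption: they mimic only .url and .history).
redirect = SimpleNamespace(url="http://vxvault.siri-urz.net/URL_List.php", history=[])
final = SimpleNamespace(url="http://vxvault.net//URL_List.php", history=[redirect])

print(original_url(final))
```

With a real response the call would presumably look like `original_url(requests.get("http://vxvault.siri-urz.net/URL_List.php"))`.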

In turn, this makes the source_urls lookup in main() fail with a KeyError:

$ python maltrieve.py 
Processing source URLs
Completed source processing
Traceback (most recent call last):
  File "maltrieve.py", line 514, in <module>
    main()
  File "maltrieve.py", line 487, in main
    malware_urls.update(source_urls[response.url](response.text))
KeyError: u'http://vxvault.net//URL_List.php'

krmaxwell commented 9 years ago

Thanks for the report! It looks like we need to update the URL and also improve our exception handling.
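
One possible shape for the exception handling, as a sketch only: look up the parser by the originally requested URL, fall back to the final URL, and skip (with a warning) any response that has no registered parser instead of raising KeyError. The function name `collect_malware_urls` and the stand-in response objects are hypothetical; only `source_urls`, `response.url`, and `response.text` come from the traceback above.

```python
import logging
from types import SimpleNamespace


def collect_malware_urls(responses, source_urls):
    """Run each response through the parser registered for its source URL."""
    malware_urls = set()
    for response in responses:
        # Prefer the URL that was originally requested over the post-redirect one.
        requested = response.history[0].url if response.history else response.url
        parser = source_urls.get(requested) or source_urls.get(response.url)
        if parser is None:
            logging.warning("No parser registered for %s, skipping", response.url)
            continue
        malware_urls.update(parser(response.text))
    return malware_urls


# Stand-in data (assumption: a trivial whitespace-splitting parser).
source_urls = {"http://vxvault.siri-urz.net/URL_List.php": lambda text: text.split()}
hop = SimpleNamespace(url="http://vxvault.siri-urz.net/URL_List.php", history=[], text="")
moved = SimpleNamespace(
    url="http://vxvault.net//URL_List.php",
    history=[hop],
    text="http://a.example/x http://b.example/y",
)
unknown = SimpleNamespace(url="http://unknown.example/", history=[], text="http://c.example/z")

found = collect_malware_urls([moved, unknown], source_urls)
```

Here the redirected response is still matched to its parser, while the unregistered source is logged and skipped rather than aborting the whole run.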

krmaxwell commented 9 years ago

This is still open because of the exception handling, FYI.

mlawsonis commented 9 years ago

Can I just comment out the VXVault line as a workaround? I removed VXVault and it's working fine.