krmaxwell / maltrieve

A tool to retrieve malware directly from the source for security researchers.
GNU General Public License v3.0

files aren't downloaded #182

Open lhunt23 opened 7 years ago

lhunt23 commented 7 years ago

Hello,

I'm running Maltrieve on Ubuntu 16.04. I installed the dependencies as described in the installation instructions. When I run 'python maltrieve.py', the script doesn't download any files. Please see the output below and let me know if you have any suggestions.


python maltrieve.py -d /home/acme/malware/020517
Processing source URLs
Completed source processing
/usr/local/lib/python2.7/dist-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 514 of the file maltrieve.py. To get rid of this warning, change code that looks like this:

BeautifulSoup([your markup])

to this:

BeautifulSoup([your markup], "lxml")

markup_type=markup_type))
Downloading samples, check log for details
Completed downloads


tail maltrieve.log
2017-02-05 11:49:24 140020353632000 Starting new HTTP connection (1): malc0de.com
2017-02-05 11:49:29 140020353632000 http://www.malwaredomainlist.com:80 "GET /hostslist/mdl.xml HTTP/1.1" 200 4938
2017-02-05 11:49:29 140020353632000 http://malc0de.com:80 "GET /rss/ HTTP/1.1" 200 None
2017-02-05 11:49:30 140020353632000 http://malwareurls.joxeankoret.com:80 "GET /normal.txt HTTP/1.1" 200 11192
2017-02-05 11:49:30 140020353632000 http://support.clean-mx.de:80 "GET /clean-mx/rss?scope=viruses&limit=0%2C64 HTTP/1.1" 200 918
2017-02-05 11:49:30 140020353632000 http://vxvault.net:80 "GET /URL_List.php HTTP/1.1" 200 None
2017-02-05 11:49:30 140020353632000 https://zeustracker.abuse.ch:443 "GET /monitor.php?urlfeed=binaries HTTP/1.1" 200 3869
2017-02-05 11:49:32 140020353632000 http://urlquery.net:80 "GET / HTTP/1.1" 200 4766
2017-02-05 11:49:33 140020353632000 Dumping past URLs to urls.json
2017-02-05 11:49:33 140020353632000 Dumping hashes to hashes.json

clayball commented 7 years ago

I can confirm: nothing is being downloaded.

lhunt23 commented 7 years ago

Thanks for the follow up. If you have any suggestions as to how to get the script working again, please let me know. I've found this script to be extremely useful and appreciate you making it available.

hi-T0day commented 7 years ago

Add sudo before 'python maltrieve.py', or change python to python3. Good luck!

hi-T0day commented 7 years ago

Sorry, I gave you a wrong answer just now, but I think I have it now. In "maltrieve.py", change:

    def process_urlquery(response):
        soup = BeautifulSoup(response)
        urls = set()
        for t in soup.find_all("table", class_="test"):
            for a in t.find_all("a"):
                urls.add('http://' + re.sub('&amp;', '&', a.text))
        return urls

to:

    def process_urlquery(response):
        soup = BeautifulSoup(response, "html.parser")
        urls = set()
        for t in soup.find_all("table", class_="test"):
            for a in t.find_all("a"):
                urls.add('http://' + re.sub('&amp;', '&', a.text))
        return urls
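One way to check whether the parser argument matters at all is to run the same extraction logic on a small HTML sample. The sketch below is only an illustration: it uses the standard library's `html.parser` rather than BeautifulSoup, and the class name `TableLinkCollector`, the helper `extract_urls`, and the sample markup are all made up for this example. It collects the text of every link inside a `<table class="test">`, which is what `process_urlquery` appears to do.

```python
from html.parser import HTMLParser


class TableLinkCollector(HTMLParser):
    """Collect the text of <a> elements nested inside <table class="test">."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; in text nodes.
        super().__init__(convert_charrefs=True)
        self.table_depth = 0    # > 0 while inside a matching table
        self.in_anchor = False
        self.buffer = []
        self.urls = set()

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "table" and (self.table_depth or "test" in classes):
            self.table_depth += 1
        elif tag == "a" and self.table_depth:
            self.in_anchor = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth:
            self.table_depth -= 1
        elif tag == "a" and self.in_anchor:
            self.in_anchor = False
            text = "".join(self.buffer).strip()
            if text:
                self.urls.add("http://" + text)

    def handle_data(self, data):
        if self.in_anchor:
            self.buffer.append(data)


def extract_urls(markup):
    """Hypothetical stand-in for process_urlquery, for testing only."""
    collector = TableLinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.urls


# Made-up markup: one matching table and one that should be ignored.
sample = (
    '<table class="test"><tr><td>'
    '<a href="#">example.invalid/a.exe?x=1&amp;y=2</a>'
    '</td></tr></table>'
    '<table class="other"><tr><td><a href="#">ignored.invalid</a></td></tr></table>'
)
print(extract_urls(sample))
```

If a function like this returns an empty set for the page a feed actually serves, the page probably no longer contains a table with that class, and changing the parser would not help.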

lhunt23 commented 7 years ago

Hello,

Thanks for your response. I made the suggested changes and the script still isn’t downloading files. Please let me know if you have any additional suggestions. Thanks.

    def process_urlquery(response):
        soup = BeautifulSoup(response, "html.parser")
        urls = set()
        for t in soup.find_all("table", class_="test"):
            for a in t.find_all("a"):
                urls.add('http://' + re.sub('&amp;', '&', a.text))
        return urls

root@ubuntu:~/scripts/maltrieve-master# python maltrieve.py

Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads


panw-ren commented 7 years ago

Having the same issue.

attrs==15.2.0
BeautifulSoup==3.2.1
beautifulsoup4==4.5.3
bs4==0.0.1
chardet==2.3.0
configobj==5.0.6
cryptography==1.2.3
ecdsa==0.13
enum34==1.1.2
feedparser==5.2.1
gevent==1.2.1
greenlet==0.4.12
idna==2.0
ipaddress==1.0.16
Landscape-Client==16.3+bzr834
ndg-httpsclient==0.4.0
PAM==0.4.2
paramiko==1.16.0
pyasn1==0.1.9
pyasn1-modules==0.0.7
pycrypto==2.6.1
pyOpenSSL==0.15.1
pyserial==3.0.1
python-apt==1.1.0b1
python-debian==0.1.27
python-magic==0.4.12
requests==2.13.0
scapy==2.3.3
service-identity==16.0.0
six==1.10.0
Twisted==16.0.0
zope.interface==4.1.3

user1@ubuntu-template:~/maltrieve/maltrieve-0.7/files$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

user1@ubuntu-template:~/maltrieve/maltrieve-0.7$ sudo ./maltrieve.py
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads

user1@ubuntu-template:~/maltrieve/maltrieve-0.7$ more urls.json
[]

user1@ubuntu-template:~/maltrieve/maltrieve-0.7$ sudo ./maltrieve.py -d ./files/
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads
user1@ubuntu-template:~/maltrieve/maltrieve-0.7$ cd files/
user1@ubuntu-template:~/maltrieve/maltrieve-0.7/files$ ls
user1@ubuntu-template:~/maltrieve/maltrieve-0.7/files$

hi-T0day commented 7 years ago

I'm using another fork: https://github.com/HarryR/maltrieve. It works now, so I believe you can succeed too.

lhunt23 commented 7 years ago

Hello,

Please see below and let me know if you have any suggestions.

python maltrieve.py -d /home/lhunt/malware/030817/
Traceback (most recent call last):
  File "maltrieve.py", line 580, in <module>
    main()
  File "maltrieve.py", line 520, in main
    cfg = config(args, 'maltrieve.cfg')
  File "maltrieve.py", line 131, in __init__
    self.cuckoo_dist = self.configp.get('Maltrieve', 'cuckoo_dist')
  File "/usr/lib/python2.7/ConfigParser.py", line 623, in get
    return self._interpolate(section, option, value, d)
  File "/usr/lib/python2.7/ConfigParser.py", line 669, in _interpolate
    option, section, rawval, e.args[0])
ConfigParser.InterpolationMissingOptionError: Bad value substitution:
    section: [Maltrieve]
    option : cuckoo_dist
    key    : dist_port_9003_tcp_addr
    rawval : http://%(DIST_PORT_9003_TCP_ADDR)s:9003

sudo python maltrieve.py
Traceback (most recent call last):
  File "maltrieve.py", line 580, in <module>
    main()
  File "maltrieve.py", line 520, in main
    cfg = config(args, 'maltrieve.cfg')
  File "maltrieve.py", line 131, in __init__
    self.cuckoo_dist = self.configp.get('Maltrieve', 'cuckoo_dist')
  File "/usr/lib/python2.7/ConfigParser.py", line 623, in get
    return self._interpolate(section, option, value, d)
  File "/usr/lib/python2.7/ConfigParser.py", line 669, in _interpolate
    option, section, rawval, e.args[0])
ConfigParser.InterpolationMissingOptionError: Bad value substitution:
    section: [Maltrieve]
    option : cuckoo_dist
    key    : dist_port_9003_tcp_addr
    rawval : http://%(DIST_PORT_9003_TCP_ADDR)s:9003
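The traceback above says that the `cuckoo_dist` value contains a `%(DIST_PORT_9003_TCP_ADDR)s` placeholder and that no option of that name exists for the parser to substitute. A minimal sketch of that behaviour follows. It assumes Python 3's `configparser` (the thread uses Python 2.7's `ConfigParser`, which behaves the same way here), and the config text and the address `127.0.0.1` are placeholders rather than values from the fork.

```python
import configparser

# Hypothetical excerpt; only the cuckoo_dist value is taken from the traceback.
CFG_TEXT = """
[Maltrieve]
dumpdir = archive
cuckoo_dist = http://%(DIST_PORT_9003_TCP_ADDR)s:9003
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# Options without a %(...)s placeholder are read normally.
print(parser.get("Maltrieve", "dumpdir"))

# Reading cuckoo_dist fails: no option named dist_port_9003_tcp_addr exists
# in the section or in [DEFAULT], so interpolation has nothing to insert.
try:
    parser.get("Maltrieve", "cuckoo_dist")
    failed = False
except configparser.InterpolationMissingOptionError as exc:
    failed = True
    print("interpolation failed for option:", exc.option)

# raw=True returns the text exactly as written, with no substitution.
raw_value = parser.get("Maltrieve", "cuckoo_dist", raw=True)
print(raw_value)

# Supplying the missing name through vars= lets the substitution succeed.
# 127.0.0.1 is a placeholder address, not something stated in the thread.
resolved = parser.get(
    "Maltrieve", "cuckoo_dist", vars={"DIST_PORT_9003_TCP_ADDR": "127.0.0.1"}
)
print(resolved)
```

So the error can be avoided either by commenting out the `cuckoo_dist` line in the config or by making the named value available; which one is appropriate depends on whether that option is needed at all.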

python3 maltrieve.py
  File "maltrieve.py", line 125
    self.priority = args.priority
    ^
TabError: inconsistent use of tabs and spaces in indentation
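The `TabError` comes from Python 3 itself: it rejects indentation that mixes tabs and spaces ambiguously, which Python 2 tolerated. The script targets Python 2.7, so running it under python3 is unlikely to fix the download problem, but the offending lines can be located with a short check. The function name `tab_indented_lines` and the sample source below are hypothetical; only the `self.priority = args.priority` line comes from the error message.

```python
import re


def tab_indented_lines(source):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = re.match(r"[ \t]*", line).group(0)
        if "\t" in indent:
            hits.append(number)
    return hits


# Made-up source: line 3 is indented with a tab followed by spaces,
# while the other lines use spaces only.
sample = (
    "class Config:\n"
    "    def __init__(self, args):\n"
    "\t    self.priority = args.priority\n"
    "        self.dumpdir = args.dumpdir\n"
)
print(tab_indented_lines(sample))
```

In a file that is otherwise indented with spaces, the reported lines are the ones to re-indent.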


hi-T0day commented 7 years ago

If you add "#" at the start of lines 8 and 9 in "maltrieve.cfg", does maltrieve work?

rkalugdan commented 7 years ago

Upon review, they were already commented out.

[Maltrieve]
dumpdir = archive
logfile = maltrieve.log
logheaders = true
User-Agent = Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)

#viper = http://127.0.0.1:8080
#cuckoo = http://127.0.0.1:8090
#vxcage = http://127.0.0.1:8080
#crits = https://127.0.0.1
#crits_user = maltrieve
#crits_key =
#crits_source = maltrieve

# Filter Lists are based on mime type NO SPACE BETWEEN ,
#black_list = text/html,text/plain
#white_list = application/pdf,application/x-dosexec

panw-ren commented 7 years ago

I tried the other branch mentioned by hi-T0day, but still no luck.

user1@ubuntu-template:~/maltrieve-0.7$ sudo ./maltrieve.py -d /home/user1/malware
Processing source URLs
Completed source processing
Downloading samples, check log for details
Completed downloads
user1@ubuntu-template:~/maltrieve-0.7$ cd /home/user1/malware/
user1@ubuntu-template:~/malware$ ls
user1@ubuntu-template:~/malware$

The hashes.json and urls.json files are empty.

2017-03-13 16:13:32 140601425241856 Loaded urls from urls.json
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): support.clean-mx.de
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): www.malwaredomainlist.com
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): vxvault.siri-urz.net
2017-03-13 16:13:32 140601425241856 Starting new HTTPS connection (1): zeustracker.abuse.ch
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): urlquery.net
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): malwareurls.joxeankoret.com
2017-03-13 16:13:32 140601425241856 Starting new HTTP connection (1): malc0de.com
2017-03-13 16:13:32 140601425241856 http://www.malwaredomainlist.com:80 "GET /hostslist/mdl.xml HTTP/1.1" 200 5735
2017-03-13 16:13:33 140601425241856 https://zeustracker.abuse.ch:443 "GET /monitor.php?urlfeed=binaries HTTP/1.1" 200 3882
2017-03-13 16:13:33 140601425241856 http://malwareurls.joxeankoret.com:80 "GET /normal.txt HTTP/1.1" 200 11192
2017-03-13 16:13:33 140601425241856 http://malc0de.com:80 "GET /rss/ HTTP/1.1" 200 None
2017-03-13 16:13:33 140601425241856 http://urlquery.net:80 "GET / HTTP/1.1" 200 4703
2017-03-13 16:13:34 140601425241856 http://support.clean-mx.de:80 "GET /clean-mx/rss?scope=viruses&limit=0%2C64 HTTP/1.1" 200 918
2017-03-13 16:13:34 140601425241856 Dumping past URLs to urls.json
2017-03-13 16:13:34 140601425241856 Dumping hashes to hashes.json

user1@ubuntu-template:~/maltrieve-0.7$ more hashes.json
[]
user1@ubuntu-template:~/maltrieve-0.7$ more urls.json
[]
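The log shows each feed answering with HTTP 200, yet both JSON files end up empty, which suggests (without proving it) that the problem lies in extracting URLs from the responses rather than in fetching them. A rough sketch for summarising such a log is below. The regular expression is inferred from the log lines quoted in this thread, and the names `REQUEST_LINE` and `summarize_requests` are made up for the example.

```python
import re

# Matches request lines of the form seen in this thread's maltrieve.log, e.g.
#   http://host:80 "GET /path HTTP/1.1" 200 5735
REQUEST_LINE = re.compile(
    r'(?P<base>https?://\S+) "GET (?P<path>\S+) HTTP/1\.1" '
    r'(?P<status>\d{3}) (?P<size>\d+|None)'
)


def summarize_requests(log_text):
    """Return (base URL, status code, size or None) for each logged request."""
    rows = []
    for match in REQUEST_LINE.finditer(log_text):
        size_text = match.group("size")
        size = None if size_text == "None" else int(size_text)
        rows.append((match.group("base"), int(match.group("status")), size))
    return rows


# Two lines copied from the log above.
sample_log = (
    '2017-03-13 16:13:32 140601425241856 http://www.malwaredomainlist.com:80 '
    '"GET /hostslist/mdl.xml HTTP/1.1" 200 5735\n'
    '2017-03-13 16:13:33 140601425241856 http://malc0de.com:80 '
    '"GET /rss/ HTTP/1.1" 200 None\n'
)
for row in summarize_requests(sample_log):
    print(row)
```

A feed that returns 200 with only a few hundred bytes may be serving an error or placeholder page instead of a URL list, so the size column is worth a look.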

lhunt23 commented 7 years ago

I tried the other branch as well. The script runs but files aren’t downloaded.


panw-ren commented 7 years ago

are people still able to get help on issues w/ maltrieve?

rkalugdan commented 7 years ago

bump

getChester commented 7 years ago

confirming that nothing is being downloaded.

futex commented 7 years ago

Same problem, nothing is downloaded.

atefsaleh commented 6 years ago

I realize these questions are 2 years old, but I have the same issue. Did anybody come up with a solution or find the cause? Thanks