Closed neil-unomaha closed 4 years ago
sudo docker exec -it verbose-robot /bin/bash
pip install 'cifsdk>=4.0.0a0'
CIF
right now, but the issue is that in our test environment there are no existing feeds, so it doesn't return anything.
cif --itype ipv4 --tags malware
- Is CIF currently pulling feeds in real time, or is pulling from feeds disabled by default? If it is disabled, how do we enable it?
- How do we add additional feeds?
The following example shows how to set up a CIF test environment with Ubuntu 16.04 Server running in a virtual machine.
First you will need to download an Ubuntu 16.04 server image and create a virtual machine.
Once you start up your virtual machine and log in, there are a couple more steps for our guide. We wanted a GUI, so we installed the ubuntu-desktop package:
sudo apt-get update
sudo apt-get install ubuntu-desktop
Next you need to install docker:
sudo apt install docker.io
Install the CIF 4 container from Docker Hub:
sudo docker pull csirtgadgets/verbose-robot
Before running the docker container, you need to create some environment variables.
The first is CIF_TOKEN, which will contain a randomly generated string. For security, this string ultimately becomes the token passed in the request header of all of your GET and POST requests. You can generate a random string with the following command on Ubuntu:
head -n 25000 /dev/urandom | openssl dgst -sha256 | awk -F ' ' '{print $2}'
An example output string is the following:
525ff70def1b2b4eff3119451eabfa0ce3fa6316efb55fda075db08ac4a2feda
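If you would rather script this step, the same kind of token can be produced without the shell pipeline. This is only a sketch, assuming that any random 64-character hex string like the one above is acceptable as a token; generate_cif_token is a made-up helper name, not part of CIF.

```python
import secrets


def generate_cif_token(num_bytes=32):
    """Return a random hex string; 32 random bytes give 64 hex characters."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a fresh token that could be exported as CIF_TOKEN.
    print(generate_cif_token())
```

The secrets module draws from the operating system's random source, so the result should be comparable to hashing /dev/urandom output.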
The other two required environment variables are MAXMIND_USER_ID and MAXMIND_LICENSE_KEY. CIF depends on Maxmind, as mentioned here. Head over to Maxmind, create a free account, and within the settings you can find your account ID and license key.
Here is an example of setting up these environment variables. Note that you'll want to swap in your own values for MAXMIND_USER_ID and MAXMIND_LICENSE_KEY.
export CIF_TOKEN=`head -n 25000 /dev/urandom | openssl dgst -sha256 | awk -F ' ' '{print $2}'`
export MAXMIND_USER_ID=201001
export MAXMIND_LICENSE_KEY=3r8ESHRiFIsF
With the environment variables all set up, you can now run your CIF docker image:
sudo docker run -e CIF_TOKEN="${CIF_TOKEN}" -e MAXMIND_USER_ID="${MAXMIND_USER_ID}" -e MAXMIND_LICENSE_KEY="${MAXMIND_LICENSE_KEY}" -it -p 5000:5000 -d --name verbose-robot csirtgadgets/verbose-robot:latest
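- The -e flag passes an environment variable into the container.
- The -p flag publishes container port 5000 on port 5000 of the host.
- The -d flag runs the container detached, in the background.
- verbose-robot is the name given to the container via --name.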
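To make it clearer how the pieces of that docker run command fit together, here is a rough Python sketch that assembles the same argument list. It only uses the flags and image name shown above; build_docker_run_args and REQUIRED_VARS are hypothetical names for illustration, and nothing here actually runs docker.

```python
REQUIRED_VARS = ("CIF_TOKEN", "MAXMIND_USER_ID", "MAXMIND_LICENSE_KEY")


def build_docker_run_args(env):
    """Build the argument list for the docker run command shown above."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    args = ["sudo", "docker", "run"]
    for name in REQUIRED_VARS:
        # -e passes one environment variable into the container
        args += ["-e", "{}={}".format(name, env[name])]
    args += [
        "-it",
        "-p", "5000:5000",          # publish container port 5000 on the host
        "-d",                       # run detached, in the background
        "--name", "verbose-robot",  # container name used by docker exec later
        "csirtgadgets/verbose-robot:latest",
    ]
    return args
```

Passing os.environ as env would pick up the variables exported earlier, and the resulting list could be handed to subprocess.run.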
To confirm our docker container is running, we can run sudo docker ps.
We can interact with CIF in two ways: the command prompt or Swagger.
To use the command prompt, we need to open a bash shell inside our running container. We can do that with the following:
sudo docker exec -it verbose-robot /bin/bash
Now that we are inside the container, we can execute the cif command with various options in order to query the CIF database. Here are some example commands:
cif --itype ipv4 --tags scanner
cif --itype url --tags phishing
cif --itype url --tags malware
cif --itype ipv4 --tags botnet
Note that by default, CIF pulls feeds from the providers you specified every three minutes.
On the VM running CIF you can visit http://localhost:5000, which displays a REST API GUI.
It is important to note that the lock symbol next to each endpoint indicates that a token must be passed in with each request. This is the string that we created and stored in the CIF_TOKEN environment variable earlier. Click the Authorize button and add the token.
Once you add the token, you should be able to interact with the API in the GUI. Click the Try it Out button, which toggles the endpoint, then click Execute.
You can then scroll down to see the response:
The CIF file that specifies endpoints is app.py. We think, in this docker container, the specific file is located here:
/usr/local/lib/python3.6/site-packages/verbose_robot-4.0.1-py3.6.egg/cif/httpd/app.py
Adding an endpoint should be as simple as the following:
@app.route('/')
def hello_world():
    return 'Hello, World!'
Restart the server and try it out. Initial attempts did not work.
One possible explanation is that CIF is using Flask-RESTPlus, so the config might actually be this:
@api.route('/hello')
class HelloWorld(Resource):
    def get(self):
        return {'hello': 'world'}
We will have to try this.
Still need to figure out how to instead pass in the token as a parameter within the request, or remove the token requirement altogether.
# /usr/local/lib/python3.6/site-packages/verbose_robot-4.0.1-py3.6.egg/cif/httpd/app.py
# around line 39, add the following:
from .palo import api as palo_api
# /usr/local/lib/python3.6/site-packages/verbose_robot-4.0.1-py3.6.egg/cif/httpd/app.py
# around line 84, add the following:
palo_api,
# Create the following file:
# /usr/local/lib/python3.6/site-packages/verbose_robot-4.0.1-py3.6.egg/cif/httpd/palo.py
# All of the running CIF services are handled by supervisord.
# You will likely need to restart them so that the changes
# are read in. To do that, run the following commands, which
# send SIGHUP to the supervisord process. On SIGHUP, supervisord
# reloads its configuration and restarts the services it manages.
PID=`ps aux | grep supervisord | grep -v grep | awk -F ' ' '{print $2}'`
kill -HUP $PID
You can now make requests to the endpoint, but currently you are still required to pass in the token:
Just to make it so commands are easier to copy/paste for testing
PID=`ps aux | grep supervisord | grep -v grep | awk -F ' ' '{print $2}'`
kill -HUP $PID
(Had line break in previous comment)
curl -X GET "http://localhost:5000/palo/" -H "accept: application/json" -H "Authorization: 46508ee7d447ef4ed9666f3cc4716f0ea246fa2fb5a1254036a384d7897d"
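For testing from Python instead of curl, the same request can be put together with the standard library. This is a sketch only: it mirrors the URL and headers of the curl call above, and build_palo_request and fetch_palo are made-up helper names.

```python
import urllib.request


def build_palo_request(token, base_url="http://localhost:5000"):
    """Build (but do not send) the GET request from the curl example."""
    request = urllib.request.Request(base_url + "/palo/", method="GET")
    request.add_header("accept", "application/json")
    # The curl example passes the raw token as the Authorization value.
    request.add_header("Authorization", token)
    return request


def fetch_palo(token, base_url="http://localhost:5000"):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_palo_request(token, base_url)) as response:
        return response.read().decode("utf-8")
```

Calling fetch_palo with the value of CIF_TOKEN should behave like the curl command, assuming the container is up and the /palo/ endpoint is registered.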
To remove the requirement of a token being passed in the header, it should be as simple as the documentation shows: https://flask-restplus.readthedocs.io/en/stable/swagger.html#documenting-authorizations
There must be some additional step elsewhere, because it still isn't working for me...
Well, a step closer. This at least works, though it completely short-circuits the before_request function:
# palo.py
# app.py
# Notice the return statement right at the beginning.
# That is the only way I found to make it work
# request without authorization header
Within the request.endpoint in check, I tried adding /palo, palo, palo/, and just to be sure I also tried palo/pa and /palo/pa. None of those worked.
When querying CIF by multiple tags, it is smart enough not to duplicate the same IP address. The following outputs the IPv4 address as well as the tags:
cif --limit 150000 --itype ipv4 --tags scanner,bruteforce --f csv | awk -F ',' '{print $4 " " $10}'
Implemented logging. It turns out that the name of our custom endpoint happens to be palo_palo. So we weren't putting the proper endpoint in the whitelist.
I figured this out, as Doctor Hale first suggested, by getting logging squared away. The easiest thing was to create my own log file and output request.endpoint. That was what showed me that it was palo_palo.
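To spell out why the earlier attempts failed: request.endpoint holds an endpoint name, not a URL path, so path-like whitelist entries never match. Here is a tiny sketch of that comparison. needs_token and the two sets are made up for illustration; the actual before_request code in CIF is not shown in this thread.

```python
def needs_token(endpoint, whitelist):
    """Return True when a request to this endpoint must still carry a token."""
    return endpoint not in whitelist


# The path-like entries tried first; none equals the endpoint name.
url_style = {"/palo", "palo", "palo/", "palo/pa", "/palo/pa"}
# The value that showed up in the log file for the custom endpoint.
name_style = {"palo_palo"}

print(needs_token("palo_palo", url_style))   # True: a token is still required
print(needs_token("palo_palo", name_style))  # False: the endpoint is whitelisted
```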
There are three different timestamps saved per indicator. reported_at makes the most sense to sort by. And yes, this does need to be manually sorted. Results come back in different orders, and there does not appear to be an option in CIF to sort.
cif --limit 150000 --itype ipv4 --tags botnet,phishing,malware,scanner,bruteforce,darknet --f csv --columns reported_at,indicator
Output for each line looks like the following:
2020-03-25T02:40:42.012034Z,12.34.56.78
You can sort by the timestamp with the following:
sort -t, -k 1.1,1.26 <file>
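The same sort can be done in Python if the output is being post-processed there anyway. A minimal sketch, assuming every line has the reported_at,indicator layout shown above; sort_by_reported_at is a made-up name, and 192.0.2.1 is just a placeholder address.

```python
def sort_by_reported_at(lines):
    """Sort 'reported_at,indicator' lines by timestamp, oldest first."""
    # UTC timestamps in this fixed layout sort correctly as plain strings.
    return sorted(lines, key=lambda line: line.split(",", 1)[0])


rows = [
    "2020-03-25T02:40:42.012034Z,12.34.56.78",
    "2020-03-24T18:05:10.000000Z,192.0.2.1",
]
for row in sort_by_reported_at(rows):
    print(row)
```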
CIF 5 was released 14 hours ago. I downloaded Ubuntu 18.04 and attempted to deploy it via the instructions here. Unfortunately I ran into multiple errors, so it is not as easy as the directions make it sound.
Error when following "Up and Running" directions
# got to this step
docker-compose pull
ERROR: Couldn't connect to Docker daemon at http+docker://localunixsocket - is it running?
Error when following "Building Locally" directions
# got to this step
make docker-tag
(cd docker && bash tag.sh)
tag.sh: line 5: cif-router: command not found
Makefile:32: recipe for target 'docker-tag' failed
make: [docker-tag] Error 127 (ignored)
This script is located at /home/cif/palo_indicators/update_palo_indicators.sh. Its purpose is to be executed every 10 minutes via a cron job. The IP indicators are stored in files with a maximum of 5,000 indicators per file. NU's limit is 150,000 IP addresses, thus 30 files, because 30 * 5,000 = 150,000. The files are located at /home/cif/palo_indicators/ips_*.txt
#!/bin/bash
# EXPLAINING `cif` command options
# --limit 150,000 -> limit the returned indicators (IP addresses in this case) to 150,000
# --itype ipv4 -> return only ipv4 indicators
# --tags botnet,phishing,malware,scanner,bruteforce,darknet -> return indicators with any of the specified tags
# -f csv -> returned output to be in csv format
# --columns reported_at,indicator -> per returned indicator: only return the reported_at timestamp and indicator
# > /home/cif/palo_indicators/all_ip_indicators.txt -> output to file in indicated path
# EXPLAINING `sort` command
# sort by the reported_at timestamp token
# EXPLAINING `sed` command
# example output per line at this point:
# 2020-04-05T14:20:18.365410Z,12.34.56.78
# Palo Alto ingestible format is one IP address per line
# Therefore, must get rid of everything per line except for IP address
# This sed command removes everything up to and including the first comma
# Thus, leaving only the IP address per line
/usr/local/bin/cif --limit 150000 --itype ipv4 --tags botnet,phishing,malware,scanner,bruteforce,darknet --f csv --columns reported_at,indicator | sort -t, -k 1.1,1.26 | sed 's/^[^,]*,//g' > /home/cif/palo_indicators/all_ip_indicators.txt
# Paging feature: allow maximum of 5000 IPs per file
# 5000 IPs allowed per file
# 5000 * 30 = 150,000
for num in {1..30}
do
    endLine=$(($num * 5000))
    startLine=$(($endLine - 4999))
    endSedLine=$(($endLine + 1))
    pagingSedOpts="${startLine},${endLine}p;${endSedLine}q"
    sed -n "$pagingSedOpts" /home/cif/palo_indicators/all_ip_indicators.txt > "/home/cif/palo_indicators/ips_$num.txt"
done
# Must change file ownership to cif user or else cif api cannot access files
# chown cif:cif /home/cif/palo_indicators/ips_*
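The paging arithmetic in the loop above is easy to get off by one, so here it is sketched in Python. The constants come from the script (5,000 per file, 30 files); page_line_range and split_into_pages are made-up names, and this is an illustration rather than a replacement for the script.

```python
PAGE_SIZE = 5000
MAX_PAGES = 30


def page_line_range(page, page_size=PAGE_SIZE):
    """Return the 1-based (start, end) line numbers for a page, as in the loop."""
    end_line = page * page_size
    start_line = end_line - (page_size - 1)
    return start_line, end_line


def split_into_pages(indicators, page_size=PAGE_SIZE, max_pages=MAX_PAGES):
    """Split a list into max_pages lists of at most page_size items each."""
    pages = []
    for page in range(1, max_pages + 1):
        start_line, end_line = page_line_range(page, page_size)
        # Convert 1-based inclusive line numbers to a 0-based slice.
        pages.append(indicators[start_line - 1:end_line])
    return pages
```

Like the shell loop, this always yields 30 pages, and pages past the end of the data simply come out empty.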
Here is the cronjob that executes the script:
*/10 * * * * /bin/bash /home/cif/palo_indicators/update_palo_indicators.sh
Run as the root user, the cif command is not returning any indicators. Thus all_ip_indicators.txt as well as the ips_* files are all empty.
Just trial and error to see about making a curl request from the palo endpoint and receiving a csv file back. On localhost there is the indicators endpoint for indicator-related operations, where I can make a curl request to get a csv file with records relevant to ipv4 addresses and a series of tags. The request looks like this:
So I tried to look for how to make a curl request from the palo.py file. I found a useful module named shlex that, together with subprocess, would allow me to run a curl command from the Python file, palo.py.
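For reference, here is roughly how shlex and subprocess fit together. This is a sketch, not the code in palo.py: run_command is a made-up helper, and the example command is a shortened version of the curl request against the indicators endpoint.

```python
import shlex
import subprocess


def run_command(command_line):
    """Split a shell-style command line and run it without invoking a shell."""
    args = shlex.split(command_line)
    completed = subprocess.run(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return completed.returncode, completed.stdout.decode(), completed.stderr.decode()


# shlex.split keeps each quoted value together as a single argument.
example = 'curl -X GET "http://localhost:5000/indicators/" -H "accept: application/json"'
print(shlex.split(example))
```

shlex only does the splitting; subprocess is what actually launches curl.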
From the screenshot, I use -o to save the output to /home/cif/palo_indicators/testfile.csv and a GET request against the indicators endpoint to get the csv file.
Within the Swagger GUI I was able to run the palo.py command successfully, and the testfile.csv file was in the indicated folder.
A quick cat of testfile.csv shows that it was written.
Probably need to do testing to get it to use $CIF_TOKEN rather than manually add it. From there, could look into extracting the indicators and timestamps from the csv into a txt file.
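One way the hard-coded token could be replaced is to read it from the environment when the headers are built. A minimal sketch, assuming CIF_TOKEN is visible to the process serving the endpoint (that still needs testing); auth_headers is a made-up helper name.

```python
import os


def auth_headers(environ=os.environ):
    """Build the request headers from the token stored in CIF_TOKEN."""
    token = environ.get("CIF_TOKEN")
    if not token:
        raise RuntimeError("CIF_TOKEN is not set in the environment")
    return {"accept": "application/json", "Authorization": token}
```

Failing loudly when the variable is missing makes it obvious if the service was started without the token, instead of sending an empty Authorization header.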
Collaborated: @skyemakable @TalonF
import json
import sys

# Page number (1-based). Assumption: taken from the first command-line
# argument, defaulting to 1; the original snippet left param undefined.
param = int(sys.argv[1]) if len(sys.argv) > 1 else 1

with open('palo_all_indicators.json', 'r') as input_file:
    json_decode = json.load(input_file)

all_indicators_dirty = []
all_indicators_clean = []

for item in json_decode:
    my_dict = {}
    my_dict['id'] = item.get('id')
    my_dict['indicator'] = item.get('indicator')
    all_indicators_dirty.append(my_dict)

# sort by id so that the paging order is stable between calls
all_indicators_dirty.sort(key=lambda x: x["id"])

for obj in all_indicators_dirty:
    all_indicators_clean.append(obj["indicator"])

length_of_indicators = len(all_indicators_clean)

# initialize index count based on paging
index_count = (param * 5000) - 5000

with open('palo_paged_indicators.txt', 'w') as output_file:
    for num in range(5000):
        if index_count > length_of_indicators - 1:
            break
        # write the indicator at the current index, one per line
        output_file.write(all_indicators_clean[index_count])
        output_file.write("\n")
        index_count += 1
import time, json, shlex, subprocess
from flask_restplus import Namespace, Resource
from .constants import HTTPD_TOKEN, ROUTER_ADDR
from flask import send_file

# the curl output and the paged output both live in the same directory
ALL_INDICATORS_FILE = '/home/cif/palo_indicators/palo_all_indicators.json'
PAGED_INDICATORS_FILE = '/home/cif/palo_indicators/palo_paged_indicators.txt'

api = Namespace('palo', description='Palo API')


# Module-level helper with a single underscore: a double-underscore name
# called from inside the class would be name-mangled and not found.
def _init_page_output(page_num):
    with open(ALL_INDICATORS_FILE, 'r') as input_file:
        json_decode = json.load(input_file)

    all_indicators_dirty = []
    all_indicators_clean = []

    for item in json_decode:
        my_dict = {}
        my_dict['id'] = item.get('id')
        my_dict['indicator'] = item.get('indicator')
        all_indicators_dirty.append(my_dict)

    all_indicators_dirty.sort(key=lambda x: x["id"])

    for obj in all_indicators_dirty:
        all_indicators_clean.append(obj["indicator"])

    length_of_indicators = len(all_indicators_clean)

    # initialize index count based on paging
    index_count = (int(page_num) * 5000) - 5000

    with open(PAGED_INDICATORS_FILE, 'w') as output_file:
        for num in range(5000):
            if index_count > length_of_indicators - 1:
                break
            # write the indicator at the current index, one per line
            output_file.write(all_indicators_clean[index_count])
            output_file.write("\n")
            index_count += 1


@api.route('/<string:page_num>')
@api.response(401, 'Unauthorized')
@api.response(200, 'OK')
class Palo(Resource):
    @api.doc(security=[])
    def get(self, page_num):
        # isdecimal() only accepts characters that int() can parse;
        # pages are 1-based, so 0 is rejected as well
        if not page_num.isdecimal() or int(page_num) < 1:
            return "Error: invalid page number"

        cmd = '''curl -o /home/cif/palo_indicators/palo_all_indicators.json -X GET "http://localhost:5000/indicators/?fmt=json&tags=botnet%2Cphishing%2Cmalware%2Cscanner%2Cbruteforce%2Cdarknet&itype=ipv4" -H "accept: application/json" -H "Authorization: 46508ee7d447ef4ed9666f3cc4716f0ea246fa2fb5a1254036a384d7897dbaee"'''
        args = shlex.split(cmd)
        process = subprocess.Popen(args, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout, stderr = process.communicate()

        _init_page_output(page_num)
        return send_file(PAGED_INDICATORS_FILE)
#from cifsdk.client.http import HTTP as Client
#from cifsdk.constants import ROUTER_ADDR, VALID_FILTERS
from flask import request, session, current_app
from .indicators import *
import time, json, os, logging, requests
from flask_restplus import Namespace, Resource
from flask import send_file
from .constants import ROUTER_ADDR
import logging
import arrow
import re
import traceback
import copy
import zmq
from flask_restplus import Namespace, Resource, fields
from flask import request, session, current_app
from cif.constants import FEEDS_LIMIT, FEEDS_WHITELIST_LIMIT, \
HTTPD_FEED_WHITELIST_CONFIDENCE, FEEDS_WHITELIST_DAYS
from cifsdk.constants import ROUTER_ADDR, VALID_FILTERS
from cifsdk.client.zmq import ZMQ as Client
from cifsdk.exceptions import AuthError, TimeoutError, InvalidSearch, \
SubmissionFailed, CIFBusy
from pprint import pprint
from csirtg_indicator.feed import aggregate
from csirtg_indicator.feed import process as feed
from csirtg_indicator.feed.fqdn import process as feed_fqdn
from csirtg_indicator.feed.ipv4 import process as feed_ipv4
from csirtg_indicator.feed.ipv6 import process as feed_ipv6
api = Namespace('palo', description='Palo API')
@api.route('/<string:page_num>')
@api.response(401, 'Unauthorized')
@api.response(200, 'OK')
class Palo(Resource):
    @api.doc(security=[])
    def get(self, page_num):
        # filters definition
        # <fill in the blank> - format an object defined as follows:
        # filters ['parameter'] = <parameter_value> # where parameter_value is what you are passing in from curl, p$
        # in the indicator code
        # fmt=json&tags=botnet%2Cphishing%2Cmalware%2Cscanner%2Cbruteforce%2Cdarknet&itype=ipv4"
        f = open("/home/cif/palo_debug.txt", "a")
        filters = {
            #'indicators': 'example.com',
            'tags': 'botnet,phishing,malware,scanner,bruteforce,darknet',
            'itype': 'ipv4'
        }
        f.write("This is the router address: ")
        f.write(str(ROUTER_ADDR))
        # get information from the database using the same structure used in indicators
        #cli = Client('https://localhost:5000',token=os.getenv('CIF_TOKEN'), verify_ssl=False)
        f.write("\nThis is what CLI is: ")
        #f.write(str(cli))
        #f.close()
        # result from the database is returned as an object here
        with Client(ROUTER_ADDR, os.getenv('CIF_TOKEN')) as client:
            results = client.indicators_search(filters)
        f.write(str(results))
        f.close()
There's a big issue with using StringIO instead of writing to a file currently. We currently return the file when the REST call ends, but we close the file earlier. This is an issue with StringIO because when you close it, the buffer is discarded and no longer accessible. We could leave it open and just return, but that's never a good idea.
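One possible way around this is to copy the text out of the buffer before closing it and return the copy. Just a sketch of the idea, not tested against the endpoint; render_page is a made-up name.

```python
import io


def render_page(indicators):
    """Write indicators to an in-memory buffer and return the text before closing it."""
    buffer = io.StringIO()
    try:
        for indicator in indicators:
            buffer.write(indicator + "\n")
        # Copy the contents out while the buffer is still open.
        return buffer.getvalue()
    finally:
        # The buffer is always closed; the returned string is unaffected.
        buffer.close()
```

The returned string stays valid after the close, so it could then be handed back as the response body instead of a file.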
CIF Installation and Commands Notes
Installation
The installation notes specifically for the Docker installation strategy worked very well for me. I will provide a few pointers I gathered along the way:
- Ubuntu 16.04 Server (example: Ubuntu 16.04 Desktop doesn't work)
- docker pull csirtgadgets/verbose-robot
- export your Maxmind credentials, or (as I did) put them in your .bashrc file and source it
- CIF 4 is running in a docker container. You still need to shell into the container in order to install additional software.
- pip install 'cifsdk>=4.0.0a0'
Now that you have the Python CIF SDK installed, you should be good to go! Be sure that all your CIF commands are run within the docker container.
Usage / Commands
cif. The client specifies a number of different options and arguments in order to get the desired response. Good Examples Here.
--tags now
The --feed option as specified here is unrecognized. Apparently that is how you pull data from feeds in CIF?