armbues / ioc_parser

Tool to extract indicators of compromise from security reports in PDF format
MIT License
428 stars 171 forks

Whitelist not working #24

Open ciphercodes opened 8 years ago

ciphercodes commented 8 years ago

I have a PDF document that includes an AOL email address in the following format (sample): abcd[.]aol[.]com. I am running the iocp parser without any options or flags, and the output lists aol.com as a Host: /ioc/Sample.pdf 2 Host aol.com

I verified that aol.com$ is listed in whitelist_Host.ini. I also added @aol.com in whitelist_Email.ini but my output file still lists "Host aol.com".

RoberticoRdk commented 6 years ago

Edit the file /usr/lib/python2.7/dist-packages/iocp/Parser.py. The load_whitelists method there currently reads:

    def load_whitelists(self, fpath):
        whitelist = {}

        searchdir = os.path.join(fpath, "/whitelist_*.ini")
        print searchdir
        fpaths = glob.glob(searchdir)
        for fpath in fpaths:
            t = os.path.splitext(os.path.split(fpath)[1])[0].split('_',1)[1]
            patterns = [line.strip() for line in open(fpath)]
            whitelist[t]  = [re.compile(p) for p in patterns]
        return whitelist

Remove the leading forward slash from the pattern: searchdir = os.path.join(fpath, "whitelist_*.ini")

The later arguments to os.path.join shouldn't start with a slash. If one does, it is treated as an absolute path and every component before it is discarded, so the glob looks for whitelist files in the filesystem root and finds none.
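
A quick way to see the difference is to compare the two joins directly. This is only a minimal sketch: the base directory below is a hypothetical placeholder (the real whitelist directory is not given in this thread), and posixpath is used instead of os.path so the result is the same on any platform.

```python
import posixpath

# Hypothetical whitelist directory, for illustration only.
base = "/path/to/whitelists"

# A later component that starts with a slash is treated as absolute,
# so the base directory is silently dropped.
broken = posixpath.join(base, "/whitelist_*.ini")

# Without the leading slash, the component is appended to the base.
fixed = posixpath.join(base, "whitelist_*.ini")

print(broken)  # /whitelist_*.ini
print(fixed)   # /path/to/whitelists/whitelist_*.ini
```

With the broken pattern, glob.glob would presumably match nothing, so no whitelist patterns get loaded and hosts such as aol.com are reported anyway.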