InQuest / iocextract

Defanged Indicator of Compromise (IOC) Extractor.
https://inquest.readthedocs.io/projects/iocextract/
GNU General Public License v2.0

BUG: --extract-ipv4s does not work #73

Closed: ZeroDot1 closed this issue 1 year ago

ZeroDot1 commented 1 year ago

Unfortunately, it doesn't work. I let it run for quite a while, but apart from keeping one CPU core at 100%, nothing happened and no IPs were written to the output file.

My command:

iocextract --input '/home/user/des.txt' --output '/home/user/k1.txt' --extract-ipv4s

battleoverflow commented 1 year ago

Hi, @ZeroDot1

I just ran a quick assessment and had no issues extracting the IP addresses from the test data.

Here's my dataset ("des.txt"):

asdf192[.]168[.]0[.]1:80/pathtttttt192.168.0.1ssssss192[.]168[.]0[.]1loremipsum

My command:

iocextract --input 'des.txt' --output 'k1.txt' --extract-ipv4s

If possible, could you provide part of the file/data you're attempting to extract from? Or provide an example?

ZeroDot1 commented 1 year ago

Hi @azazelm3dj3d,

Here is a test file with real data. I hope this helps. It could also be that my files are just too big, on average my files are between 20 and 150MB.

tstfile.txt

battleoverflow commented 1 year ago

That may be the issue, but I've definitely tested files of 20-30 MB in the past. Using the command from my previous response, I was able to extract IOCs from your test file without any issues, and it took less than a second to produce the output.

I also just tested a 20 MB file from SecLists full of randomized characters, with only 2 IP addresses in the entire file. It found both IP addresses within a few seconds. Here are the results:

Command: iocextract --input 'des.txt' --output 'k1.txt' --extract-ipv4s 
User: 1.64s
System: 0.03s
CPU: 99%
Total: 1.678s

I'm going to close this issue as complete for now, but I will definitely look into testing larger datasets (50 MB and up) when I have the time. Hopefully you're able to resolve the issue soon. My one recommendation is to cut the file size down if possible and see whether that improves the results.
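
For anyone who wants to try the "cut the file size down" suggestion, here is a minimal sketch of one way to do it, assuming the input is ordinary line-oriented text. The function name `split_file_by_lines`, the `part_NNNN.txt` naming, and the 5 MB default are all made up for illustration and are not part of iocextract. The sketch only breaks at line endings so that no indicator is cut in half; it will not help if the file is essentially one enormous line.

```python
import os


def split_file_by_lines(path, out_dir, max_bytes=5 * 1024 * 1024):
    """Split `path` into pieces of at most roughly `max_bytes` each.

    Pieces are only broken at line endings. A single line longer than
    `max_bytes` is written whole into its own piece.
    Returns the list of piece paths, in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    pieces = []
    out = None
    written = 0
    with open(path, "rb") as src:
        for line in src:
            # Open a new piece if none is open yet, or if adding this
            # line would push a non-empty piece past the size limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                piece_path = os.path.join(out_dir, "part_%04d.txt" % len(pieces))
                pieces.append(piece_path)
                out = open(piece_path, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return pieces
```

Each resulting piece can then be passed to the same command used earlier in this thread, for example `iocextract --input 'parts/part_0000.txt' --output 'k1_0000.txt' --extract-ipv4s`, and the per-piece outputs concatenated afterwards. Running the pieces one at a time also makes it easier to spot whether one particular region of the data is what causes the slowdown.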