InQuest / ThreatIngestor

Extract and aggregate threat intelligence.
https://inquest.readthedocs.io/projects/threatingestor/
GNU General Public License v2.0

Create a queue-based source/operator that can be used as a simpler alternative to SQS #40

Closed by rshipp 5 years ago

rshipp commented 6 years ago

Like SQS, but for named pipes.

This makes it easier for people to get set up without needing an AWS account and SQS queues already created.

Base off of #23. Rebase on top of #23 once that issue is done. I think this one is an easier starting point so we don't have to worry about SQS.

Basic Design Proposal

Source

Operator

needmorecowbell commented 5 years ago

https://stackoverflow.com/questions/39089776/python-read-named-pipe

Two files should be created: one reads from the pipe, the other writes to it (source/operator). They would be handled similarly to SQS, but without the need to wait for the lock to expire.
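A minimal sketch of that reader/writer split, assuming a POSIX system where `os.mkfifo` is available. The function names, the pipe filename, and the sample artifacts are hypothetical and not part of ThreatIngestor; a thread stands in for the second process so the example runs on its own.

```python
import os
import tempfile
import threading


def write_artifacts(fifo_path, artifacts):
    """Operator side: write one artifact per line to the named pipe."""
    # open() blocks here until something opens the pipe for reading.
    with open(fifo_path, "w") as pipe:
        for artifact in artifacts:
            pipe.write(artifact + "\n")


def read_artifacts(fifo_path):
    """Source side: return every line written before the writer closes."""
    # open() blocks here until something opens the pipe for writing.
    with open(fifo_path) as pipe:
        return [line.rstrip("\n") for line in pipe]


def demo():
    """Run both sides in one process, with the writer in a thread."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo_path = os.path.join(tmp, "threatingestor.fifo")  # hypothetical name
        os.mkfifo(fifo_path)  # POSIX only
        writer = threading.Thread(
            target=write_artifacts,
            args=(fifo_path, ["example.com", "198.51.100.7"]),
        )
        writer.start()
        received = read_artifacts(fifo_path)
        writer.join()
        return received
```

Both sides have to be running at the same time for this to work, which is the main difference from a real queue that stores messages until they are collected.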

Look at this for retrieval (source):

    import boto3

    # Connect to SQS and look up the queue by name.
    sqs = boto3.resource('sqs', region_name=REGION_NAME, aws_access_key_id=AWS_ACCESS_KEY_ID, aws_secret_access_key=AWS_SECRET_ACCESS_KEY)
    queue = sqs.get_queue_by_name(QueueName=QUEUE_NAME)

    # Fetch whatever messages are currently available.
    messages = queue.receive_messages()
    print(f"Retrieved #{len(messages)}")
    for message in messages:
        pass  # handle message.body here

and this for creation (operator):

    import json

    import boto3

    sqs = boto3.client('sqs', region_name=REGION_NAME, aws_access_key_id=AWS_ACCESS_KEY_ID, aws_secret_access_key=AWS_SECRET_ACCESS_KEY)

    def sendTask(body):
        # Look up the queue URL, then send the task as a JSON message.
        queue_url = sqs.get_queue_url(QueueName=QUEUE_NAME)['QueueUrl']
        content = json.dumps({'tasklink': body})
        sqs.send_message(
            QueueUrl=queue_url,
            DelaySeconds=0,
            MessageBody=content
        )

rshipp commented 5 years ago

We looked into this today and realized named pipes won't be viable for this purpose: a writer can't write to one unless a reader has it open at the same time, and per-line reading isn't really a supported use case. Options to consider:

  1. Look into using UNIX sockets instead.
    • Disadvantages: Still meant for structured datagram transmission, not newline-separated, so we'd have to work around that; not cross-platform, though Windows might have something similar to sockets we could use.
  2. Find a cross platform queuing server with easy setup to serve as a simpler alternative to SQS.
    • Disadvantages: Non-native, requires additional third-party software; non-zero setup time involved to stand up a server.

deadbits commented 5 years ago

What about something like beanstalkd, with greenstalk as the Python library for interacting with it?

It looks pretty easy for users to set up beanstalkd if they want to use it, and the Python library looks a lot like Python's Queue, with queue.put(job), or next_job = queue.reserve(), etc.

I was first going to say RabbitMQ, but beanstalkd looks simpler for the user to just stand up and go, and easier to interact with from the code side.
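A rough sketch of what the source/operator pair could look like on top of greenstalk, assuming a beanstalkd server on its default port 11300. The helper names `connect`, `send_task`, and `receive_task` are hypothetical, and the `tasklink` payload is borrowed from the SQS snippet earlier in the thread. Only `greenstalk.Client`, `put`, `reserve`, and `delete` come from the library.

```python
import json


def connect(host="127.0.0.1", port=11300):
    """Open a connection to a beanstalkd server (11300 is its default port)."""
    import greenstalk  # third-party: pip install greenstalk

    return greenstalk.Client((host, port))


def send_task(client, body):
    """Operator side: enqueue one JSON-encoded job."""
    client.put(json.dumps({"tasklink": body}))


def receive_task(client, timeout=5):
    """Source side: reserve one job, decode it, then delete it from the queue.

    With a real greenstalk client, reserve() raises greenstalk.TimedOut when
    no job arrives within `timeout` seconds; that is left to the caller here.
    """
    job = client.reserve(timeout=timeout)
    task = json.loads(job.body)
    client.delete(job)  # without this the job returns to the queue after its TTR
    return task
```

Typical use would be `client = connect()`, then `send_task(client, url)` in the operator and `receive_task(client)` in the source. Unlike a named pipe, the two sides do not need to be running at the same time.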

needmorecowbell commented 5 years ago

We looked at beanstalkd. Now that we know FIFOs don't work for this, it might be a good alternative. While SQS works fine now that in-flight delays are set to 0, I think having something a little simpler that doesn't depend on a separate registered account is a good move.
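The FIFO limitation referred to here can be reproduced in a few lines. This is an illustrative sketch only, assuming a POSIX system; the function name and the temporary path are made up for the example. It opens a named pipe for writing in non-blocking mode while no reader is attached, which fails with `ENXIO`. A blocking open would instead hang until a reader appeared.

```python
import errno
import os
import tempfile


def try_write_without_reader(message=b"example.com\n"):
    """Return the errno raised when writing to a FIFO nobody is reading.

    Returns None if the write unexpectedly succeeds.
    """
    with tempfile.TemporaryDirectory() as tmp:
        fifo_path = os.path.join(tmp, "demo.fifo")  # hypothetical name
        os.mkfifo(fifo_path)  # POSIX only
        try:
            # Without O_NONBLOCK this open() would block until a reader appears.
            fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            return exc.errno
        try:
            os.write(fd, message)
        finally:
            os.close(fd)
        return None


if __name__ == "__main__":
    code = try_write_without_reader()
    print(errno.errorcode.get(code, code))
```

Nothing is stored in the pipe while the reader is away, so an operator could not simply drop artifacts into it for a source to pick up later.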

needmorecowbell commented 5 years ago

Beanstalkd doesn't have native Windows support. This might be a fix for that: https://github.com/LomoX-Offical/beanstalkd-win/tree/iocp

Comes with a compiled exe -- but hasn't been worked on in 3 years.