elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

[Filebeat][httpjson input] Possibility to chain requests #22590

Closed P1llus closed 2 years ago

P1llus commented 3 years ago

This is an issue to track a feature that would open up even more possibilities for the httpjson input: chained requests.

This is meant as a possible "next step" after https://github.com/elastic/beats/pull/22320.

A chained request would let you specify an initial request, extract a value (or a list of values) from its response, and reuse that value in the URL or body of a follow-up request. The response to the follow-up request is what ends up in the resulting document sent to Elasticsearch.

An example would be threat intelligence, using Anomali's Limo. Since it exposes multiple URLs, it would be good to first query the API endpoint that lists all collections, and then send a request to each of them.

Request:

GET - https://limo.anomali.com/api/v1/taxii2/feeds/collections/

Response:

{
    "collections": [
        {
            "can_read": true,
            "can_write": false,
            "description": "",
            "id": "107",
            "title": "Phish Tank"
        },
        {
            "can_read": true,
            "can_write": false,
            "description": "",
            "id": "135",
            "title": "Abuse.ch Ransomware IPs"
        },
        {
            "can_read": true,
            "can_write": false,
            "description": "",
            "id": "136",
            "title": "Abuse.ch Ransomware Domains"
        }]
}

From that response, I would like to call a URL for each of the `id` fields, like:

GET - https://limo.anomali.com/api/v1/taxii2/feeds/collections/107/objects
GET - https://limo.anomali.com/api/v1/taxii2/feeds/collections/135/objects
GET - https://limo.anomali.com/api/v1/taxii2/feeds/collections/136/objects
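To make the fan-out above concrete, here is a minimal Python sketch that derives the follow-up URLs from the first response. It only illustrates the idea and is not how the httpjson input works internally; the function name `followup_urls` is hypothetical, and the sample response is trimmed to the fields that matter.

```python
import json
from typing import List

BASE_URL = "https://limo.anomali.com/api/v1/taxii2/feeds/collections/"


def followup_urls(first_response: str, base_url: str = BASE_URL) -> List[str]:
    """Build one follow-up URL per collection ID found in the first response."""
    payload = json.loads(first_response)
    # A missing "collections" key is treated as "no follow-up requests".
    return [f"{base_url}{c['id']}/objects" for c in payload.get("collections", [])]


# Trimmed copy of the response shown above.
sample = json.dumps({
    "collections": [
        {"id": "107", "title": "Phish Tank"},
        {"id": "135", "title": "Abuse.ch Ransomware IPs"},
        {"id": "136", "title": "Abuse.ch Ransomware Domains"},
    ]
})

for url in followup_urls(sample):
    print("GET -", url)
```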

However, there might be use cases in which the variable would be used in the request body rather than in the URL.
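A body-based variant could look roughly like the sketch below, which substitutes each extracted value into a JSON body template. The `{{id}}` placeholder syntax, the `render` helper, and the shape of the body are all assumptions made up for illustration; no real API is implied.

```python
import json
from typing import Any

PLACEHOLDER = "{{id}}"  # hypothetical placeholder syntax


def render(node: Any, value: str) -> Any:
    """Return a copy of the template with the placeholder replaced in every string."""
    if isinstance(node, str):
        return node.replace(PLACEHOLDER, value)
    if isinstance(node, dict):
        return {key: render(child, value) for key, child in node.items()}
    if isinstance(node, list):
        return [render(child, value) for child in node]
    return node  # numbers, booleans and None are passed through unchanged


# Hypothetical body for the follow-up request.
body_template = {"query": {"collection": "{{id}}"}, "limit": 100}

# One request body per ID extracted from the first response.
bodies = [json.dumps(render(body_template, cid)) for cid in ["107", "135", "136"]]
```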

We might also need to separate the settings of the initial request from those of the follow-up requests. For example, the initial request might not need authentication, while its response would be used in an Auth header for the follow-up requests, so the two would need different request settings.
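One way to picture the split between the two sets of settings is the Python sketch below. The `RequestSettings` class, the `with_auth` helper, the `token` field in the first response, the `Bearer` scheme, and the example.com URLs are all hypothetical; they stand in for whatever the real configuration would expose.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class RequestSettings:
    url: str
    method: str = "GET"
    headers: Dict[str, str] = field(default_factory=dict)


def with_auth(settings: RequestSettings, first_response: dict,
              token_field: str = "token") -> RequestSettings:
    """Return follow-up settings carrying a token taken from the first response."""
    headers = dict(settings.headers)  # copy so the original settings stay untouched
    headers["Authorization"] = f"Bearer {first_response[token_field]}"
    return replace(settings, headers=headers)


# The initial request is unauthenticated; follow-ups reuse its response.
initial = RequestSettings(url="https://api.example.com/login", method="POST")
followup = RequestSettings(url="https://api.example.com/items")

authed = with_auth(followup, {"token": "abc123"})
print(authed.headers)
```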

Currently I don't see a need to set transforms separately for each request after the initial one; they can all share the same transforms for now.

Another use case would be to call the VirusTotal API to get a list of file IDs, and then contact another API per file ID to gather more details about each of them. Maybe @dcode could elaborate a bit more with API examples on this one?

elasticmachine commented 3 years ago

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

P1llus commented 3 years ago

@marc-gr Just wanted to ping you on this so that we have a reference issue, but it's for the future and not high priority right now.

botelastic[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

RichiCoder1 commented 2 years ago

I'm sure there are already use cases out there, but here is one more example: I'd like to ingest Workflow Usage data from GitHub.

However, currently that requires you:

All while respecting Rate Limits.

This may be a lot to ask of Filebeat/beats, but it'd be awesome if it "just worked"!

andrewkroh commented 2 years ago

The feature was implemented in https://github.com/elastic/beats/pull/29816.