logstash-plugins / logstash-input-http_poller

Create Logstash events by polling HTTP endpoints!
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-http_poller.html
Apache License 2.0

Add support for pagination #139

Open tonitert opened 1 year ago

tonitert commented 1 year ago

This pull request adds support for polling a paginated API, where the page number is passed as a query parameter in the URL. Because the amount of data being fetched can be very large in my use case, the PR also supports fetching pages on multiple threads and saving the current page to a file.

An example config can be seen here:

input {
  http_poller {
    urls => {
      example => {
        method => get
        url => "http://localhost:8000/example"
        pagination => {
          start_page => 1
          end_page => 30
          page_parameter => "page"
          concurrent_requests => 4
          last_run_metadata_path => "./last_run_metadata"
        }
        failure_mode => "retry"
        retry_delay => 3
        success_status_codes => [200, 201]
      }
    }
    schedule => { in => "0s" }
    codec => "json"
    keepalive => true
    metadata_target => "http_poller_metadata"
  }
}

This will send requests to http://localhost:8000/example?page=1, http://localhost:8000/example?page=2, and so on through the configured page range.
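For illustration, the URL expansion described above could be sketched roughly like this in Python. This is not the plugin's code (the plugin itself is written in Ruby); the function name `page_urls` is hypothetical, and the sketch assumes that `end_page` is inclusive and that any existing query parameters on the base URL are preserved.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_urls(base_url, page_parameter, start_page, end_page):
    """Return one URL per page, with page_parameter set in the query string."""
    parts = urlsplit(base_url)
    # Keep query parameters already on the URL, minus any existing page parameter.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != page_parameter]
    urls = []
    for page in range(start_page, end_page + 1):  # assumption: end_page is inclusive
        new_query = urlencode(query + [(page_parameter, str(page))])
        urls.append(urlunsplit(parts._replace(query=new_query)))
    return urls


# Mirrors the example config: start_page => 1, end_page => 30, page_parameter => "page"
urls = page_urls("http://localhost:8000/example", "page", 1, 30)
```

With the values from the example config, `urls[0]` is `http://localhost:8000/example?page=1`.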

Pages can be fetched concurrently on multiple threads. The page currently in progress can be saved to a file, so that progress can be restored if Logstash is stopped or crashes. A file path can be specified; otherwise the file is created in the Logstash data directory. You can also choose whether the file is deleted when the job finishes.
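As a rough sketch of how concurrent fetching and a progress file can work together, consider the Python below. It is an illustration under stated assumptions, not the PR's implementation: `run_paginated`, the `fetch` callback, and the JSON layout with a `next_page` key are all hypothetical. The checkpoint stores the lowest page that has not finished yet, so after a restart some pages that completed out of order may be fetched again.

```python
import json
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def run_paginated(fetch, start_page, end_page, concurrent_requests, state_path):
    """Fetch pages with a thread pool, checkpointing progress to state_path."""
    # Resume from the checkpoint if a previous run left one behind.
    if os.path.exists(state_path):
        with open(state_path) as f:
            start_page = json.load(f)["next_page"]

    lock = threading.Lock()
    done = set()
    state = {"next_page": start_page}  # hypothetical file layout

    def save():
        # Write to a temp file and rename, so a crash cannot leave a half-written file.
        tmp = state_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, state_path)

    def work(page):
        result = fetch(page)
        with lock:
            done.add(page)
            # Advance the checkpoint past every contiguous finished page.
            while state["next_page"] in done:
                state["next_page"] += 1
            save()
        return result

    with ThreadPoolExecutor(max_workers=concurrent_requests) as pool:
        # pool.map returns results in page order, even if pages finish out of order.
        return list(pool.map(work, range(start_page, end_page + 1)))
```

Calling `run_paginated(fetch, 1, 30, 4, "./last_run_metadata")` would correspond to the example config; deleting the file once the run completes is left to the caller here.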

If requests start failing partway through the page range, they can be retried. The other failure modes are stopping the input and continuing past the error. Success status codes can be specified, so that only responses with those status codes count as successes.
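The three failure behaviours could be sketched along these lines in Python. Only `"retry"`, `retry_delay`, and `success_status_codes` appear in the example config; the mode names `"stop"` and `"continue"`, the `max_attempts` cap, and the function itself are assumptions made for this illustration.

```python
import time


def fetch_with_policy(request, failure_mode="retry", retry_delay=3,
                      success_status_codes=(200, 201), max_attempts=5,
                      sleep=time.sleep):
    """Call request() and apply a failure policy.

    request() must return a (status, body) tuple. Returns that tuple on
    success, None when a failed page is skipped, and raises RuntimeError
    when the input should stop.
    """
    attempt = 0
    while True:
        attempt += 1
        status, body = request()
        if status in success_status_codes:
            return status, body
        if failure_mode == "retry" and attempt < max_attempts:
            sleep(retry_delay)  # wait before trying the same page again
            continue
        if failure_mode == "continue":  # hypothetical mode name
            return None  # skip this page and move on to the next one
        # Hypothetical "stop" mode, or retries exhausted: abort the run.
        raise RuntimeError(f"request failed with status {status}")
```

The `sleep` parameter exists only so that the delay can be replaced in tests; a real input would also need to treat connection errors as failures, which this sketch leaves out.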

axeh commented 1 year ago

@roaksoax any progress on this?