medic / cht-sync

Data synchronization between CouchDB and PostgreSQL for the purpose of analytics.
GNU General Public License v3.0

Use json_batch in Logstash HTTP output #101

Status: Closed (njuguna-n closed 1 month ago)

njuguna-n commented 1 month ago

Test json_batch with Logstash's HTTP output plugin and compare its performance against using Redis.
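
For context, with `format => "json_batch"` the Logstash HTTP output places each batch of events into a single JSON array and sends it in one request, rather than one request per event. The sketch below only illustrates that payload shape; the event fields and the helper names (`batches`, `json_batch_body`) are hypothetical, and the batch size of 125 is taken from the pipeline settings reported later in this thread.

```python
import json
from typing import Iterator, List


def batches(events: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` events."""
    for start in range(0, len(events), size):
        yield events[start:start + size]


def json_batch_body(batch: List[dict]) -> str:
    """Serialize one batch as a single JSON array (one HTTP request body)."""
    return json.dumps(batch)


# Hypothetical events; real CouchDB change documents have more fields.
events = [{"_id": f"doc-{i}", "seq": i} for i in range(300)]

# 300 events at 125 per batch become 3 request bodies instead of 300.
bodies = [json_batch_body(b) for b in batches(events, 125)]
print(len(bodies))
print([len(json.loads(b)) for b in bodies])
```

The receiving HTTP endpoint therefore has to accept a JSON array per request, which is the main difference from the default per-event `json` format.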

njuguna-n commented 1 month ago

Below are the Logstash throughput results when using json_batch as the output format in the HTTP plugin versus using the Redis plugin. The caveat is that these numbers come from only a few minutes of testing, but the results are very similar and I do not expect a major variance even with longer test runs.

Given the results below, I think it would be best to remove the Redis dependency and revert to using the HTTP plugin, which simplifies the stack and reduces the number of services we need. We might reintroduce Redis to keep track of the sequence token, but that can be done in that ticket.

With HTTP plugin

{
  "host" : "cht-sync-logstash-66f7cb88c7-fmn97",
  "version" : "8.11.1",
  "http_address" : "0.0.0.0:9600",
  "id" : "0ea2baee-0c7b-4e16-bad7-da0a553756b7",
  "name" : "cht-sync-logstash-66f7cb88c7-fmn97",
  "ephemeral_id" : "765b39eb-0a2c-486b-a3e1-292cf769d2e0",
  "status" : "green",
  "snapshot" : false,
  "pipeline" : {
    "workers" : 8,
    "batch_size" : 125,
    "batch_delay" : 50
  },
  "flow" : {
    "input_throughput" : {
      "current" : 1031.0,
      "lifetime" : 918.3
    },
    "filter_throughput" : {
      "current" : 1039.0,
      "lifetime" : 890.2
    },
    "output_throughput" : {
      "current" : 1039.0,
      "lifetime" : 890.2
    },
    "queue_backpressure" : {
      "current" : 0.04235,
      "lifetime" : 0.03563
    },
    "worker_concurrency" : {
      "current" : 8.0,
      "lifetime" : 6.815
    }
  }
}

With Redis

{
  "host" : "cht-sync-logstash-57577f489f-f9w89",
  "version" : "8.11.1",
  "http_address" : "0.0.0.0:9600",
  "id" : "23c83089-7f2b-4860-ab73-e3f985b58cb4",
  "name" : "cht-sync-logstash-57577f489f-f9w89",
  "ephemeral_id" : "aff3e7cc-efb1-4892-8ccb-6c419d82fa88",
  "status" : "green",
  "snapshot" : false,
  "pipeline" : {
    "workers" : 8,
    "batch_size" : 125,
    "batch_delay" : 50
  },
  "flow" : {
    "input_throughput" : {
      "current" : 864.6,
      "last_1_minute" : 921.5,
      "lifetime" : 816.6
    },
    "filter_throughput" : {
      "current" : 868.0,
      "last_1_minute" : 921.0,
      "lifetime" : 799.6
    },
    "output_throughput" : {
      "current" : 867.9,
      "last_1_minute" : 921.0,
      "lifetime" : 799.6
    },
    "queue_backpressure" : {
      "current" : 0.1923,
      "last_1_minute" : 0.1344,
      "lifetime" : 0.111
    },
    "worker_concurrency" : {
      "current" : 8.0,
      "last_1_minute" : 8.0,
      "lifetime" : 7.274
    }
  }
}

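
To put the two runs side by side, the lifetime throughput figures above can be compared directly. This is a minimal sketch that hard-codes the lifetime values from the two outputs (events per second); the function name `relative_difference` is just an illustrative helper, and since both runs lasted only a few minutes the percentages should be read as rough indications rather than a benchmark.

```python
# Lifetime throughput (events/second) copied from the two outputs above.
http_lifetime = {"input": 918.3, "filter": 890.2, "output": 890.2}
redis_lifetime = {"input": 816.6, "filter": 799.6, "output": 799.6}


def relative_difference(value: float, baseline: float) -> float:
    """Return (value - baseline) / baseline as a fraction."""
    return (value - baseline) / baseline


for stage in http_lifetime:
    pct = relative_difference(http_lifetime[stage], redis_lifetime[stage]) * 100
    print(
        f"{stage}: HTTP {http_lifetime[stage]} vs "
        f"Redis {redis_lifetime[stage]} events/s ({pct:+.1f}%)"
    )
```

On these numbers the HTTP run is roughly 11 to 12 percent ahead on lifetime throughput, so removing Redis does not appear to cost performance in this short test.
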
njuguna-n commented 1 month ago

I will do another round of tests just to confirm the result then create a ticket to remove Redis.

njuguna-n commented 1 month ago

Another round of testing had similar results, so I created this PR to remove Redis. I saw no need for an additional ticket.

njuguna-n commented 1 month ago

This Docker image can be used to test out the HTTP plugin.

njuguna-n commented 1 month ago

PR closed without merging