logstash-plugins / logstash-filter-jdbc_static

Loads a DB query's result set into memory and uses it as a lookup table for events.
Apache License 2.0

Slow performance with event rate > 500 EPS #27

Open MikeKemmerer opened 6 years ago

MikeKemmerer commented 6 years ago

What is the expected event rate for this filter? I'm having throughput issues when using it for enrichment. We fetch around 475,000 records from our source database, index the column used in the WHERE clause, limit the number of columns in the local DB, and have varied the number of workers. Even so, we are unable to top 500 events per second for any sustained period of time, and we need to sustain at least 3000 EPS per Logstash node in our current deployment. The machines have 16 cores and a 28g heap, yet CPU utilization sits around only 30% even with workers set to N x cores.

After diving into the Derby docs: would a shared lock be likely to improve the performance of the SELECT statements in Derby? Or perhaps creating a prepared statement from the "query"?

guyboertje commented 6 years ago

I will look into those options. I don't have a database of that size to hand so I'll need to build one.

MikeKemmerer commented 6 years ago

I was able to accomplish a much faster lookup by using translate with a gigantic yaml dictionary that I periodically refresh via a separate pipeline: jdbc input, ruby filter to assemble a json object and sanitize the data, and file output to ramdisk.

I'm seeing upwards of 18k EPS with this setup.
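For anyone trying the same workaround, the enrichment side might look like the sketch below. The field names, dictionary path, and refresh interval are assumptions about a typical setup, not taken from my config; note also that older versions of the translate filter named these options field/destination instead of source/target.

```
filter {
  translate {
    source => "[from_ip]"                      # event field holding the lookup key (hypothetical)
    target => "[server][description]"
    dictionary_path => "/dev/shm/servers.yml"  # ramdisk-backed dictionary file
    refresh_interval => 300                    # re-read the dictionary every 5 minutes
    fallback => "unknown"
  }
}
```

The refresh pipeline (jdbc input, ruby filter, file output) then only has to keep that dictionary file up to date; the lookup itself becomes an in-memory hash access.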

guyboertje commented 6 years ago

@MikeKemmerer

Did you use index_columns to add an index on your lookup columns?
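For reference, index_columns is set on the local table definition in local_db_objects. A minimal sketch, with table/column names and the event field chosen for illustration (the remote JDBC connection settings are elided):

```
filter {
  jdbc_static {
    loaders => [
      {
        id => "remote-servers"
        query => "SELECT ip, descr FROM ref.servers"   # hypothetical remote table
        local_table => "servers"
      }
    ]
    local_db_objects => [
      {
        name => "servers"
        index_columns => ["ip"]                        # index the lookup column
        columns => [
          ["ip", "varchar(15)"],
          ["descr", "varchar(255)"]
        ]
      }
    ]
    local_lookups => [
      {
        query => "SELECT descr AS description FROM servers WHERE ip = :ip"
        parameters => { ip => "[from_ip]" }
        target => "server"
      }
    ]
    # plus jdbc_driver_class, jdbc_connection_string, etc. for the remote source
  }
}
```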

MikeKemmerer commented 6 years ago

Yes. I have even tried sorting the column we're using for lookup.



guyboertje commented 6 years ago

Good to know, thanks for that.

steevo590 commented 6 years ago

I'm having the same sort of throughput issues with this plugin: I need to enrich events at 10k/s (across multiple LS nodes). It doesn't appear to be CPU-bound, as the plugin only consumes about 10% CPU on a 16-CPU system while running, so I'm wondering whether there is some improvement to be made in how it uses threads so it can exploit the available CPU. The source table for the lookup is a little under 3 million rows.

guyboertje commented 6 years ago

Ok. I'm now looking into this. I know that we can make improvements.

guyboertje commented 6 years ago

When I set workers to 2 and batch size to 1 and log the QueryPlan to Derby's logs, I see this...

Index Scan ResultSet for SERVERS using index SERVERS_IP_INDEX at read committed isolation level using instantaneous share row locking chosen by the optimizer

from each worker thread.

Wed Jul 25 17:38:09 BST 2018 Thread[Ruby-0-Thread-5@[main]>worker1: :1,5,main] (XID = 229), (SESSIONID = 7), select descr as description from servers WHERE ip = '10.2.3.40' ******* Project-Restrict ResultSet (3):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 0
    close time (milliseconds) = 0
    restriction time (milliseconds) = 0
    projection time (milliseconds) = 0
    optimizer estimated row count: 1.00
    optimizer estimated cost: 13.20
Source result set:
    Index Row to Base Row ResultSet for SERVERS:
    Number of opens = 1
    Rows seen = 1
    Columns accessed from heap = {0, 1}
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count: 1.00
        optimizer estimated cost: 13.20
        Index Scan ResultSet for SERVERS using index SERVERS_IP_INDEX at read committed isolation level using instantaneous share row locking chosen by the optimizer
        Number of opens = 1
        Rows seen = 1
        Rows filtered = 0
        Fetch Size = 16
            constructor time (milliseconds) = 0
            open time (milliseconds) = 0
            next time (milliseconds) = 0
            close time (milliseconds) = 0
            next time in milliseconds/row = 0

        scan information:
            Bit set of columns fetched=All
            Number of columns fetched=2
            Number of deleted rows visited=0
            Number of pages visited=1
            Number of rows qualified=1
            Number of rows visited=2
            Scan type=btree
            Tree height=-1
            start position:
                >= on first 1 column(s).
                Ordered null semantics on the following columns: 
            stop position:
                > on first 1 column(s).
                Ordered null semantics on the following columns: 
            qualifiers:
                None
            optimizer estimated row count: 1.00
            optimizer estimated cost: 13.20

Wed Jul 25 17:38:09 BST 2018 Thread[Ruby-0-Thread-4@[main]>worker0: :1,5,main] (XID = 228), (SESSIONID = 5), select descr as description from servers WHERE ip = '10.2.3.40' ******* Project-Restrict ResultSet (3):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
    constructor time (milliseconds) = 0
    open time (milliseconds) = 0
    next time (milliseconds) = 0
    close time (milliseconds) = 0
    restriction time (milliseconds) = 0
    projection time (milliseconds) = 0
    optimizer estimated row count: 1.00
    optimizer estimated cost: 13.20
Source result set:
    Index Row to Base Row ResultSet for SERVERS:
    Number of opens = 1
    Rows seen = 1
    Columns accessed from heap = {0, 1}
        constructor time (milliseconds) = 0
        open time (milliseconds) = 0
        next time (milliseconds) = 0
        close time (milliseconds) = 0
        optimizer estimated row count: 1.00
        optimizer estimated cost: 13.20
        Index Scan ResultSet for SERVERS using index SERVERS_IP_INDEX at read committed isolation level using instantaneous share row locking chosen by the optimizer
        Number of opens = 1
        Rows seen = 1
        Rows filtered = 0
        Fetch Size = 16
            constructor time (milliseconds) = 0
            open time (milliseconds) = 0
            next time (milliseconds) = 0
            close time (milliseconds) = 0
            next time in milliseconds/row = 0

        scan information:
            Bit set of columns fetched=All
            Number of columns fetched=2
            Number of deleted rows visited=0
            Number of pages visited=1
            Number of rows qualified=1
            Number of rows visited=2
            Scan type=btree
            Tree height=-1
            start position:
                >= on first 1 column(s).
                Ordered null semantics on the following columns: 
            stop position:
                > on first 1 column(s).
                Ordered null semantics on the following columns: 
            qualifiers:
                None
            optimizer estimated row count: 1.00
            optimizer estimated cost: 13.20
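(For anyone wanting to reproduce this: the plans above come from Derby's built-in query-plan logging, enabled via system properties. Passing them through Logstash's JVM options is an assumption about your setup; Derby appends the plans to the local database's derby.log.)

```
# In config/jvm.options (or LS_JAVA_OPTS):
-Dderby.language.logQueryPlan=true
-Dderby.language.logStatementText=true
```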
TomJohnson-Syncbak commented 4 years ago

@guyboertje I suspect I am experiencing a similar problem (using Logstash 7.1.1). I need to process about 7500 events per second per Logstash instance, and I have 32 workers running in this pipeline per instance. In my case, I have two database tables that I use for enrichment: one with 1200 rows, another with 2500 rows. So my translation tables are quite small compared to what others need, yet performance is still inadequate. I have another stream that produces about 125-130 events per second, and jdbc_static is able to keep up with enriching that one.

If you have suggestions on how I can help you investigate this issue, I'd like to hear them.

MikeKemmerer commented 4 years ago

I've found that the translate filter is far better for performing real-time joins on data. We're cranking away at over 7000 events per second with multiple translate filters of tens of thousands of rows each. It did take writing a Python script to periodically dump the database into a json formatted file that is compatible with the translate filter. Happy to talk 1:1 if you'd like: https://aka.ms/MichaelKemmerer
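The dump script can be quite small. The sketch below is a hypothetical stand-in, not my actual script: it uses an in-memory sqlite3 table in place of the real source database and writes the flat {key: value} JSON object that the translate filter's dictionary_path expects, via a temp-file rename so Logstash never reads a half-written file.

```python
import json
import os
import sqlite3
import tempfile

def dump_lookup(conn, query, dest_path):
    """Run a two-column (key, value) query and write it as a flat JSON dict."""
    rows = conn.execute(query).fetchall()
    mapping = {str(k): str(v) for k, v in rows}
    # Write to a temp file, then rename atomically so the translate filter
    # never refreshes from a partially written dictionary.
    tmp = dest_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(mapping, f)
    os.replace(tmp, dest_path)
    return mapping

# Illustration with a throwaway in-memory table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE servers (ip TEXT, descr TEXT)")
conn.executemany("INSERT INTO servers VALUES (?, ?)",
                 [("10.2.3.40", "web server"), ("10.2.3.41", "db server")])
dest = os.path.join(tempfile.gettempdir(), "servers.json")
result = dump_lookup(conn, "SELECT ip, descr FROM servers", dest)
```

Pointing dictionary_path at the output file (ideally on a ramdisk) and scheduling the script with cron closes the loop.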



TomJohnson-Syncbak commented 4 years ago

@MikeKemmerer thank you for the suggestion, I am going to go with a variation on your approach.

(edit): yes, this approach is working well for 8000 events per second per Logstash instance.