Open MikeKemmerer opened 6 years ago
I will look into those options. I don't have a database of that size to hand so I'll need to build one.
I was able to achieve much faster lookups by using the translate filter with a large YAML dictionary that I refresh periodically via a separate pipeline: a jdbc input, a ruby filter to assemble a JSON object and sanitize the data, and a file output writing to a ramdisk.
I'm seeing upwards of 18k EPS with this setup.
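For anyone wanting to replicate this, here is a minimal sketch of the two-pipeline setup described above. All connection details, paths, and field names are illustrative, not from my actual config, and the file output appends on each run, so a real refresh pipeline needs to rotate or truncate the dictionary file:

```
# Refresh pipeline: pull the lookup table on a schedule and write it
# as a YAML dictionary to a ramdisk path the translate filter can read.
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://db.example.com/lookups"  # illustrative
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_user => "logstash"
    schedule => "*/15 * * * *"
    statement => "SELECT ip, descr FROM servers"
  }
}
output {
  file {
    path => "/dev/shm/servers.yml"   # ramdisk
    # One '"key": "value"' line per row is a valid YAML mapping.
    codec => line { format => '"%{ip}": "%{descr}"' }
  }
}

# Main pipeline: enrich events from the dictionary instead of jdbc_static.
filter {
  translate {
    source => "[source][ip]"                # illustrative field names
    target => "[server][description]"
    dictionary_path => "/dev/shm/servers.yml"
    refresh_interval => 900
  }
}
```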
@MikeKemmerer Did you use `index_columns` to add an index on your lookup columns?
Yes. I have even tried sorting the column we're using for lookup.
Good to know, thanks for that.
I'm having the same sort of throughput issue with this plugin. I need to enrich events at 10k/s (across multiple LS nodes). It doesn't appear to be CPU-bound: the process consumes only about 10% CPU on a 16-CPU system while running, so I wonder whether improvements in how it uses threads could make better use of the available cores. The source table for the lookup is a little under 3 million rows.
Ok. I'm now looking into this. I know that we can make improvements.
When I set workers to 2 and batch size to 1 and log the QueryPlan to Derby's logs, I see this...
Index Scan ResultSet for SERVERS using index SERVERS_IP_INDEX at read committed isolation level using instantaneous share row locking chosen by the optimizer
from each worker thread.
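For anyone wanting to reproduce this, Derby's query-plan output like the dump below can be enabled with its documented runtime-statistics properties, set in `derby.properties` or as JVM `-D` system properties:

```
# derby.properties
derby.language.logQueryPlan=true      # write each query plan to derby.log
derby.language.logStatementText=true  # also log the text of each statement
```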
Wed Jul 25 17:38:09 BST 2018 Thread[Ruby-0-Thread-5@[main]>worker1: :1,5,main] (XID = 229), (SESSIONID = 7), select descr as description from servers WHERE ip = '10.2.3.40' ******* Project-Restrict ResultSet (3):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
restriction time (milliseconds) = 0
projection time (milliseconds) = 0
optimizer estimated row count: 1.00
optimizer estimated cost: 13.20
Source result set:
Index Row to Base Row ResultSet for SERVERS:
Number of opens = 1
Rows seen = 1
Columns accessed from heap = {0, 1}
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 1.00
optimizer estimated cost: 13.20
Index Scan ResultSet for SERVERS using index SERVERS_IP_INDEX at read committed isolation level using instantaneous share row locking chosen by the optimizer
Number of opens = 1
Rows seen = 1
Rows filtered = 0
Fetch Size = 16
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=2
Number of deleted rows visited=0
Number of pages visited=1
Number of rows qualified=1
Number of rows visited=2
Scan type=btree
Tree height=-1
start position:
>= on first 1 column(s).
Ordered null semantics on the following columns:
stop position:
> on first 1 column(s).
Ordered null semantics on the following columns:
qualifiers:
None
optimizer estimated row count: 1.00
optimizer estimated cost: 13.20
Wed Jul 25 17:38:09 BST 2018 Thread[Ruby-0-Thread-4@[main]>worker0: :1,5,main] (XID = 228), (SESSIONID = 5), select descr as description from servers WHERE ip = '10.2.3.40' ******* Project-Restrict ResultSet (3):
Number of opens = 1
Rows seen = 1
Rows filtered = 0
restriction = false
projection = true
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
restriction time (milliseconds) = 0
projection time (milliseconds) = 0
optimizer estimated row count: 1.00
optimizer estimated cost: 13.20
Source result set:
Index Row to Base Row ResultSet for SERVERS:
Number of opens = 1
Rows seen = 1
Columns accessed from heap = {0, 1}
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
optimizer estimated row count: 1.00
optimizer estimated cost: 13.20
Index Scan ResultSet for SERVERS using index SERVERS_IP_INDEX at read committed isolation level using instantaneous share row locking chosen by the optimizer
Number of opens = 1
Rows seen = 1
Rows filtered = 0
Fetch Size = 16
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0
scan information:
Bit set of columns fetched=All
Number of columns fetched=2
Number of deleted rows visited=0
Number of pages visited=1
Number of rows qualified=1
Number of rows visited=2
Scan type=btree
Tree height=-1
start position:
>= on first 1 column(s).
Ordered null semantics on the following columns:
stop position:
> on first 1 column(s).
Ordered null semantics on the following columns:
qualifiers:
None
optimizer estimated row count: 1.00
optimizer estimated cost: 13.20
@guyboertje I suspect I am experiencing a similar problem (using Logstash 7.1.1). I need to process about 7500 events per second per Logstash instance, and I have 32 workers running in this pipeline per instance. In my case, I have two database tables that I use for enrichment: one with 1200 rows, another with 2500 rows. So my translation tables are quite small compared to what others need, yet performance is still inadequate. I have another stream that produces about 125-130 events per second, and jdbc_static is able to keep up with enriching that one.
If you have suggestions on how I can help you investigate this issue, I'd like to hear them.
I've found that the translate filter is far better for performing real-time joins on data. We're cranking through over 7000 events per second with multiple translate filters of tens of thousands of rows each. It did take writing a Python script to periodically dump the database into a JSON-formatted file that is compatible with the translate filter. Happy to talk 1:1 if you'd like: https://aka.ms/MichaelKemmerer
@MikeKemmerer thank you for the suggestion, I am going to go with a variation on your approach.
(edit): yes, this approach is working well for 8000 events per second per Logstash instance.
What is the expected event rate for this filter? I'm having event-rate trouble when using it for enrichment. We fetch around 475,000 records from our source database, index the column used in the WHERE clause, limit the number of columns in the local db, and have varied the number of workers. Even so, we are unable to top 500 events per second for any sustained period, and we need to sustain at least 3000 EPS per Logstash node in our current deployment. We have 16 cores and the heap size set to 28g, but CPU utilization sits around 30% even when setting workers to several times the core count.
After diving into the Derby docs: is a shared lock likely to improve performance of the SELECT statements in Derby? Or perhaps turning the lookup "query" into a prepared statement?