Open iirekm opened 6 years ago
This functionality makes the jdbc input unusable for my data retrieval process: the count takes so long to run that my connections appear to time out and I keep getting I/O errors back.
EDIT: I took the query for this count and tried to run it manually - I stopped it after it had been running nonstop for 30 minutes; it never finished. Is there a reason this count step is needed? Is there no config/setting we can use to disable it?
Even if I don't configure the jdbc_paging_enabled parameter, it still happens.
This also prevents the JDBC input from working with a number of datastores that provide JDBC drivers but are not fully SQL-compliant, such as Apache Jena.
Is there any update on this? I am having this same issue. This count(*) is causing a heavy query to run twice on our database. Is there any way to disable it?
I'd like to ask the same question. 4.3.11 actually made the problem worse, since the count is now executing in all contexts, not just debug mode.
We're seeing consistent failures because of the runtime of the count query.
Hi, I am having a related (or perhaps the same) issue. I am trying to read from a Firebird database with Logstash using the following configuration:

```
input {
  jdbc {
    jdbc_driver_library => "c:/bug/jaybird-full-3.0.6.jar"
    jdbc_driver_class => "org.firebirdsql.jdbc.FBDriver"
    jdbc_connection_string => "jdbc:firebirdsql://127.0.0.1/D:/DBASE.FDB"
    jdbc_user => "SYSDBA"
    jdbc_password => "pwd12442"
    jdbc_paging_enabled => false
    tracking_column => "unix_ts_in_secs"
    use_column_value => true
    tracking_column_type => "numeric"
    schedule => "*/5 * * * *"
    statement => "SELECT * FROM TABLE1"
  }
}
```
When I analyze the debug information in my Windows terminal, I note that somehow 'SELECT count(*) AS "COUNT" FROM (' is added in front of my statement and ') AS "T1" LIMIT 1' is added behind it. For example:
SELECT count(*) AS "COUNT" FROM (select * from TABLE1) AS "T1" LIMIT 1
Obviously the SQL error I get is that the Token is unknown. Does anyone have a solution for it?
Not sure if anyone else had this issue - I did, and it is frustrating that Logstash can't put a kill on this count. It would be easy to let the user supply their own count query, preempting the generated one and causing them NO HARM WHATSOEVER. The workaround I found was to limit the query in the statement by ID, so that the count was inherently limited. So basically:
statement => "SELECT * FROM table WHERE id > :sql_last_value AND id < 10000000 ORDER BY id ASC"
Of course, this means you have to babysit it and raise the bound as ingestion passes it, but it's better than having everything just fail.
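That workaround might look roughly like this (a sketch only - the table name, tracking column, and bound are placeholders, assuming a numeric auto-incrementing id):

```
input {
  jdbc {
    # ... driver / connection / schedule settings as usual ...
    use_column_value => true
    tracking_column => "id"            # placeholder column
    tracking_column_type => "numeric"
    # The hard upper bound keeps the generated count(*) cheap;
    # it has to be raised by hand once ingestion passes it.
    statement => "SELECT * FROM my_table WHERE id > :sql_last_value AND id < 10000000 ORDER BY id ASC"
  }
}
```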
I worked around it by using a prepared statement with no params. Since prepared statement does not support paging, the plugin does not issue the extra "count" query.
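For reference, a minimal sketch of that prepared-statement workaround (the table name and statement name here are placeholders, not from the comment above; the option names are the jdbc input's prepared-statement settings):

```
input {
  jdbc {
    # ... driver / connection / schedule settings as usual ...
    use_prepared_statements => true
    prepared_statement_name => "fetch_all_rows"   # placeholder name
    prepared_statement_bind_values => []          # "no params"
    statement => "SELECT * FROM my_table"         # placeholder table
  }
}
```

Because the prepared-statement path does not support paging, the count(*) wrapper is never generated.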
Any update on this? It seems to have been abandoned / ignored, even though so many people complain about it. With large tables, this is indeed a show stopper.
> I worked around it by using a prepared statement with no params. Since prepared statement does not support paging, the plugin does not issue the extra "count" query.
This only works when you have a query that is not expecting data to be used from another input.
My workaround was being forced to rewrite everything in Python - not exactly a Logstash solution.
I have the same issue right now, with a table of 100M rows. The same table indexed with Sphinx was a breeze. Adding a separate statement_count option, where you could write your own SQL query to produce the count, would indeed be the best solution.
This has been fixed here: https://github.com/logstash-plugins/logstash-integration-jdbc/pull/95
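If I understand the linked PR correctly, it adds an explicit paging mode in which you put the :size and :offset placeholders in your own statement and the plugin skips the count(*) pre-query entirely. A hedged sketch (the option name and values assume a plugin version that includes that PR; check the docs for your release):

```
input {
  jdbc {
    # ... driver / connection settings as usual ...
    jdbc_paging_enabled => true
    jdbc_paging_mode => "explicit"   # assumed: no count(*) pre-query
    jdbc_page_size => 100000
    statement => "SELECT * FROM my_table LIMIT :size OFFSET :offset"
  }
}
```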
For complex SQL queries like ours, it's a performance killer; everything takes twice as long.