palin opened this issue 4 years ago
`clean_run` works fine for the first run: it's set properly in the main `register` method of the plugin. The issue is that it's not applied on subsequent runs of the pipeline.

I decided to fix it just by not saving the state of the pipeline when `record_last_run` is false.
Just stumbled upon this issue myself on version 4.3.19. Any plans on working on it?
I've been having a look at the code myself, but `clean_run` and `record_last_run` are somewhat conflicting concepts to me (despite not being exact opposites), and the design intent behind each parameter isn't entirely clear either (which one should take precedence?). I feel I'd need some guidance in order to provide a working solution that is in line with this project's direction.

Any suggestions?
Just conducted some tests, and even with `clean_run => true` and `record_last_run => false`, the value of `:sql_last_value` is updated from the second execution onwards.

I'm guessing this happens because, internally, the plugin is still storing the last value of the `tracking_column` and never resets it except on the first execution, when it has no associated value yet and defaults to 0.
So far I haven't found any combination of parameters that, with the current state of things, achieves what the `clean_run` parameter claims to do. Even setting `clean_run => true` and `record_last_run => false`, deleting any existing last-run metadata file, and restarting the plugin doesn't work; again, from the second execution onwards `:sql_last_value` still gets updated.

The only workaround that has worked for me so far is to not use `:sql_last_value` at all in queries that need this behavior.
@adeharo9 I've never found any combination of parameters to get around this issue. What I did is just patch the plugin's code: we use a modified version of it, in which I added this line:

```ruby
@value_tracker.set_value(sql_last_value) if @record_last_run
```

This way, if `record_last_run = false`, the `sql_last_value` will not be stored, and your pipeline will always start from 0 or `'1970-01-01'`.
https://github.com/logstash-plugins/logstash-integration-jdbc/issues/121#issuecomment-2042584611

You need to use:

- `clean_run => true`
- `record_last_run => true`
- `last_run_metadata_path => "/..."`

I had to dig through the code to figure this out. But since the plugin instantiates that class (`ValueTracking`) once, it will keep the `sql_last_value` in memory unless Logstash is restarted before each pipeline run...
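In Logstash pipeline-config terms, that combination of settings would look roughly like this. The path and the omitted connection settings are placeholders, not values from this thread:

```conf
input {
  jdbc {
    # ... jdbc_connection_string, jdbc_user, statement, etc. ...
    clean_run              => true
    record_last_run        => true
    # placeholder path: must be a writable file location
    last_run_metadata_path => "/usr/share/logstash/.logstash_jdbc_last_run"
  }
}
```

Note the caveat above still applies: because the tracker object lives for the lifetime of the Logstash process, this only gives a clean start when the process itself is restarted.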
UPDATE https://github.com/logstash-plugins/logstash-integration-jdbc/issues/121#issuecomment-2045396133
Using jdbc plugin 4.3.13. Using Logstash in Docker: `docker.elastic.co/logstash/logstash:7.1.1`.
In theory, the `clean_run` setting of the jdbc plugin decides whether the previous run state is ignored. Again, in theory, if we set `clean_run: true`, the `:sql_last_value` should become `'1970-01-01 00:00:00'`.
However, in practice, `clean_run: true` in the example below still causes the pipeline to keep the last-run state in the `:sql_last_value` parameter (the date of the last run, in our case).

`:sql_last_value` is set to `'1970-01-01 00:00:00'` only when the pipeline starts for the first time; the 2nd and later runs don't start from `'1970-01-01 00:00:00'`, and the SQL query still contains the value of the last run.
Logs:
1st run is OK:
2nd run NOT ok:
3rd run and any later on NOT ok:
You can reproduce the behaviour with this input:
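A jdbc input of roughly this shape shows the same behaviour; the connection details, query, and schedule below are illustrative placeholders, not the reporter's original config:

```conf
input {
  jdbc {
    # placeholders: point these at any table with a timestamp column
    jdbc_connection_string => "jdbc:postgresql://db:5432/mydb"
    jdbc_user              => "user"
    jdbc_driver_class      => "org.postgresql.Driver"
    schedule               => "* * * * *"
    statement              => "SELECT * FROM events WHERE updated_at > :sql_last_value"
    use_column_value       => true
    tracking_column        => "updated_at"
    tracking_column_type   => "timestamp"
    clean_run              => true
  }
}
```

On the first scheduled execution `:sql_last_value` is the epoch; on later executions the query runs with the last tracked value, despite `clean_run => true`.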
In order to start the pipeline from "0" (`'1970-01-01 00:00:00'`), we need to redeploy Logstash.

Could anyone explain this to me? Either I'm missing something, or the plugin doesn't work as expected, i.e. a bug. From my understanding, `clean_run: true` should cause the `:sql_last_value` param to be "zero" each time the pipeline starts, but in the example above it keeps the state...