Closed breml closed 3 years ago
Thanks! I've looked at roughly half the PR and it looks really great. Obviously, with this amount of code I won't be able to scrutinize all the details and find obscure goroutine races etc.
I decided to put all the sub-commands related to the daemon mode below daemon. E.g. logstash-filter-verifier daemon start. The run/execute the test cases in daemon mode, the command is named logstash-filter-verifier daemon run.
Okay. Are you thinking stanealone run
for symmetry or just standalone
since there probably won't be any other standlone-related verbs?
Most of the new Go packages, that contain the "business" logic of the daemon, are located in internal/daemon in order to clearly separate them from the existing standalone code base.
That sounds like a good location.
I am not sure, how to add the dependency check for protobuf-compiler in the Makefile. The protobuf-compiler is required for go generate to work in order to generate the protobuf/grpc related code.
Added an inline comment about this.
Early on, I created some state and sequence diagrams to visualize, how the daemon mode is supposed to work. There have been quite some changes since then and therefore the diagrams are outdated. I might add them later, when things have further stabilized.
That's fine. It'd be good to have them once the dust settles but we don't have to keep them up to date during development.
The README.md is currently completely untouched (also the logstash-filter-verifier standalone part is missing)
Okay!
Thanks! I've looked at roughly half the PR and it looks really great. Obviously, with this amount of code I won't be able to scrutinize all the details and find obscure goroutine races etc.
Thanks for your review, this is very much appreciated. I agree, there are quite some go routines, channels, waitgroups, etc. involved. To find races by just reading the code is very hard. I have some ideas to make this part easier to understand and maintain. I will give it a try.
I decided to put all the sub-commands related to the daemon mode below daemon. E.g. logstash-filter-verifier daemon start. The run/execute the test cases in daemon mode, the command is named logstash-filter-verifier daemon run.
Okay. Are you thinking
stanealone run
for symmetry or juststandalone
since there probably won't be any other standlone-related verbs?
Me personally, I would stick to the standalone
without an additional run
, if you want to keep the rest of the cli api as close to the existing one as possible. It would be a different story, if we would refactor the existing cli api as well, for example by introducing a sub-command for the socket mode.
I fixed some more bugs and added functionality to support keep-envs
also in daemon mode. With the fixes done today, I was able to run our main test suite in daemon mode (> 50 test cases, Logstash Config with > 1400 lines). This is a nice milestone / achievement so far.
I have the latest changes in a separate branch (forked from the branch, this PR is created from: https://github.com/breml/logstash-filter-verifier/tree/add-daemon-mode-continue).
@magnusbaeck Do you prefer if I update this PR with my latest changes or do you prefer separate PR?
I will now try to compile a list of the missing feature from our initial issues to figure out, where to continue.
I fixed some more bugs and added functionality to support keep-envs also in daemon mode. With the fixes done today, I was able to run our main test suite in daemon mode (> 50 test cases, Logstash Config with > 1400 lines). This is a nice milestone / achievement so far.
Very cool! Our Logstash configuration is similar in size. I'll try it with your latest code.
Do you prefer if I update this PR with my latest changes or do you prefer separate PR?
Since Github doesn't support dependencies between PRs I'm not sure there's a point in creating a separate PR. I'll do my best to look at the rest of this PR in the next few days.
@magnusbaeck Just some quick remarks regarding my test:
codec
setting. This feature will be ignored by the version 2.0 (the idea is to inherit the codec from the input that is replaced, so there will be a replacement).exportMetadata: true
for every test case suite file. This can be done with the following snipped:for f in *.json ; do
jq '. += {"exportMetadata": true}' $f > $f.migrated && mv $f.migrated $f
done
I hope, these remarks are of help, if it does not work out of the box with your Logstash config.
One of the above comments was not completely correct, I only used the filter part of the configuration, but I added the following dummy input/output:
input {
stdin {
id => "stdin"
}
}
output {
stdout {
id => "stdout"
}
}
The code looks really great! I just had a few comments and questions.
I tried running it locally (from commit 2b78b2b), but I'm unsure what to expect from
daemon start
. Is that command supposed to block? I can dig out the Logstash log from the tempdir and Logstash starts up fin but the LFV command hangs. Backgrounding the process and runningdaemon run
eventually results in a panic:I'm not sure what to expect so I haven't debugged this any further.
Yes, daemon start
is starting the daemon in the foreground, so this command does sort of "block". You need to open a second window / terminal to execute daemon run ...
.
My logstash-filter-verifier.yml
:
---
loglevel: debug
logstash:
path: ./3rdparty/logstash-7.10.0/bin/logstash
keep-envs:
- TESTMODE
metadata-key: __metadata
So currently test only with logstash 7.10.0. With other versions, there might be unknown issues.
After successful startup, you should see the following output (I have log level on debug in my config):
config: socket: /tmp/logstash-filter-verifier.sock
config: logstash-path: ./3rdparty/logstash-7.10.0/bin/logstash
Temporary directory for daemon created in "/tmp/lfv-407717077"
state change: "created" -> "started" by "start"
start stdout scanner
start stderr scanner
Daemon listening on /tmp/logstash-filter-verifier.sock
Waiting for /tmp/lfv-407717077/logstash-instance/FWm0J0Bl/logstash.log to appear...
stdout: Using bundled JDK: /home/lubr/projects/mustache/logstash-filter-verifier/3rdparty/logstash-7.10.0/jdk
stderr: OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
stderr: WARNING: An illegal reflective access operation has occurred
stderr: WARNING: Illegal reflective access by org.jruby.ext.openssl.SecurityHelper (file:/tmp/jruby-355527/jruby11114804341739218028jopenssl.jar) to field java.security.MessageDigest.provider
stderr: WARNING: Please consider reporting this to the maintainers of org.jruby.ext.openssl.SecurityHelper
stderr: WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
stderr: WARNING: All illegal access operations will be denied in a future release
stdout: Sending Logstash logs to /tmp/lfv-407717077/logstash-instance/FWm0J0Bl which is now configured via log4j2.properties
taillog: -> pipeline started: stdin
taillog: -> pipeline started: output
Ready to process tests
state change: "started" -> "ready" by "pipeline-ready"
taillog: -> pipeline running: [output stdin]
state change: "ready" -> "ready" by "pipeline-ready"
Especially the state should change to ready
. Then the daemon is ready to execute some tests.
I recommend you to start with the examples I provide in the repo, e.g. logstash-filter-verifier daemon run -p testdata/basic_pipeline.yml --pipeline-base testdata/basic_pipeline --testcase-dir testdata/testcases/basic_pipeline
.
If these tests are successful, you may move on to your own config.
When the daemon does timeout or even crash, then there might be a problem with the loading of the logstash config. The logstash log file might give you some hints.
What's the long-term plan here? Automatically add dummy inputs and outputs if none are present or detect the problem and alert the user?
@magnusbaeck That is a valid question. If I remember correctly, we had a similar question some weeks ago. Raise an error to the user, if there are no inputs or outputs, should be a pretty straight forward thing to add. Automatically adding inputs and outputs only works, if the user has a single pipeline (or multiple pipelines, which are connected serially, which is then not different to a single pipeline). If this is not the case, it is not possible to figure out, where to add the inputs and outputs. Therefore I prefer to just warn the user, if there are no inputs or outputs.
Raise an error to the user, if there are no inputs or outputs, should be a pretty straight forward thing to add.
I added the check for inputs and outputs in https://github.com/breml/logstash-filter-verifier/commit/77de4a67e82ae4ba339879b9a9d5245639908b05.
I will try to finish my work on the codec support in the next few days and I will then create a new PR for you to review.
@magnusbaeck I reached a state, where I felt, I should create a first PR for my work on the daemon mode in order to start discussion about the code and package structure. I also added some unit tests as well as some integration tests (some packages have already pretty decent coverage).
Be aware, this PR does include a lot changes, ~ 5000 new lines of code.
There are still a lot of open topics (some listed as
TODO:
orFIXME:
comments, some in my personal notes).Some specific open issues:
daemon
. E.g.logstash-filter-verifier daemon start
. The run/execute the test cases in daemon mode, the command is namedlogstash-filter-verifier daemon run
.internal/daemon
in order to clearly separate them from the existing standalone code base.protobuf-compiler
in theMakefile
. Theprotobuf-compiler
is required forgo generate
to work in order to generate the protobuf/grpc related code.README.md
is currently completely untouched (also thelogstash-filter-verifier standalone
part is missing)I am looking forward to your feedback.
Related issues: #93, #94, #96