Closed stoyanr closed 4 years ago
@stoyanr: While I see the point of this change and I think your reasoning is completely valid I would like to hijack this PR and turn it into a discussion.
If nothing comes out of it I will merge it as a temporary solution to your problem.
A while ago we made a decision to turn users away from supplying their own set of filters via injecting a massive string into the deployment manifest. It gave us is a much cleaner release and deployment process at the expense of flexibility to configure the parsers.
Since there weren't many users of the BOSH release we added default filters that served our needs.
We are looking at various ideas to make the parsing rules easier to customise but we are not quite sure yet what route to take.
If you have some useful thoughts please share them with us, we are open to suggestions.
CC @mrdavidlaing, @hannayurkevich
CC @axelaris
Hi guys,
I think exclusion can be useful in some cases. But it is not flexible enough.
May be it makes sense to use some kind of filters chain. Lets allow user defining a chain of filters to be included into logstash.conf
in the specified order. So user can choose what default filters to include and also add some custom filtering.
It looks more flexible to me.
Btw, in the changes I noticed a typo:
<% if not p('logstash_parser.excluded_filters').include?
'filters_pre'
%>
cat ${JOB_DIR}/config/
filters_post.conf
>> ${JOB_DIR}/config/logstash.conf
<% end %>
So if you decide to merge the changes please correct it before merging.
Sorry about the typo, it's corrected now. Thanks for noticing!
Hi guys, I do not want to say that I don't like this PR at all. But I have a strong feeling that we will have more mess than we have at the moment. I'd prefer to develop one holistic solution instead of set of promiscuous patches.
In my vision, we should follow these rules:
As for implementation, I guess it should look as an array or hash, where we can have an Id for each element. Please feel free to share your thoughts.
Hi guys,
It's an interesting discussion. I thought how something similar could be achieved and I came up with 2 options. @axelaris, I am not addressing your points 1 or 2 above, as I believe these could be addressed later with a different PR, as far as we keep them in mind.
Option 1
Introduce 2 additional properties default_filters_pre
and default_filters_post
which contain lists of default filters that should be included before and after the custom filters, respectively. This is in fact close to my original PR, except that it's based on inclusion instead of exclusion, and it differentiates between pre and post filters. The spec would look something like:
properties:
logstash_parser.default_filters_pre:
default: [filters_pre.conf, logstash-filters-default.conf]
logstash_parser.default_filters_post:
default: [filters_post.conf, if_it_looks_like_json.conf, timecop.conf]
logstash_parser.filters:
default: ''
Our custom configuration would look something like:
properties:
logstash_parser:
default_filters_pre: [filters_pre.conf]
default_filters_post: [filters_post.conf]
filters:
- |
<%= File.read('templates/filters/filter-default.conf').gsub(/^/, ' ').strip %>
One advantage of this approach is that it's backward compatible with regards to existing configurations.
Option 2
In fact we already have the chain of filters that @hannayurkevich mentioned, it's the filters
property. It can already contain either a list of filter files, or a block of text. We could further generalize this property so that it contains the full list of filters, both default and custom. We would need however to introduce a "type" for each filter so that we could distinguish between at least 3 different cases:
config
)path
)text
)The spec would look something like:
properties:
logstash_parser.filters:
default:
- filter: filters_pre.conf
type: config
- filter: /var/vcap/packages/logsearch-config/logstash-filters-default.conf
type: path
- filter: filters_post.conf
type: config
- filter: /var/vcap/packages/logsearch-config/if_it_looks_like_json.conf
type: path
- filter: /var/vcap/packages/logsearch-config/timecop.conf
type: path
Our custom configuration would look something like:
properties:
logstash_parser:
filters:
- filter: filters_pre.conf
type: config
- filter: |
<%= File.read('templates/filters/filter-default.conf').gsub(/^/, ' ').strip %>
type: text
- filter: filters_post.conf
type: config
This approach has the advantage of being much more flexible, and the disadvantage that it's not backward compatible with regards to existing configurations. If backward compatibility is considered important, it could be achieved by using a name different from filters
for this property and making sure that the original filters
property still works as before.
What do you think?
I like @axelaris's perspective of configuring via inclusion and @stoyanr's option 2 idea. An alternative to requiring a type
property would be to treat the strings like curl/cli file arguments - when starting with @
it's a file reference, and when starting without an absolute forward slash they're a builtin logsearch config. Example...
properties:
logstash_parser:
filters:
# builtin pre
- '@filters_pre.conf'
# manifest-based filter
- <%= File.read('templates/filters/filter-default.conf').gsub(/^/, ' ').strip %>
# co-located filters
- '@/var/vcap/packages/acme-logsearch/production.conf'
# if globs were desirable
- '@/var/vcap/packages/*-logsearch/logstash-filters.conf'
# builtin post
- '@filters_post' # could probably assume .conf extension on builtins if desirable
Although, it looks like logstash_parser.filters
is now a hash or something whose key value isn't used at all which seems weird. So whatever the example actually needs to look like - the single-value @
thing was all I was trying to demonstrate :)
Indeed, "the single value @ thing" proposed by @dpb587 saves some typing. logstash_parser.filters
is then just an array of strings that are treated differently depending on whether they begin with @ or not.
Should I then correct the PR based on Option 2 with this correction, or are there other proposals?
One unrelated, but important question for us - when do you plan to release your next final release? We would like to get rid of our internal fork and after this PR is merged, we only need a final release that contains all our PRs in order to do it.
+1 for configuring via inclusion and @stoyanr's option 2 idea.
I prefer an explicit type param; rather than magic characters; especially since I expect further config options will arise in the future - the logstash roadmap has some form of centralised config on it; so I can see a future with additional config types.
Release 200 has already hopelessly broken backward compatibility; so I think now is a good time to do the same in the interests of a cleaner solution.
I'm not convinced we need a config type - couldn't this also be a path - just to files that that are inside the job rather than the package?
In terms of the final release - hopefully mid March.
@stoyanr - could you amend your PR to implement Option 2.
@cromega - could you review and merge please once Option 2 is done.
I amended the PR to implement Option 2 as originally proposed. I preserved the config
type for now. Yes, config
can also be path
, but using config
those configuration templates could be added by their names only and ${JOB_DIR}
could still be used inside parser_ctl
, as before. Please let me know if it fits your expectations.
Hi everyone,
I'm the Product Owner of @stoyanr's team and I'm interested in the status of this pull request. I find the discussion really interesting and the agreed solution quite good. What do you think? Would this change be part of the next final release? Please share your thoughts/comments.
Cheers, Martina
@stoyanr, @martinagalabova: Sorry for the delay, we are having lots of discussion and making plans about the future and direction of this repo.
We like this implementation. I would like to propose a minor change though. Rename config
to builtin
.
I could do it myself after merging it but I wanted you to be aware of it.
@stoyanr: As a second thought, it would be nice if you changed the yamls under templates/ accordingly. so by running the generate manifest script you get a deployment with a correctly configured logstash.conf. You can exclude the filters you think are unnecessary for a basic configuration.
@stoyanr - thank you for this submission - just noticed that we don't seem to have a signed Contributors License Agreement on record for you.
Please could you print & sign the relevant CLA from https://github.com/logsearch/licensing#signing-the-contributors-license-agreement and forward to david.laing@labs.cityindex.com
Thanks!
D
@cromega, I adapted the PR as you requested.
@mrdavidlaing, I contacted the appropriate people at SAP regarding the CLA. The corporate CLA should be signed by them within a week or so.
@stoyanr - thanks - sorry to have only picked up that we didn't have a CLA from you so late in the merge process :(
@stoyanr - any update on that CLA?
@mrdavidlaing I think our legal department are working on it. They should contact you, if they haven't done so already.
@stoyanr - Thanks for the update - although I haven't heard from your legal department yet. Please can you just double check with them that they are trying to communicate with me via david.laing@labs.cityindex.com (and not david.laing@cityindex.com; which no longer works)
@stoyanr Any update on your CLA? I still don't seem to have received anything from your legal department.
Thanks!
@mrdavidlaing I learned today that our legal counsel Chris Metcalfe had already contacted you. Could you please follow-up with him on that?
@stoyanr - I haven't had any contact with Chris Metcalfe - could you perhaps forward his email address to me at david.laing@lab.cityindex.com (or dlaing@pivotal.io) so I can follow up.
@mrdavidlaing I followed up via email, putting Chris in cc, and mentioned all your email addresses.
@stoyanr, Just FYI; I'm in contact with Chris; who says there is a problem with SAP signing the Logsearch CLA because of its copyright assignment clause.
I'm working with him to see if there is a way around this.
@mrdavidlaing @stoyanr any updates on the topic in the meanwhile?
@voelzmo, Sorry, no
We would like to be able to exclude some of the default logstash filters from the final
logstash.conf
, in order to be able to customize the parser rules completely. In particular, we had to modifylogstash-filters-default
heavily in order to meet the requirements of our stakeholders. We are also not sure yet if we need the 2 newly introduced default filters,if_it_looks_like_json
, andtimecop
. So far we didn't have any issues withfilters_pre
andfilters_post
, but for the sake of completeness, I included them in the exclusion mechanism as well.Having such a mechanism would also allow us to stay independent of future changes to the default filters introduced in this release when adopting newer versions of it, up until the point in time we actually need them.
The change introduces one new parser property
logstash_parser.excluded_filters
which is an array containing the names of default filters that should not be added to the finallogstash.conf
. This property is checked inparser_ctl
before actually adding each of the filters.