Open terrancedejesus opened 1 year ago
We have some considerations with adding this. These fields are currently in the rule meta since they do not matter for the SHA256 hash calculation. As a result, typically anything we add to the hash calculation should be moved into the rule data itself and have validation done on the values.
Options:
1. Add creation_date and updated_date to the API formatted rule object, after it is built and the hash has been calculated.
2. Add creation_date and updated_date to strip_additional_fields and remove them during hash calculation.
3. Move creation_date and updated_date into rule.contents.data and remove them from rule.contents.meta.
Either way, we need to add validation to these field value pairs to keep the date values consistent.
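To make the hash concern concrete, here is a minimal sketch of a hash calculation that ignores the two date fields. It is not the repository's implementation: the function name rule_sha256, the EXCLUDED_FROM_HASH tuple, and the sample rule are assumptions made for illustration only.

```python
import hashlib
import json

# Hypothetical exclusion list; the thread only names these two date fields.
EXCLUDED_FROM_HASH = ("creation_date", "updated_date")


def rule_sha256(api_rule: dict, excluded=EXCLUDED_FROM_HASH) -> str:
    """Hash a rule dict deterministically, ignoring the excluded keys."""
    hashable = {k: v for k, v in api_rule.items() if k not in excluded}
    # sort_keys gives a stable serialization regardless of key order.
    serialized = json.dumps(hashable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Illustrative rule content, not a real rule.
rule = {"rule_id": "example-id", "name": "Example rule", "version": 1}
with_dates = {**rule, "creation_date": "2020/02/18", "updated_date": "2024/01/29"}

# Adding or changing the dates leaves the hash untouched.
same_hash = rule_sha256(rule) == rule_sha256(with_dates)
```

With this shape, a date-only change would not mark a rule as modified, which is the property the first two options are trying to preserve.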
@Mikaayenson @eric-forte-elastic @brokensound77 - Any additional thoughts?
I do not see an issue with this approach/solution :+1:
Just as a note, we will need to update unit tests to also have creation_date and updated_date in the rule.contents.data and update the following line from packaging.py:
    def _package_kibana_index_file(self, save_dir):
        """Convert and save index file with package."""
        sorted_rules = sorted(self.rules, key=lambda k: (k.contents.metadata.creation_date, os.path.basename(k.path)))
None of these should be an issue, as the functions/tests have access to the contents object of a rule, allowing them access to both metadata and data.
@terrancedejesus Can you explain why we want to make the change to move these fields at all (or even add to the build)? Was it requested upstream?
Requested upstream from @jpdjere for UI regarding rule update workflow.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@jpdjere is this still a request from your team? If so, I'd like to get it correctly scoped for one of our upcoming sprints. Thank you!
Hi @terrancedejesus , yes this is still a valid request. We won't be working on anything that needs this data for 8.11 but probably 8.12
Hi @terrancedejesus . Do you have bandwidth for this in any upcoming releases?
@jpdjere - Thanks for the follow-up! I added this to our team's next sprint cycle, which starts Nov 27. With recent adjustments to our current sprint cycle, I will attempt to get started on this to determine whether it is relatively straightforward and, if so, will have it in earlier.
Hello @jpdjere,
Can you provide more context for this request? We are just trying to understand the reasoning and whether this is the best representation of this information.
Right now, since we do not push this with the rules, the dates are pulled from Kibana:
This is reflective of the rules as they apply to a user's stack, which seems accurate and informative.
Our dev cycle creates situations where rules may not be released for a few days or weeks after modification, so there is inconsistency that may cause confusion. More so, I think it may be more valuable understanding when a rule was created and modified within a stack vs when it was developed.
These fields under our metadata are currently used as a means to inform us on changes from a maintenance perspective.
Thoughts?
Hi @brokensound77 . Thanks for the follow up.
The idea behind this request is to give the users an idea of how "recent" the updates to a specific rule are, in order to know how long those specific updates for the rule have been pending. Here's a screenshot of the UI as proposed by our designers: By sorting by the "Last updated" column, the user could have a quick understanding of which rules have been recently updated in the Fleet package and know which rule updates have been pending for a long time, i.e. should take their most immediate attention.
This becomes especially important when the user has neglected addressing updates from one (or many) package releases, and after some considerable amount of time sees in our Rule Update table a list of rule updates corresponding to more than one release.
For example, a user seeing this table in October could see listed 4 rules that have updates coming from a package release made in March (so their updated_by timestamps would maybe be around January, February, March), and 5 more rules that have updates coming from a package released in August (and updated_by timestamps would maybe be around May, June, July). If a rule had updates in both releases, the latest would be displayed, since we always compare to the latest version.
(Sorry if these timelines don't make sense, I don't have in mind right now what your release cadence is).
Having said that:
Our dev cycle creates situations where rules may not be released for a few days or weeks after modification, so there is inconsistency that may cause confusion.
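I think that's a valid concern that can cause confusion to users, giving the false impression that updates have been pending for a long time when the rule updates have just been released. A couple of questions:
1. How usual is this discrepancy between the actual update date of a rule on your side and the release of a package?
2. Do you have an approximate idea of how large the date range can be for rule updates within one package release? What I mean is: if a package release includes 10 updates, what is the earliest and the latest updated_at timestamp that we would see?
(This second question, I think, is not that important considering that the user might accumulate updates from many subsequent releases, but it's good to know).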
@jpdjere - Apologies for the questions going unanswered.
How usual is this discrepancy between the actual update date of a rule on your side and the release of a package?
We release OOB updates bi-weekly. Therefore, the updated date could be 1-14 days apart from the release, depending on when the pull request is merged and when the package reaches EPR. The gap could grow if the release takes longer than expected, but that is a rare occurrence.
Do you have an approximate idea of how large can the date range be for rule updates within one package release? What I mean is: if a package release includes 10 updates, what is the earliest and the latest updated_at timestamp that we would see?
Any time a rule is updated, the updated date value is updated. Therefore it could be any date between when the last package was available in EPR to the date when the next package is available in EPR. Again, we release bi-weekly so there is an approximate range of ~14 different dates that could apply.
We are moving forward with this as it is a requirement upstream for customizing prebuilt detection rules, milestone 3. Below are considerations:
[...] BaseRuleData?
@jpdjere or @banderror - Can you provide insight into the following for us? Thank you in advance!
- What is the target minor release for this and will it backport to previous versions?
- Is there any specific date format that is required?
- Is this only for created_date and updated_date or are the keys named differently?
Option 1 - In this option, we rely on post_dict_conversion to add a new method _convert_add_date_fields(). This method takes the rule metadata and assigns it to the same keys, except inside of the obj dictionary, which is already converted to Kibana API format by to_api_format(). We also add a validate_date_format() method with the validates_schema marshmallow decorator. This will ensure that the date formats are ISO 8601, if we choose to follow this standard. Notes below:
[...] BaseRuleData. Thus these fields would need to be part of the metadata in the export as they typically are. However, I noticed that there are several fields in the first array object of an exported rule that differ from the rule objects we ship.
Commit Reference: https://github.com/elastic/detection-rules/commit/3bc8df6e8db0da0eab483ab26633e391fab18219
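As a rough illustration of Option 1, the sketch below copies the two metadata dates into an already converted API dictionary and rejects values that are not ISO 8601 calendar dates. It uses only the standard library rather than marshmallow, and the function names mirror the ones mentioned above purely for readability; the actual code lives in the referenced commit and may differ.

```python
from datetime import date


def validate_date_format(value: str) -> str:
    """Return value unchanged if it is an ISO 8601 calendar date, else raise ValueError."""
    date.fromisoformat(value)  # raises ValueError for inputs such as "2024/01/29"
    return value


def convert_add_date_fields(obj: dict, metadata: dict) -> dict:
    """Return a copy of the API-format dict with the metadata dates added under the same keys."""
    converted = dict(obj)
    for key in ("creation_date", "updated_date"):
        converted[key] = validate_date_format(metadata[key])
    return converted


# Illustrative values only.
api_obj = {"rule_id": "example-id", "version": 3}
meta = {"creation_date": "2020-02-18", "updated_date": "2024-01-29"}
with_dates = convert_add_date_fields(api_obj, meta)
```

Returning a copy keeps the original API dictionary untouched, so a hash computed before this step stays valid.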
Option 2 - We only add meta to the package release files. Prebuilt rule packaging and artifact building rely mainly on the to_api_format() method in rule.py. We already have an option, include_metadata, that is False by default. If True, it will add the RuleMeta as a dictionary to the rule object that will become the rule asset in the prebuilt rules package. Therefore we can avoid altering any data schemas, backport concerns, etc. Instead, upstream on the Kibana side, they would handle accessing whatever metadata is shipped with the rule. This also allows us to avoid version bumps, since we are not altering the rule contents but adding the metadata to the dropped files instead.
Commit Reference: https://github.com/elastic/detection-rules/commit/0402dc2ea99f187b3a842a08d222d7a30f4a164a
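For readers unfamiliar with the flag, here is a simplified stand-in for how an include_metadata switch could behave. The RuleMeta dataclass and this to_api_format function are reduced, hypothetical versions written for this example; they are not the definitions from rule.py.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RuleMeta:
    """Reduced stand-in for the rule metadata; the real class has more fields."""
    creation_date: str
    updated_date: str
    maturity: str = "production"


def to_api_format(data: dict, metadata: RuleMeta, include_metadata: bool = False) -> dict:
    """Return the API-format rule, optionally with metadata under a 'meta' root key."""
    api_rule = dict(data)
    if include_metadata:
        api_rule["meta"] = asdict(metadata)
    return api_rule


# Illustrative values only.
rule_data = {"rule_id": "example-id", "name": "Example rule", "version": 3}
rule_meta = RuleMeta(creation_date="2020/02/18", updated_date="2024/01/29")

default_asset = to_api_format(rule_data, rule_meta)
asset_with_meta = to_api_format(rule_data, rule_meta, include_metadata=True)
```

Because the rule data itself is unchanged, anything hashed from rule_data alone is unaffected by the flag.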
Option 3 - We move the date fields from RuleMeta and include them in BaseRuleData as requirements. We then need to adjust every rule back to the 8.3 branch, automatically through backporting and manually. The outcome is that date fields will now be in the rule contents of the API formatted rule and be available explicitly upstream to Kibana. Dates would then affect the rule version as well, since any change to rule contents marks the rule as dirty. This option has the largest number of changes and corrections across branches that would need to be addressed.
Option 4 - Similar to option 2, only instead of using metadata, is there any reason why we can't use the date of the release (almost like a build time field)? From the description, it doesn't seem necessary to have the exact date the rule was modified by our rule authors.
@Mikaayenson Great alternative with a few caveats. Our source of truth is typically the repository since that is where we lock versions. Let's say we lock versions and a rule has changes that cause the SHA256 to change. This "state" of the rule is only noticed during the version lock, which is also what we release our commits from. Technically, up until this version lock, our rule could go through several changes and updates, but it is only when we lock versions that we track the current state of the rule. The last updated_date would be in line with this SHA256 change as the exact date when the version change was noticed.
We also have to take into consideration release timing. Releases could take 1-2 days, thus the potential for a divergence of dynamic dates based on building the package could occur not only from the version lock, but also between each package. All packages would have to be released GA on the same day for them to accurately reflect the same updated date.
I concur. If approach 4 is not a heavy lift, I would prefer this as well. While I do not think this is an immediate concern, I think there may be a case where we would not want to include all of the metadata in the release.
Not sure if it would be adding too much overhead, but could we add a release tag when we lock versions and then pull the tagged SHAs of the rules and compare that way?
That being said, I also like option 2 and I do not see any immediate issues with it :+1:
IINM the date is to provide people with a general timeline:
By sorting by the "Last updated" column, the user could have a quick understanding of which rules have been recently updated in the Fleet package and know which rule updates have been pending for a long time, i.e. should take their most immediate attention.
So I'm not sure if we need it to align with rule updates or locked versions etc. It sounded like they just need to know when the package was last updated.
@eric-forte-elastic brought up another good idea in a Slack thread about using release tags for the date information. Recording the idea here for posterity.
@jpdjere or @banderror - Can you provide insight to the following for us. Thank you in advance!
@terrancedejesus, thanks for checking this with us. @jpdjere can keep me honest, but:
- What is the target minor release for this and will it backport to previous versions?
No concrete target minor release is defined at the moment. The Milestone 3 for customizing prebuilt rules is currently at the stage of technical design, the development hasn't started yet. I think you guys can safely expect 2 release cycles from now until we can release anything. It could be more.
When rule customization is ready for release in Kibana, we will not backport this to prior minor versions. We don't backport new features in general.
The earliest version to which we could backport these new fields in the package would be 8.12, because only in 8.12.0 did we make the prebuilt rule schema forward compatible with new package updates on our side (https://github.com/elastic/security-team/issues/6888).
- Is there any specific date format that is required?
The standard one: ISO 8601 date-time string in UTC. Example:
2023-01-29T14:48:00.000Z
- Is this only for created_date and updated_date or are the keys named differently.
Not sure I understand the question.
Naming of the keys hasn't been defined yet on our side and they are not in the schema yet. @jpdjere Could you please create a ticket for us, describe the requirements, and prioritize work on it?
@banderror thank you for taking the time to provide details.
The standard one: ISO 8601 date-time string in UTC. Example:
Our dates are typically YYYY\MM\DD. We can update all rules to use YYYY-MM-DD as it shouldn't affect their version or SHA256 since it is metadata.
Not sure I understand the question.
Right now we have creation_date and updated_date as the key names in each rule's metadata. Do these need to be adjusted to match your key names upstream? We would prefer they stay the same downstream in our repository to reduce mass changes across different branches when we backport.
As you may have seen by TRaDe's discussion, the simplest approach for this would be to include the metadata of the rule as a root key in the JSON rule object. Thus any metadata from the rules can be accessed and used as your team pleases. Are there any objections to this or other preferences?
creation_date - Located in metadata of rule. Describes when the rule was first merged into main of our repository.
updated_date - Located in metadata of rule. Describes the date of the latest update based on changes merged into main of our repository.
availability_date - This is not captured anywhere but would track when the rule update was made available (EPR GA) via typical Elastic rule update workflow. This could be deterministic on Fleet pulls from EPR. Not sure if we want to pursue this but it has come up in previous discussions and amongst TRaDE as we determine what we are attempting to convey to customers with the rule dates.
This is an example of the would be rule object:
@terrancedejesus Thanks for following up on this.
As you may have seen by TRaDe's discussion, the simplest approach for this would be to include the metadata of the rule as a root key in the JSON rule object. Thus any metadata from the rules can be accessed and used as your team pleases. Are there any objections to this or other preferences?
We assessed your proposal of adding the meta property to the rule object, and since we only need the updated_date right now, we would strongly prefer not to "pollute" the object with all the other data included in the meta property. We currently have no use for any other metadata - and probably won't have in the future. Is adding this meta property something that you need for internal tooling/processes on your side? Otherwise, we would prefer to simply add an update_date field as a top level field.
Right now we have creation_date and updated_date as the key names in each rule metadata. Do these need to be adjusted to match your key names upstream?
updated_date is OK for our use. We also have similar updated_at keys in our internal rule object schemas, but those have a different semantic meaning and we want to avoid collisions in key naming.
creation_date - Located in metadata of rule. Describes when the rule was first merged into main of our repository. updated_date - Located in metadata of rule. Describes the date of the latest update based on changes merged into main of our repository. availability_date - This is not captured anywhere [...]
Based on your explanation of the biweekly releases, we decided that we don't need a precise date for the update, but a "ballpark" date. If the difference between updated_date and availability_date is at most 14 days, we have no issues in using whatever value is easier to calculate or pass down to the rule object.
Our dates are typically YYYY\MM\DD. We can update all rules to use YYYY-MM-DD as it shouldn't affect their version or SHA256 since it is metadata.
We would strongly prefer if you could format the date of the top-level update_date field as an ISO 8601 date-time string in UTC, such as: 2023-01-29T14:48:00.000Z. Since Kibana parses dates based on the server's locale, we cannot guarantee that 2021\01\02 will always be parsed as January 2nd, 2021 and not February 1st, 2021 in some other locale. Is this possible on your side?
Also, I'll create a ticket on the Kibana repo for adding this task and link it here.
During the simplified protections sync, Juan mentioned they only need a general availability date, ideally called elastic_last_updated (@jpdjere please let us know if I recalled this name incorrectly). This would be similar to a build time field and placed at the root of the object. The two-week window, if that is how often we release, is general enough for them.
Thanks for the follow-up here @Mikaayenson
Yes, I was originally going for update_date but I think with elastic_last_updated we can have a clearer meaning that this field only refers to updates done by the Elastic team and not by the user, and is thus only valid for Prebuilt Elastic rules. Also, it avoids collisions and confusion with other similar fields that we have in our internal rule objects.
So :+1: from my side for naming the property like that, placing it at the root level of the rule object, and using an ISO 8601 format date.
@Mikaayenson @jpdjere - Thanks for the insight and update.
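Still some lingering questions that are unclear to me:
- Are we no longer wanting a date that represents when the rule was created?
- If we are targeting an "availability" date, is this logic not possible by Kibana when the rule update is first identified when Fleet pulls the package from EPR? The "availability" date of a rule update or new rule, if not represented by the metadata, is only determined based on when package is released.
- If we go down the route of "availability" date, then this date will be redundant across any rule that is new or updated. Is this what we are attempting to achieve.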
@terrancedejesus
Are we no longer wanting a date that represents when the rule was created?
No. As long as the update date matches the creation date when the rule is created, we don't need a separate field for the creation date, as we never need to show the creation date specifically to the user. What I mean is:
// Rule is created
creation_date: 2024-05-21
update_date: 2024-05-21
// Rule is updated the first time
creation_date: 2024-05-21
update_date: 2024-08-12
// Rule is updated the second time
creation_date: 2024-05-21
update_date: 2024-12-26
... and so on
If that's the behaviour of the update_date, then we only need that update_date.
If we are targeting an "availability" date, is this logic not possible by Kibana when the rule update is first identified when Fleet pulls the package from EPR? The "availability" date of a rule update or new rule, if not represented by the metadata, is only determined based on when package is released.
Prebuilt rule assets are installed to the Kibana kibana_security_solution index by Fleet's API; we don't have any additional logic for this installation that we could modify to track when a new rule is first identified. The prebuilt rule asset documents that are indexed into that index by Fleet do have updated_at and created_at fields, but both of them are always the date on which the latest installation or update happened, for all versions of a rule.
This is why we need the elastic_update_date to be part of the rule's data.
If we go down the route of "availability" date, then this date will be redundant across any rule that is new or updated. Is this what we are attempting to achieve.
Not sure I completely understand this point, but I think my previous answer addresses this: we still need that update_date or availability_date as part of the prebuilt rule asset data.
The main point is that every updated rule object will have the same elastic_update_date since we will be using the date the package was built. If the package has 10 rules updated, all ten rules will have the same date.
It's not necessarily a problem, just a note of redundant-looking information across all updated rules. Now if we have another package later with a different set of 10 rule updates, of course the date will be different from the first 10.
Thanks for the explanation.
Now if we have another package later with another different 10 rules updates, of course the date will be different from the first 10.
This is good enough for us. We don't strictly need different dates within one package release, that is, within that 14-day window. As long as we can distinguish between updates coming from different package releases, we are fine.
@jpdjere - Thank you for the deets.
I have begun adding this into the rule asset creation. One thing I noticed is that when we add historical rules to the package, these will not include an elastic_update_date as these assets are pulled from EPR and predate this addition. Thoughts on this?
Also, below is an example of the rule asset with elastic_update_date, does this work? FYI, we are pulling this date from the rule metadata updated_date - just want to confirm this is fine.
Also we added elastic_updated_date to the root of the rule asset. The other keys here are id and type. - Want to confirm this is fine as well.
@terrancedejesus Sorry for the delay in replying.
I have began adding this into the rule asset creation. One thing I noticed is that when we add historical rules to the package, these will not include an elastic_update_date as these assets are pulled from EPR and are before this addition. Thoughts on this?
Yes, we understand this will be the case. We will add elastic_update_date as an optional field within our Prebuilt rule asset schema and our internal rule schema to accommodate the fact that some rules will have this info and others won't.
Also, below is an example of the rule asset with elastic_update_date, does this work? FYI, we are pulling this date from the rule metadata updated_date - just want to confirm this is fine.
elastic_update_date pulled from updated_date works for us :+1:
Also we added elastic_updated_date to the root of the rule asset. The other keys here are id and type. - Want to confirm this is fine as well.
We strongly prefer to have the elastic_updated_date within the attributes field. Our Prebuilt Rule Assets Client pulls only data living within this subfield (renamed security-rule when the prebuilt rule is installed as a prebuilt rule asset in Elasticsearch), and discards the id and type - and all other data in the "root" level.
Would this be fine by you as well? Or would it have side-effects in hashing, etc? Sorry for the confusion in the discussion above, where we talked about "root-level".
@jpdjere - Thanks for the reply!
We strongly prefer to have the elastic_updated_date within the attributes field. Our Prebuilt Rule Assets Client pulls only data living within this subfield (renamed security-rule when the prebuilt rule is installed as a prebuilt rule asset in Elasticsearch), and discards the id and type - and all other data in the "root" level.
No problem with us, easy to adjust and thank you for clarification.
Here would be an updated rule asset, does this work?
@terrancedejesus
Great, thanks a lot!
Yes, that looks good. Just a nit - wanted to make sure that the date format is ISO 8601 with the UTC format; the example above is missing the milliseconds and the Z at the end: 2019-11-14T00:55:31.820Z. That's how the dates are currently formatted for the created_at and updated_at properties in the security-rule assets:
@jpdjere - Thanks for responding.
the example above is missing the miliseconds and the Z at the end
I can add the milliseconds and Z. Note that our updated_date in the rule metadata is not ISO-8601 formatted, so no time is captured and it will remain as 00 for these.
Since we are adding elastic_update_date
to the attributes, is this now a required rule field that was added to the rule schema upstream? We ask because at the moment, our PoC for this loads the TOML file to JSON, then loads it as an object through our rule schema, which is how we validate the rule is valid. Only after this, do we add the elastic_update_date
field when the rule is then converted to a rule asset to avoid version control, backports, breaking changes, etc. - Do we know if a rule is exported this field is exported as well in the rule?
From your image, it looks like the dates are separate from the actual security rule, therefore we are simply providing a way for you to retrieve this date when assets are shipped?
@brokensound77 - Am I missing the point here or questions we discussed?
The milliseconds and Z I can add these. Note that our updated_date in the rule metadata is not ISO-8601 formatted, so no time is captured and it will remain as 00 for these.
That's OK, the information about year, month and day is enough for us. So dates that look like 2019-11-14T00:00:00.000Z are OK.
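For reference, a date-only metadata value can be turned into that shape with the standard library. This is a minimal sketch, assuming the metadata dates are slash-separated (the fmt default below is an assumption; adjust it to the separator actually used in the rule files). The helper name to_iso8601_utc is made up for this example.

```python
from datetime import datetime, timezone


def to_iso8601_utc(date_str: str, fmt: str = "%Y/%m/%d") -> str:
    """Convert a date-only string to an ISO 8601 UTC timestamp at midnight, with milliseconds and Z."""
    parsed = datetime.strptime(date_str, fmt).replace(tzinfo=timezone.utc)
    # isoformat yields "...T00:00:00.000+00:00"; swap the offset for the Z suffix.
    return parsed.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_iso8601_utc("2019/11/14"))  # 2019-11-14T00:00:00.000Z
```

Since no time of day is captured in the metadata, the time component is always midnight UTC.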
Since we are adding elastic_update_date to the attributes, is this now a required rule field that was added to the rule schema upstream?
We will be adding the elastic_update_date as an optional field within the optional prebuilt field for our rule schema. This will be part of the Prebuilt Rule Customization Epic - Milestone 3 we discussed in yesterday's meeting.
Do we know if a rule is exported this field is exported as well in the rule?
Yes, it will, as part of the prebuilt object field.
From your image, it looks like the dates are separate from the actual security rule, therefore we are simply providing a way for you to retrieve this date when assets are shipped?
Those dates you see in the image above are not the update_at and created_at dates from the metadata in the detection-rules package. They are dates that are added to the Elasticsearch saved objects when the Fleet API is called to install the security_detection_engine package with the prebuilt rules, and are always set to the current date when the API is called, which is not useful information for us.
That's why the elastic_update_date should be part of the rule attributes themselves.
@jpdjere - Thank you for providing additional insights.
We will be adding the elastic_update_date as an optional field within the optional prebuilt field for our rule schema. This will be part of the Prebuilt Rule Customization Epic - Milestone 3 we discussed in yesterday's meeting.
With this being said, if it is part of the rule schema, required or not, it is a breaking change for us because of backporting. We will need to change our approach and add this to our rule schema, rather than dynamically populate and push into the rule asset.
@brokensound77 - With this field being optional, I think it would be best to be a build time field, determined from rule metadata that we can only build for the compatible semantic version of the stack the feature is being added to. Regarding backporting, this will cause ALL of our rules to receive version bumps, for each release package. We have done this before, so I can get started on our strategy to implement this. Before I do, any additional thoughts?
Hey @jpdjere @terrancedejesus @Mikaayenson
So there were lots of comments in this thread, and I'd like to double-check that after all these comments we're on the same page. Let me try to restate our agreements and please correct me or add anything.
We're going to add a new optional field elastic_update_date to security-rule assets we ship via the package. Here's an example of this field for the Linux Restricted Shell Breakout via Linux Binary(s) prebuilt rule:
The latest version of this rule, 111, looks like this in the package:
{
  "type": "security-rule",
  "id": "52376a86-ee86-4967-97ae-1a05f55816f0",
  "attributes": {
    "rule_id": "52376a86-ee86-4967-97ae-1a05f55816f0",
    "name": "Linux Restricted Shell Breakout via Linux Binary(s)",
    "description": "Identifies the abuse of a Linux binary to break out of a restricted shell or environment by spawning an interactive system shell. The activity of spawning a shell from a binary is not common behavior for a user or system administrator, and may indicate an attempt to evade detection, increase capabilities or enhance the stability of an adversary.",
    "type": "eql",
    "language": "eql",
    "index": ["logs-endpoint.events.*"],
    // other rule fields...
    "version": 111
  }
}
The next version, 112, should look like this:
{
  "type": "security-rule",
  "id": "52376a86-ee86-4967-97ae-1a05f55816f0",
  "attributes": {
    "rule_id": "52376a86-ee86-4967-97ae-1a05f55816f0",
    "name": "Linux Restricted Shell Breakout via Linux Binary(s)",
    "description": "Identifies the abuse of a Linux binary to break out of a restricted shell or environment by spawning an interactive system shell. The activity of spawning a shell from a binary is not common behavior for a user or system administrator, and may indicate an attempt to evade detection, increase capabilities or enhance the stability of an adversary.",
    "type": "eql",
    "language": "eql",
    "index": ["logs-endpoint.events.*"],
    // other rule fields...
    "version": 112,
    "elastic_update_date": "2024-01-29T00:00:00.000Z"
  }
}
This field will be optional in our rule asset schema in Kibana. The field should be specified for all latest versions of all rules in the next version of the package for Kibana 8.13. The field can be omitted for all existing historical (previous) versions of rules as of today, but should be specified for all historical rule versions created after today in the future. For example, for the Linux Restricted Shell Breakout via Linux Binary(s) rule above, all rule versions >= 112 should include the elastic_update_date field.
The field's value must be formatted in the standard ISO format. Time of the day is not required and can be set to T00:00:00.000Z. We don't have strong requirements for the accuracy of the date itself. It can be the date of file modification by a rule author, the date of PR merge, or the date of building the package. The only requirement is that the values must be monotonically increasing and give a rough understanding to the user when Elastic shipped an update to the rule. +/- a few days would be sufficient accuracy for us.
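The monotonicity requirement could be checked along these lines. This is only a sketch under stated assumptions: each rule version is a plain dict with version and an optional elastic_update_date, all dates share the same ISO 8601 UTC format (so string comparison matches chronological order), and equal dates on consecutive versions are treated as acceptable. The function name dates_are_monotonic is hypothetical.

```python
def dates_are_monotonic(versions: list) -> bool:
    """Check that elastic_update_date never decreases as the rule version increases.

    Versions without the field (historical assets) are skipped.
    """
    dated = sorted(
        (v for v in versions if "elastic_update_date" in v),
        key=lambda v: v["version"],
    )
    dates = [v["elastic_update_date"] for v in dated]
    # Identically formatted ISO 8601 UTC strings sort chronologically.
    return all(earlier <= later for earlier, later in zip(dates, dates[1:]))


# Illustrative history: version 111 predates the field, later versions carry it.
history = [
    {"version": 111},
    {"version": 112, "elastic_update_date": "2024-01-29T00:00:00.000Z"},
    {"version": 113, "elastic_update_date": "2024-02-12T00:00:00.000Z"},
]
history_ok = dates_are_monotonic(history)
```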
The new field must not be backported to any packages compatible with Kibana 8.11.x and below. It can be backported to packages that are only compatible with Kibana 8.12.0 and above, because starting from 8.12.0 we have forward compatibility of the rule asset schema in Kibana: https://github.com/elastic/security-team/issues/6888. This means that in 8.12.x Kibana versions the elastic_update_date field, if specified in the package, will be ignored/omitted until we add support for it. In Kibana versions 8.11.x and below, the elastic_update_date field, if specified in the package, will lead to an error during prebuilt rule installation or upgrade.
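As a rough sketch of the compatibility rule above, a package build could gate the field on the target stack version. The function names, the constant, and the shape of the rule asset dictionary below are assumptions for illustration, not the actual detection-rules code.

```python
from typing import Tuple

# Assumption: 8.12 is the first stack version that tolerates unknown fields,
# per the forward-compatibility note above.
MIN_STACK_FOR_UPDATE_DATE = (8, 12)


def parse_stack_version(version: str) -> Tuple[int, int]:
    """Return (major, minor) from strings such as '8.12', '8.12.0' or '8.11.x'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def add_update_date(rule_asset: dict, stack_version: str, update_date: str) -> dict:
    """Return a copy of the asset, adding the field only on compatible stacks."""
    result = dict(rule_asset)
    if parse_stack_version(stack_version) >= MIN_STACK_FOR_UPDATE_DATE:
        result["elastic_update_date"] = update_date
    return result


if __name__ == "__main__":
    asset = {"rule_id": "52376a86-ee86-4967-97ae-1a05f55816f0", "version": 112}
    print(add_update_date(asset, "8.11", "2024-01-29T00:00:00.000Z"))
    print(add_update_date(asset, "8.12", "2024-01-29T00:00:00.000Z"))
```

Returning a copy rather than mutating the input keeps the same rule object reusable across builds for different stack versions.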
Hey @terrancedejesus @Mikaayenson, last ask from our side: let's please change the name of the field to source_updated_at to make it a little bit more future-proof.
After chatting with @jpdjere we figured we want the name to be resilient to hypothetical future capabilities in Kibana, such as user- or community-created packages with security-rule assets distributed via private/user EPRs or the centralized EPR of Elastic if we ever have support for community-created content.
Tickets for the Rule Management team:
Alright so I did a bit of digging.
rule.py and definitions.py, the rules build successfully: on a >=8.12 branch the field is added accordingly, and on any lower branch the field is not added
BaseRuleData. So dynamically the optional field will only be generated on the respective compatible branches when packages are built.
This gets me to the real problem, which is how we backport and version lock. As I attempt to showcase in the image below, whenever we have a new field that is applied to all rules, optional or not, our versioning strategy does not support it well, because the version is checked per backport branch where the SHA256 hashes are calculated. If these are different, the version bumps by 1. The important part to understand is that, in this example, in 8.11 a rule will not have elastic_update_date dynamically generated with version X. The version lock workflow will then check out 8.12 and run the same workflow, but now the rule will have elastic_update_date and the SHA256 will change, bumping the rule version. The next time we lock versions it will bump twice, as the state of the rule will always be different within (8.3-8.11) vs (8.12+).
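The bumping behavior described above can be reproduced with a deliberately simplified model. It assumes a single locked hash and version per rule that is compared against each branch's view in turn; the real version lock workflow is more involved, and the names rule_sha256 and lock_pass are hypothetical.

```python
import hashlib
import json


def rule_sha256(contents: dict) -> str:
    """Hash the rule contents deterministically (sorted keys)."""
    serialized = json.dumps(contents, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def lock_pass(locked: dict, branch_views: list) -> dict:
    """One simplified lock run: bump the version whenever a branch's hash differs."""
    for contents in branch_views:
        digest = rule_sha256(contents)
        if digest != locked["sha256"]:
            locked = {"sha256": digest, "version": locked["version"] + 1}
    return locked


# The same rule as built on an 8.11 branch (no field) and an 8.12 branch (field added).
rule_on_8_11 = {"rule_id": "example-rule", "type": "eql"}
rule_on_8_12 = {**rule_on_8_11, "elastic_update_date": "2024-01-29T00:00:00.000Z"}

locked = {"sha256": rule_sha256(rule_on_8_11), "version": 5}
after_first = lock_pass(locked, [rule_on_8_11, rule_on_8_12])
after_second = lock_pass(after_first, [rule_on_8_11, rule_on_8_12])

if __name__ == "__main__":
    print(after_first["version"], after_second["version"])
```

In this model the first run bumps the version once (the 8.12 view differs from the lock), and every later run bumps it twice, even though no rule author changed anything.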
The only option at this time would be to min-stack ALL rules to 8.12, so any updates, tunings, and new rules would only go back to 8.12 stacks. That is out of sync with our currently supported stacks (current-3), so this is a breaking change, as @brokensound77 has stated. While we have introduced breaking changes like this before, it seems like a lot of breakage for a timestamp that we could instead supply in metadata when shipping the rule asset, which would avoid breaking our backporting and versioning.
@Mikaayenson DED has an epic or meta somewhere for refactoring Detection Rules. May be worth exploring the schema for version lock file(s) in Detection Rules. I believe there is some resilience that can be added with a couple of options:
Remember that when we build a package per stack version, we build it from that branch specifically so we could align that with its own state of the rules for that branch somehow.
"4d4c35f4-414e-4d0c-bb7e-6db7c80a6957": {
"8.12" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
},
"8.11" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
},
"8.10" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
},
"8.9" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
},
"8.8" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
},
"8.7" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
},
"8.6" : {
"min_stack_version": "8.3",
"rule_name": "Kernel Load or Unload via Kexec Detected",
"sha256": "53f533ffdd9d2d9f7c1a5cba374de00d7db74d814cde9706d3750390086f3c78",
"type": "eql",
"version": 5
@terrancedejesus I'm thinking about how to simplify this for you. Can we do this:
self._convert_add_elastic_last_update_date(obj) only for releasing.
After discussion with @Mikaayenson... there were a couple of options we wanted to explore to hopefully get this in on our end and not be a blocker for @banderror's team.
The final proposal, as shown in the pull request, is to do the following and address each concern:
source_updated_at field to BaseRuleData to match the schema upstream but maintain typical rigor with schemas and our dataclasses
source_updated_at is a valid optional rule field
source_updated_at field is pulled from the rule metadata and converted to the recommended ISO 8601 format
source_updated_at from the JSON
source_updated_at into consideration
definitions.py that ensures the source_updated_at string is the correct ISO 8601 format
SKIP_FIELDS_FOR_SHA256 variable to definitions.py that equates to an array of strings. These strings are then used to remove their respective key:value pairs from the hashed rule content before calculating the hash. Moving forward we can use this for other fields.
NOTE I want to emphasize that we should not always revert to adding new build time fields here. For instance, related_integrations and required_fields have implications upstream that we cannot control, so we need to include these in versioning for potential breaking changes. Thus, while it is an option, it is not meant to be a go-to solution moving forward.
TestBuildTimeFields unit test class
_post_dict_conversion() to be called while loading and building a rule. We have steps in here to call each respective method for the build time fields to add them into the rule asset (dictionary). In these methods we call check_restricted_field_versions(), which checks whether the stack version required by the build time field is compatible with the current stack version referenced in packages.yml. If compatible, proceed to add the build time field; if not, don't add it.
source_updated_at is not taken into consideration for SHA256 calculations with the suggested changes, thus versions will not change across branches, whether the field is added or not
@terrancedejesus @Mikaayenson Copying this from slack:
We haven't worked on adding support for source_updated_at on our side yet. Moreover, I guess it's still unclear if we want to have a top-level field source_updated_at or an object with a field source.updated_at to make it future-proof (potentially, for DaC). @jpdjere should include a proposal for this field's schema into the RFC, the goal is to complete it by the end of this week.
I guess it's not a big difference so this shouldn't block you from working on some implementation of this field, but please hold off merging anything until we approve the proposal on our side and get an approval from your side.
@banderror Feel free to ping when you're ready to pick this back up and we'll try to resource/prioritize it.
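To make the SKIP_FIELDS_FOR_SHA256 part of the final proposal earlier in this thread more concrete, here is a minimal sketch. Only the variable name comes from the proposal; its contents, the helper sha256_without_skipped, and the flat dictionary shape of the rule asset are assumptions, and the code in the pull request may differ.

```python
import hashlib
import json

# Name taken from the proposal; listing only source_updated_at is an assumption.
SKIP_FIELDS_FOR_SHA256 = ["source_updated_at"]


def sha256_without_skipped(rule_asset: dict) -> str:
    """Hash the rule asset after dropping the build time fields listed above."""
    hashable = {
        key: value
        for key, value in rule_asset.items()
        if key not in SKIP_FIELDS_FOR_SHA256
    }
    serialized = json.dumps(hashable, sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    without_field = {"rule_id": "example-rule", "type": "eql"}
    with_field = {**without_field, "source_updated_at": "2024-01-29T00:00:00.000Z"}
    # Both builds hash identically, so the version lock sees no change.
    print(sha256_without_skipped(without_field) == sha256_without_skipped(with_field))
```

Because the skipped field never reaches the hash, a branch that adds it and a branch that omits it produce the same SHA256, which is what keeps versions stable across backport branches.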
Is your feature request related to a problem? Please describe.
No.
Describe the solution you'd like
Add creation_date and updated_date to rule objects when a release package is created.
Additional context
When we build a rules release package, all rule objects should have a creation_date and updated_date field in them. This will be used by Kibana for the updates review workflow.
@jpdjere @approksiu
Dev branch: https://github.com/elastic/detection-rules/tree/fr-add-dates-to-rule-data