Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

TTL for Lookup Table Entries #14574

Closed ChristopherKB closed 1 year ago

ChristopherKB commented 1 year ago

A time-to-live (TTL) feature for lookup table entries would be a very useful addition.

What?

The ability to assign a time to live (a duration setting) to lookup table entries would allow us to automatically delete values that are no longer useful.

Why?

Some data has a useful life span. Things like policy violations on watchlists or entries on threat intelligence feeds often become less valuable or accurate over time. The ability to automatically delete these entries would enable new types of uses for lookup tables.

A specific example would be Indicators of Compromise. An IoC often appears and disappears over time. Since IoC lists are constantly updated, a TTL would allow for the automatic deletion of old entries so that lookup tables don't grow forever. Watchlists in particular would benefit from this feature.

Another example would be a check against a list of users who have changed their password in the past 72 hours. If those users get locked out of their accounts, the presence of their usernames in a lookup table would let analysts treat these lockouts differently from lockouts of users who have not changed their password recently.


mpfz0r commented 1 year ago

@ChristopherKB but you can control that via the cache on the lookup table. Or am I missing something here?

bernd commented 1 year ago

@mpfz0r I think this is about removing entries from a MongoDB data adapter. Watchlists use the MongoDB data adapter to store watchlist entries. I think the request is specific to the MongoDB data adapter because usually, it's not possible to remove something from a data adapter.

ChristopherKB commented 1 year ago

You're both correct. The cache does allow you to expire a cached entry, but you cannot actually remove it from the data adapter, whether in Mongo, CSV or DSV. Adding this capability would both enable new functionality for time-limited conditions and help keep large data sets like Threat Intelligence IoCs from growing out of control with outdated data.

Thanks.


drewmiranda-gl commented 1 year ago

There are definitely some use cases I can think of where this would be super useful.

ryan-carroll-graylog commented 1 year ago

@ChristopherKB @rich-graylog, would we want the lookup table entries to have both optional "Expire after access" and "Expire after write" settings (similar to caches)? And if both are enabled, would the entry be removed at the earliest expiration? Or does just one or the other make more sense?

From an implementation perspective it doesn't really make a difference, so asking what would be more useful.

Additionally, would we want to display to users how long the entry has left to live?
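
To make the question above concrete, here is a minimal sketch of how "Expire after write" and "Expire after access" could combine so that the earliest expiration wins, plus the remaining lifetime that might be shown to users. The function and parameter names are hypothetical and are not taken from the Graylog code base.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_expiry(
    written_at: datetime,
    last_accessed_at: datetime,
    expire_after_write: Optional[timedelta] = None,
    expire_after_access: Optional[timedelta] = None,
) -> Optional[datetime]:
    """Return the earliest applicable expiration, or None if no TTL is set."""
    candidates = []
    if expire_after_write is not None:
        candidates.append(written_at + expire_after_write)
    if expire_after_access is not None:
        candidates.append(last_accessed_at + expire_after_access)
    # With both settings enabled, the entry expires at whichever comes first.
    return min(candidates) if candidates else None


def time_left(expiry: Optional[datetime], now: datetime) -> Optional[timedelta]:
    """Remaining lifetime for display; None means the entry never expires."""
    if expiry is None:
        return None
    # Clamp at zero so an already expired entry never shows a negative value.
    return max(expiry - now, timedelta(0))
```

If only one of the two settings is enabled, the same function simply returns that single expiration, so supporting both does not complicate the single-setting case.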

mpfz0r commented 1 year ago

I'm leaving some feedback here, because I don't think it will be found in testquality..

I'm wondering how we should treat a TTL of 0 in the lookup_assign_ttl function. Wouldn't it make sense to delete the expireAfter field from the entry in this case?

It's a bummer that Mongo cannot set TTLs for single list values. I wonder if we can work around that somehow. Expiring the entire list is not practical, IMO. Maybe we should refactor/extend the lookuptable Parameter query feature to use the results of an entire mongo collection from a data adapter and build an ES query with all the values instead.. :thinking: @kroepke ^^

ryan-carroll-graylog commented 1 year ago

I'm wondering how we should treat a TTL of 0 in the lookup_assign_ttl function. Wouldn't it make sense to delete the expireAfter field from the entry in this case?

I think this makes sense and would be pretty straightforward to do with the current implementation.
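
As an illustration of the proposed behavior, the sketch below treats a TTL of 0 as "remove the expiration". Only the `expireAfter` field name comes from this thread; the `assign_ttl` helper, its signature, and the dict-based entry are assumptions and do not reflect the actual `lookup_assign_ttl` implementation.

```python
from datetime import datetime, timedelta, timezone


def assign_ttl(entry: dict, ttl_seconds: int, now: datetime) -> dict:
    """Return a copy of the entry with its expiration set or cleared.

    Hypothetical helper: a TTL of 0 removes expireAfter, so the entry
    lives forever; a positive TTL sets expireAfter relative to `now`.
    """
    if ttl_seconds < 0:
        raise ValueError("TTL must not be negative")
    updated = dict(entry)  # leave the caller's entry untouched
    if ttl_seconds == 0:
        updated.pop("expireAfter", None)
    else:
        updated["expireAfter"] = now + timedelta(seconds=ttl_seconds)
    return updated
```

Under this reading, 0 never means "expire immediately", which avoids an entry disappearing as soon as it is written.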

ryan-carroll-graylog commented 1 year ago

It's a bummer that Mongo cannot set TTLs for single list values. I wonder if we can work around that somehow. Expiring the entire list is not practical, IMO. Maybe we should refactor/extend the lookuptable Parameter query feature to use the results of an entire mongo collection from a data adapter and build an ES query with all the values instead.. 🤔 @kroepke ^^

I'm not sure I understand what the lookuptable Parameter query feature is or how it's used, but another workaround to get single list value TTLs might be to put the list values in a separate Mongo collection (MongoDBDataAdapterEntryListItem or something) and have each MongoDBDataAdapterEntryListItem document reference a MongoDBDataAdapterEntry ID. Then each MongoDBDataAdapterEntryListItem would be able to have its own expire time?

That would require some non-trivial refactoring (of the UI and backend, as well as a migration probably) but should be doable.
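
A rough in-memory sketch of that data model follows. It is not MongoDB code: the `EntryListItem` class stands in for the suggested MongoDBDataAdapterEntryListItem documents, and all field and function names are hypothetical. With one document per list item, a MongoDB TTL index on the date field could presumably remove items individually, since TTL indexes expire whole documents.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class EntryListItem:
    """One list value stored separately, pointing at its parent entry."""
    entry_id: str                            # ID of the parent entry
    value: str                               # the list value itself
    expire_after: Optional[datetime] = None  # None means no expiration


def live_values(items: Iterable[EntryListItem], entry_id: str, now: datetime) -> List[str]:
    """Collect the unexpired values that belong to one parent entry."""
    return [
        item.value
        for item in items
        if item.entry_id == entry_id
        and (item.expire_after is None or item.expire_after > now)
    ]
```

The lookup then reassembles the list from its items at read time, which is where the extra backend work and the migration mentioned above would come from.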

mpfz0r commented 1 year ago

I'm not sure I understand what the lookuptable Parameter query feature is or how it's used

The idea is to automatically populate mongo lookuptable lists via pipeline rules [1] and the lookup_add_string_list() function. The Parameter can then build ES lucene queries by joining those list entries, e.g., source_ip:$lut_param$ becomes source_ip:("1.2.3.4" OR "5.6.7.8" OR "6.5.4.3"). If those entries could have a TTL, it would make that feature more useful.

But yeah, we could change the lookuptable lists implementation to use a separate collection instead of a key with an array.

[1] I think the idea was to also run the pipelines from the events system, so we could for example flag IPs that have had too many authentication failures or the like. But pipelines from events are not a thing yet.
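
The query expansion described above can be sketched as follows. This is only an illustration of joining list entries into a Lucene clause; the `build_lucene_clause` helper is hypothetical and is not how Graylog's Parameter feature is implemented.

```python
from typing import Sequence


def build_lucene_clause(field: str, values: Sequence[str]) -> str:
    """Join list entries into a single field:("a" OR "b") Lucene clause."""
    if not values:
        raise ValueError("at least one value is required")

    def quote(value: str) -> str:
        # Escape backslashes first, then double quotes, inside the phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    return field + ":(" + " OR ".join(quote(v) for v in values) + ")"


# Mirrors the example from the comment above.
print(build_lucene_clause("source_ip", ["1.2.3.4", "5.6.7.8", "6.5.4.3"]))
```

If list entries carried a TTL, expired values would simply drop out of the input list before the clause is built.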

ryan-carroll-graylog commented 1 year ago

The idea is to automatically populate mongo lookuptable lists via pipeline rules [1] and the lookup_add_string_list() function. The Parameter can then build ES lucene queries by joining those list entries, e.g., source_ip:$lut_param$ becomes source_ip:("1.2.3.4" OR "5.6.7.8" OR "6.5.4.3"). If those entries could have a TTL, it would make that feature more useful.

Ah ok, thank you for the explanation. That does sound really useful and I definitely understand the motivation to expire list items separately.