Open landauermax opened 1 month ago
could you please extend the test by having a line with a new status code (for example as the last one)? The output seems to be fine, because you should get an anomaly for the third line. To verify if there really is a bug, please also post the output of the persisted values for the NMPVD - if that file only contains values for the status codes 200 and 404, then there is no bug.
I added a line with status code 999:
2021-05-17 13:25:14 New path(es) detected
NewMatchPathDetector: "DefaultNewMatchPathDetector" (1 lines)
/accesslog: 127.0.0.1 - - [17/May/2021:11:25:14 +0000] "GET / HTTP/1.1" 200 11229 "-" "Wget/1.20.3 (linux-gnu)"
/accesslog/host: 127.0.0.1
/accesslog/sp0:
/accesslog/ident: -
/accesslog/sp1:
/accesslog/user: -
/accesslog/sp2:
/accesslog/time: 1621250714
/accesslog/sp3: ] "
/accesslog/fm/request: GET / HTTP/1.1
/accesslog/fm/request/method: 0
/accesslog/fm/request/sp5:
/accesslog/fm/request/request: /
/accesslog/fm/request/sp6:
/accesslog/fm/request/version: HTTP/1.1
/accesslog/sp6: "
/accesslog/status: 200
/accesslog/sp7:
/accesslog/size: 11229
/accesslog/combined: "-" "Wget/1.20.3 (linux-gnu)"
/accesslog/combined/combined: "-" "Wget/1.20.3 (linux-gnu)"
/accesslog/combined/combined/sp9: "
/accesslog/combined/combined/referer: -
/accesslog/combined/combined/sp10: " "
/accesslog/combined/combined/user_agent: Wget/1.20.3 (linux-gnu)
/accesslog/combined/combined/sp11: "
['/accesslog', '/accesslog/host', '/accesslog/sp0', '/accesslog/ident', '/accesslog/sp1', '/accesslog/user', '/accesslog/sp2', '/accesslog/time', '/accesslog/sp3', '/accesslog/fm/request', '/accesslog/sp6', '/accesslog/status', '/accesslog/sp7', '/accesslog/size', '/accesslog/combined', '/accesslog/combined/combined', '/accesslog/combined/combined/sp9', '/accesslog/combined/combined/referer', '/accesslog/combined/combined/sp10', '/accesslog/combined/combined/user_agent', '/accesslog/combined/combined/sp11', '/accesslog/fm/request/method', '/accesslog/fm/request/sp5', '/accesslog/fm/request/request', '/accesslog/fm/request/sp6', '/accesslog/fm/request/version']
127.0.0.1 - - [17/May/2021:11:25:14 +0000] "GET / HTTP/1.1" 200 11229 "-" "Wget/1.20.3 (linux-gnu)"
2021-05-17 13:25:14 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 200}
127.0.0.1 - - [17/May/2021:11:25:14 +0000] "GET / HTTP/1.1" 200 11229 "-" "Wget/1.20.3 (linux-gnu)"
2021-05-17 14:21:16 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 404}
127.0.0.1 - - [17/May/2021:12:21:16 +0000] "GET /doesntexist HTTP/1.1" 404 488 "-" "Wget/1.20.3 (linux-gnu)"
2021-05-17 15:23:48 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 300}
127.0.0.1 - - [17/May/2021:13:23:48 +0000] "GET / HTTP/1.1" 300 11229 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0"
2021-05-17 16:21:16 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 999}
127.0.0.1 - - [17/May/2021:14:21:16 +0000] "GET /doesntexist HTTP/1.1" 999 488 "-" "Wget/1.20.3 (linux-gnu)"
Persisted values:
cat /var/lib/aminer/NewMatchPathValueDetector/Default
[200, 300, 404, 999]
The issue is that stop_learning_time does not work on historical data, because it is defined in InputInterfaces.AtomHandlerInterface as self.stop_learning_timestamp = time.time() + stop_learning_time
for each analysis component.
The issue why this can not be defined within each analysis component is that the learning should stop exactly after 3600 seconds, which would only work for the analysis component that has seen the first log atom.
So the feature works as intended, because we do not have any data at the initialization stage of the aminer.
The same bug also happens for the next_persist_time
of each analysis component.
To at least provide a solution for this issue, I introduce the AminerStartTimestamp
config property.
This timestamp also needs to be in the same timezone as the data (by default UTC and in this example also UTC).
If it is None, the value is set to time.time()
(same as it is right now)
The solution is not great, but it can be worked with.
Output after fix:
sudo aminer -c config.yml -C -o
/etc/aminer/conf-enabled/cerberus/__init__.py:13: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
from pkg_resources import get_distribution, DistributionNotFound
/etc/aminer/conf-enabled/cerberus/__init__.py:13: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
from pkg_resources import get_distribution, DistributionNotFound
2021-05-17 13:25:14 New path(es) detected
NewMatchPathDetector: "DefaultNewMatchPathDetector" (1 lines)
/accesslog: 127.0.0.1 - - [17/May/2021:11:25:14 +0000] "GET / HTTP/1.1" 200 11229 "-" "Wget/1.20.3 (linux-gnu)"
/accesslog/host: 127.0.0.1
/accesslog/sp0:
/accesslog/ident: -
/accesslog/sp1:
/accesslog/user: -
/accesslog/sp2:
/accesslog/time: 1621250714
/accesslog/sp3: ] "
/accesslog/fm/request: GET / HTTP/1.1
/accesslog/fm/request/method: 0
/accesslog/fm/request/sp5:
/accesslog/fm/request/request: /
/accesslog/fm/request/sp6:
/accesslog/fm/request/version: HTTP/1.1
/accesslog/sp6: "
/accesslog/status: 200
/accesslog/sp7:
/accesslog/size: 11229
/accesslog/combined: "-" "Wget/1.20.3 (linux-gnu)"
/accesslog/combined/combined: "-" "Wget/1.20.3 (linux-gnu)"
/accesslog/combined/combined/sp9: "
/accesslog/combined/combined/referer: -
/accesslog/combined/combined/sp10: " "
/accesslog/combined/combined/user_agent: Wget/1.20.3 (linux-gnu)
/accesslog/combined/combined/sp11: "
['/accesslog', '/accesslog/host', '/accesslog/sp0', '/accesslog/ident', '/accesslog/sp1', '/accesslog/user', '/accesslog/sp2', '/accesslog/time', '/accesslog/sp3', '/accesslog/fm/request', '/accesslog/sp6', '/accesslog/status', '/accesslog/sp7', '/accesslog/size', '/accesslog/combined', '/accesslog/combined/combined', '/accesslog/combined/combined/sp9', '/accesslog/combined/combined/referer', '/accesslog/combined/combined/sp10', '/accesslog/combined/combined/user_agent', '/accesslog/combined/combined/sp11', '/accesslog/fm/request/method', '/accesslog/fm/request/sp5', '/accesslog/fm/request/request', '/accesslog/fm/request/sp6', '/accesslog/fm/request/version']
127.0.0.1 - - [17/May/2021:11:25:14 +0000] "GET / HTTP/1.1" 200 11229 "-" "Wget/1.20.3 (linux-gnu)"
2021-05-17 13:25:14 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 200}
127.0.0.1 - - [17/May/2021:11:25:14 +0000] "GET / HTTP/1.1" 200 11229 "-" "Wget/1.20.3 (linux-gnu)"
2021-05-17 14:21:16 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 404}
127.0.0.1 - - [17/May/2021:12:21:16 +0000] "GET /doesntexist HTTP/1.1" 404 488 "-" "Wget/1.20.3 (linux-gnu)"
2021-05-17 15:23:48 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 300}
127.0.0.1 - - [17/May/2021:13:23:48 +0000] "GET / HTTP/1.1" 300 11229 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0"
2021-05-17 15:23:55 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 300}
127.0.0.1 - - [17/May/2021:13:23:55 +0000] "GET / HTTP/1.1" 300 11229 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3) Gecko/20090305 Firefox/3.1b3 GTB5"
2021-05-17 16:21:16 New value(s) detected
NewMatchPathValueDetector: "NewMatchPathValueDetector2" (1 lines)
{'/accesslog/status': 999}
127.0.0.1 - - [17/May/2021:14:21:16 +0000] "GET /doesntexist HTTP/1.1" 999 488 "-" "Wget/1.20.3 (linux-gnu)"
sudo cat /var/lib/aminer/log/aminer.log
22/May/2024:23:34:14 +0200 INFO aminer started.
22/May/2024:23:34:14 +0200 DEBUG Called add_time_triggered_component for the component AnalysisChild
22/May/2024:23:34:14 +0200 DEBUG Started with build_analysis_pipeline.
22/May/2024:23:34:14 +0200 DEBUG Registered component SubhandlerFilter with the id 0 and component_name 'AtomFilter'.
22/May/2024:23:34:14 +0200 ERROR [Errno 2] No such file or directory: b'NewMatchPathDetector/Default'
22/May/2024:23:34:14 +0200 DEBUG Called add_time_triggered_component for the component NewMatchPathDetector
22/May/2024:23:34:14 +0200 DEBUG Registered component NewMatchPathDetector with the id 1 and component_name 'DefaultNewMatchPathDetector'.
22/May/2024:23:34:14 +0200 ERROR [Errno 2] No such file or directory: b'NewMatchPathValueDetector/Default'
22/May/2024:23:34:14 +0200 DEBUG Called add_time_triggered_component for the component NewMatchPathValueDetector
22/May/2024:23:34:14 +0200 DEBUG Registered component NewMatchPathValueDetector with the id 2 and component_name 'NewMatchPathValueDetector2'.
22/May/2024:23:34:14 +0200 ERROR [Errno 2] No such file or directory: b'AnalysisChild/RepositioningData'
22/May/2024:23:34:14 +0200 DEBUG Repositioning data was persisted.
22/May/2024:23:34:14 +0200 DEBUG NewMatchPathDetector persisted data.
22/May/2024:23:34:14 +0200 DEBUG NewMatchPathValueDetector persisted data.
22/May/2024:23:34:14 +0200 DEBUG Statistics logs are written..
22/May/2024:23:34:14 +0200 INFO Stopping learning in the NewMatchPathValueDetector.
22/May/2024:23:34:14 +0200 DEBUG NewMatchPathDetector persisted data.
22/May/2024:23:34:14 +0200 DEBUG NewMatchPathValueDetector persisted data.
sudo cat /var/lib/aminer/NewMatchPathValueDetector/Default
[200, 404]
I do not understand why time.time() is used - as discussed in issue https://github.com/ait-aecid/logdata-anomaly-miner/issues/1230 the stop_learning_time should be based on the time from the logs, and not the real time. Also, I think it is fine if the stop_learning_time is set using the first log atom that is processed by any of the analysis components.
Sorry, I don't really know what happened in #1230 - this was a year ago. We both just know that it is not working in the desired way at the moment.
Please help me clarifying the desired functionality: do you want to have one time at which all of the anomaly detectors stop (for example first seen log atom + 3600 seconds) or should each anomaly detector have it's own time when to stop (this means that one detector might stop after a hour and another detector might stop after 2 days).
I was working towards the former one, because the latter one seems too complicated to follow in the real world.
Because the detectors do not have access to other detectors, we needs to define the stopping time beforehand and do not have any timestamp from log data. That's why I would define the mentioned AminerStartTimestamp
config property.
If we implement the second example where the stop time is really managed by the detector, this is no issue for me (besides a lot of refactoring).
If we use the AminerStartTimestamp, we could just as well set the time where the AMiner switches the training mode in absolute timestamps. However, we introduced the stop_learning_time because we did not want to make it necessary to state an absolute timestamp when the AMiner should switch off the training mode - especially when the config is reused on several data sets, this will just be a big source of error.
I would pursue the second case, since this is how it was originally implemented and how the functionality worked after https://github.com/ait-aecid/logdata-anomaly-miner/issues/1230. Since each detector receives every log line (unless one starts using atom filters), it is usually the case that all detectors will switch learn mode at the same time as they all receive the same first line.
Original issue: https://github.com/ait-aecid/logdata-anomaly-miner/issues/831 Fix: https://github.com/ait-aecid/logdata-anomaly-miner/issues/1230
Now this feature is not working anymore.
Set learning to stop after 1 hour:
For the first two lines, learning should be active, then switched off:
However, learning is never switched off (300 should be detected twice as new value):