ait-aecid / logdata-anomaly-miner

This tool parses log data and lets you define analysis pipelines for anomaly detection. It was designed to run the analysis with limited resources and the lowest possible permissions, making it suitable for use on production servers.
GNU General Public License v3.0

NewMatchPathValueComboDetector: Error in persistence when missing values occur #1314

Closed landauermax closed 2 months ago

landauermax commented 5 months ago

Running the NewMatchPathValueComboDetector with allow_missing_values set to True and then having both normal values and missing values (None) in the data results in an error when persisting:

Traceback (most recent call last):
  File "/usr/bin/aminer", line 794, in <module>
    main()
  File "/usr/bin/aminer", line 479, in main
    run_analysis_child(aminer_config, program_name)
  File "/usr/bin/aminer", line 89, in run_analysis_child
    child_return_status = child.run_analysis(3)
  File "/usr/lib/logdata-anomaly-miner/aminer/AnalysisChild.py", line 459, in run_analysis
    PersistenceUtil.persist_all()
  File "/usr/lib/logdata-anomaly-miner/aminer/util/PersistenceUtil.py", line 86, in persist_all
    component.do_persist()
  File "/usr/lib/logdata-anomaly-miner/aminer/analysis/NewMatchPathValueComboDetector.py", line 159, in do_persist
    PersistenceUtil.store_json(self.persistence_file_name, sorted(list(self.known_values_set)))
TypeError: '<' not supported between instances of 'NoneType' and 'bytes'

I assume this is because known_values_set contains both None and other values, and Python cannot compare None with bytes while sorting. Sorting is not necessary since this is a set, not a list.
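The failure can be reproduced outside of aminer with a few lines. This is only a minimal sketch: the contents of known_values_set are made up, and it assumes the set holds tuples of bytes values with None standing in for a missing value. The helper try_sort is hypothetical and not part of aminer.

```python
# Hypothetical stand-in for known_values_set: tuples of bytes values,
# with None marking a missing value (assumed layout, not taken from aminer).
known_values_set = {(b"user1", b"10.0.0.1"), (b"user1", None)}


def try_sort(values):
    """Return the sorted list, or the TypeError message if sorting fails."""
    try:
        return sorted(values)
    except TypeError as exc:
        return str(exc)


# The first tuple elements are equal, so Python falls through to comparing
# the second elements, i.e. bytes against None, which raises TypeError.
result = try_sort(known_values_set)
print(result)
```

The order of the two type names in the message depends on the iteration order of the set, so it may read 'NoneType' and 'bytes' or the other way round.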

ernstleierzopf commented 4 months ago

Sorting the set is not strictly necessary, but it does no harm either, since sets are unordered anyway. Sorting before persisting gives more reproducible results, which helps with testing. I don't think the performance overhead is a big deal.
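One way to keep the reproducible ordering while tolerating missing values might be a sort key that never compares None with bytes directly. The following is a rough sketch, not the fix that was applied in aminer: none_safe_key is a hypothetical helper, and it assumes every entry is a tuple whose elements are either bytes or None.

```python
def none_safe_key(combo):
    """Build a sort key in which None is never compared against bytes.

    Each element becomes an (is_present, value) pair. Missing values get
    (False, b""), so they sort before any present value in the same position.
    """
    return tuple(
        (value is not None, value if value is not None else b"")
        for value in combo
    )


# Made-up example data (assumed layout: tuples of bytes or None).
known_values_set = {(b"b", None), (b"a", b"x"), (b"a", None)}

ordered = sorted(known_values_set, key=none_safe_key)
print(ordered)
```

With this key the output order is the same on every run regardless of how the set iterates, which preserves the benefit for testing. Whether the persisted format handles None correctly is a separate question that this sketch does not address.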