ProvenanceAnalytics / kairos


Description and meaning of is_include_key_word unclear in the overall context. #4

Open robhta opened 1 year ago

robhta commented 1 year ago

The code includes the function is_include_key_word (see below, with a link), which is used when concatenating individual time windows. I could not find a description of this function in the paper. Have I overlooked it? The function surprised me, and I would like to discuss its purpose and meaning.

According to the code comment, the function should filter out nodes that do not appear in the training/validation data (i.e. noise). This does not match the description in the paper, and which nodes these are cannot be known in advance. The comment also says that nodes which occur frequently in the test data but contribute little to detection should be filtered out. This cannot be known in advance either. Furthermore, keywords such as netflow, var, usr, and cadet occur very frequently.
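To make the concern concrete, here is a minimal sketch of how a substring-based keyword filter can change which shared nodes count as rare. This is not the Kairos implementation: the IDF formula, the 0.9 cutoff, the function names idf and rare_shared_nodes, and all node names are assumptions chosen only for illustration.

```python
import math

def idf(node, history):
    """Assumed IDF: log(N / (1 + df)), with df counted over past time windows."""
    df = sum(1 for window in history if node in window)
    return math.log(len(history) / (1 + df))

def rare_shared_nodes(w1, w2, history, noise_keywords=(), cutoff=0.9):
    """Return nodes shared by two windows whose IDF exceeds log(cutoff * N).

    Nodes whose name contains any noise keyword are skipped before the
    rareness check, no matter how rare they are in the history.
    """
    threshold = math.log(cutoff * len(history))
    rare = set()
    for node in w1 & w2:
        if any(keyword in node for keyword in noise_keywords):
            continue  # treated as noise
        if idf(node, history) > threshold:
            rare.add(node)
    return rare

# Hypothetical data: four past windows and two new windows sharing three nodes.
history = [
    {"/bin/sh", "/usr/lib/libc.so"},
    {"/bin/sh"},
    {"/bin/sh", "/usr/lib/libc.so"},
    {"/bin/sh"},
]
w1 = {"/bin/sh", "/tmp/payload", "/var/tmp/x1"}
w2 = {"/bin/sh", "/tmp/payload", "/var/tmp/x1"}

print(rare_shared_nodes(w1, w2, history))
print(rare_shared_nodes(w1, w2, history, noise_keywords=("var",)))
```

In this toy setup, both unseen nodes count as rare without the filter, while the keyword 'var' removes '/var/tmp/x1' from consideration even though it never occurred in the history. The filter, not the rareness score, decides the outcome for that node.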

The same function also appears in the code for the other datasets, with adjusted keywords in each case.

My experiments show that without any detection (i.e. without considering the anomalousness of nodes), using only rareness and the keyword filter, and with the evaluation threshold lowered from 100 to 20, a result of tn: 169, fp: 6, fn: 0, tp: 4 is possible.

Without the keyword filter, every time window is flagged as anomalous (tn: 0, fp: 175, fn: 0, tp: 4), so no benign time window is classified correctly. (Log outputs are attached below.)

Since these experiments show that the function has a significant influence on detection performance, and no more detailed description is available, I would like to ask how it should be understood in the overall context.

Without detection, threshold 20 instead of 100 (with 100, every time window is classified as negative and nothing is detected):

2023-09-13 10:18:10 - INFO - Anomalous queue: ['2018-04-06_11:03:19.756210028~2018-04-06_11:18:26.126177915.txt', '2018-04-06_11:18:26.126177915~2018-04-06_11:33:35.116170745.txt', '2018-04-06_11:33:35.116170745~2018-04-06_11:48:42.606135188.txt', '2018-04-06_11:48:42.606135188~2018-04-06_12:03:50.186115455.txt', '2018-04-06_12:03:50.186115455~2018-04-06_14:01:32.489584227.txt']
2023-09-13 10:18:10 - INFO - Anomaly score: 25.806802004279533
2023-09-13 10:18:10 - INFO - Anomalous queue: ['2018-04-07_00:00:00.008778912~2018-04-07_00:15:00.638758012.txt', '2018-04-07_00:15:00.638758012~2018-04-07_00:30:00.678739107.txt', '2018-04-07_22:48:54.756943468~2018-04-07_23:03:54.806921896.txt', '2018-04-07_23:03:54.806921896~2018-04-07_23:20:00.056902847.txt', '2018-04-07_23:35:19.036879610~2018-04-07_23:50:19.096860042.txt']
2023-09-13 10:18:10 - INFO - Anomaly score: 28.338620976295744
2023-09-13 10:18:10 - INFO - tn: 169
2023-09-13 10:18:10 - INFO - fp: 6
2023-09-13 10:18:10 - INFO - fn: 0
2023-09-13 10:18:10 - INFO - tp: 4
2023-09-13 10:18:10 - INFO - precision: 0.4
2023-09-13 10:18:10 - INFO - recall: 1.0
2023-09-13 10:18:10 - INFO - fscore: 0.5714285714285715
2023-09-13 10:18:10 - INFO - accuracy: 0.9664804469273743
2023-09-13 10:18:10 - INFO - auc_val: 0.9828571428571429

Without Keyword-Filter:

2023-09-13 12:21:02 - INFO - Anomalous queue: ['2018-04-06_00:00:00.017083676~2018-04-06_00:15:50.177068871.txt', '2018-04-06_00:15:50.177068871~2018-04-06_00:30:58.967045191.txt', '2018-04-06_00:30:58.967045191~2018-04-06_00:47:07.167023337.txt', '2018-04-06_00:47:07.167023337~2018-04-06_01:02:14.677010991.txt', '2018-04-06_01:02:14.677010991~2018-04-06_01:17:19.056982719.txt', '2018-04-06_01:17:19.056982719~2018-04-06_01:32:29.456961759.txt', '2018-04-06_01:32:29.456961759~2018-04-06_01:47:34.436947077.txt', '2018-04-06_01:47:34.436947077~2018-04-06_02:02:43.106920556.txt', '2018-04-06_02:02:43.106920556~2018-04-06_02:17:48.476901849.txt', '2018-04-06_02:17:48.476901849~2018-04-06_02:33:48.406880632.txt', '2018-04-06_02:33:48.406880632~2018-04-06_02:49:07.976859935.txt', '2018-04-06_02:49:07.976859935~2018-04-06_03:05:04.836837911.txt', '2018-04-06_03:05:04.836837911~2018-04-06_03:21:23.736815129.txt', '2018-04-06_03:21:23.736815129~2018-04-06_03:37:30.946794613.txt', '2018-04-06_03:37:30.946794613~2018-04-06_03:52:38.836775414.txt', '2018-04-06_03:52:38.836775414~2018-04-06_04:07:44.916766067.txt', '2018-04-06_04:07:44.916766067~2018-04-06_04:22:56.346731444.txt', '2018-04-06_04:22:56.346731444~2018-04-06_04:38:03.996711152.txt', '2018-04-06_04:38:03.996711152~2018-04-06_04:54:10.956696707.txt', '2018-04-06_04:54:10.956696707~2018-04-06_05:09:19.686673819.txt', '2018-04-06_05:09:19.686673819~2018-04-06_05:24:21.256650907.txt', '2018-04-06_05:24:21.256650907~2018-04-06_05:40:34.436635353.txt', '2018-04-06_05:40:34.436635353~2018-04-06_05:56:42.656608208.txt', '2018-04-06_05:56:42.656608208~2018-04-06_06:11:49.026587588.txt', '2018-04-06_06:11:49.026587588~2018-04-06_06:26:59.116568845.txt', '2018-04-06_06:26:59.116568845~2018-04-06_06:42:05.616549747.txt', '2018-04-06_06:42:05.616549747~2018-04-06_06:57:14.796527090.txt', '2018-04-06_06:57:14.796527090~2018-04-06_07:12:21.486507686.txt', '2018-04-06_07:12:21.486507686~2018-04-06_07:28:29.926483734.txt', 
'2018-04-06_07:28:29.926483734~2018-04-06_07:43:36.186466909.txt', '2018-04-06_07:43:36.186466909~2018-04-06_07:58:45.906442356.txt', '2018-04-06_07:58:45.906442356~2018-04-06_08:14:50.596423582.txt', '2018-04-06_08:14:50.596423582~2018-04-06_08:30:01.506400452.txt', '2018-04-06_08:30:01.506400452~2018-04-06_08:45:17.736385465.txt', '2018-04-06_08:45:17.736385465~2018-04-06_09:01:17.176359680.txt', '2018-04-06_09:01:17.176359680~2018-04-06_09:16:25.066340315.txt', '2018-04-06_09:16:25.066340315~2018-04-06_09:31:25.146320510.txt', '2018-04-06_09:31:25.146320510~2018-04-06_09:46:40.966296805.txt', '2018-04-06_09:46:40.966296805~2018-04-06_10:01:47.476279293.txt', '2018-04-06_10:01:47.476279293~2018-04-06_10:17:55.186257332.txt', '2018-04-06_10:17:55.186257332~2018-04-06_10:33:00.136241644.txt', '2018-04-06_10:33:00.136241644~2018-04-06_10:48:11.796216358.txt', '2018-04-06_10:48:11.796216358~2018-04-06_11:03:19.756210028.txt', '2018-04-06_11:03:19.756210028~2018-04-06_11:18:26.126177915.txt', '2018-04-06_11:18:26.126177915~2018-04-06_11:33:35.116170745.txt', '2018-04-06_11:33:35.116170745~2018-04-06_11:48:42.606135188.txt', '2018-04-06_11:48:42.606135188~2018-04-06_12:03:50.186115455.txt', '2018-04-06_12:03:50.186115455~2018-04-06_14:01:32.489584227.txt', '2018-04-06_14:01:32.489584227~2018-04-06_14:16:39.379560570.txt', '2018-04-06_14:16:39.379560570~2018-04-06_14:31:41.149538564.txt', '2018-04-06_14:31:41.149538564~2018-04-06_14:46:47.869522833.txt', '2018-04-06_14:46:47.869522833~2018-04-06_15:02:04.169502159.txt', '2018-04-06_15:02:04.169502159~2018-04-06_15:17:11.749480935.txt', '2018-04-06_15:17:11.749480935~2018-04-06_15:32:19.019459138.txt', '2018-04-06_15:32:19.019459138~2018-04-06_15:47:25.559439120.txt', '2018-04-06_15:47:25.559439120~2018-04-06_16:02:51.629418719.txt', '2018-04-06_16:02:51.629418719~2018-04-06_16:18:33.979401357.txt', '2018-04-06_16:18:33.979401357~2018-04-06_16:34:47.879379108.txt', 
'2018-04-06_16:34:47.879379108~2018-04-06_16:50:08.479356930.txt', '2018-04-06_16:50:08.479356930~2018-04-06_17:05:29.239333653.txt', '2018-04-06_17:05:29.239333653~2018-04-06_17:20:51.639321683.txt', '2018-04-06_17:20:51.639321683~2018-04-06_17:35:54.449302990.txt', '2018-04-06_17:35:54.449302990~2018-04-06_17:51:58.069270063.txt', '2018-04-06_17:51:58.069270063~2018-04-06_18:08:36.779250443.txt', '2018-04-06_18:08:36.779250443~2018-04-06_18:24:17.369229503.txt', '2018-04-06_18:24:17.369229503~2018-04-06_18:40:52.729207049.txt', '2018-04-06_18:40:52.729207049~2018-04-06_18:55:59.999193978.txt', '2018-04-06_18:55:59.999193978~2018-04-06_19:11:08.459167047.txt', '2018-04-06_19:11:08.459167047~2018-04-06_19:26:14.719145309.txt', '2018-04-06_19:26:14.719145309~2018-04-06_19:41:23.549124701.txt', '2018-04-06_19:41:23.549124701~2018-04-06_19:57:28.259104051.txt', '2018-04-06_19:57:28.259104051~2018-04-06_20:13:39.769084097.txt', '2018-04-06_20:13:39.769084097~2018-04-06_20:28:47.019060830.txt', '2018-04-06_20:28:47.019060830~2018-04-06_20:43:57.369040719.txt', '2018-04-06_20:43:57.369040719~2018-04-06_20:59:02.029030474.txt', '2018-04-06_20:59:02.029030474~2018-04-06_21:15:00.619010803.txt', '2018-04-06_21:15:00.619010803~2018-04-06_21:30:00.778978760.txt', '2018-04-06_21:30:00.778978760~2018-04-06_21:45:25.218957773.txt', '2018-04-06_21:45:25.218957773~2018-04-06_22:01:32.928937945.txt', '2018-04-06_22:01:32.928937945~2018-04-06_22:17:39.688914711.txt', '2018-04-06_22:17:39.688914711~2018-04-06_22:32:48.168897833.txt', '2018-04-06_22:32:48.168897833~2018-04-06_22:48:06.838874079.txt', '2018-04-06_22:48:06.838874079~2018-04-06_23:03:09.188854937.txt', '2018-04-06_23:03:09.188854937~2018-04-06_23:18:10.168840981.txt', '2018-04-06_23:18:10.168840981~2018-04-06_23:33:15.408814807.txt', '2018-04-06_23:33:15.408814807~2018-04-06_23:49:23.948792776.txt']
2023-09-13 12:21:02 - INFO - Anomaly score: 6.024110317589681e+51
2023-09-13 12:21:02 - INFO - Anomalous queue: ['2018-04-07_00:00:00.008778912~2018-04-07_00:15:00.638758012.txt', '2018-04-07_00:15:00.638758012~2018-04-07_00:30:00.678739107.txt', '2018-04-07_00:30:00.678739107~2018-04-07_00:45:48.008718917.txt', '2018-04-07_00:45:48.008718917~2018-04-07_01:01:00.048699594.txt', '2018-04-07_01:01:00.048699594~2018-04-07_01:17:08.118674396.txt', '2018-04-07_01:17:08.118674396~2018-04-07_01:32:15.738656469.txt', '2018-04-07_01:32:15.738656469~2018-04-07_01:48:06.728632339.txt', '2018-04-07_01:48:06.728632339~2018-04-07_02:04:30.508619128.txt', '2018-04-07_02:04:30.508619128~2018-04-07_02:20:37.378590706.txt', '2018-04-07_02:20:37.378590706~2018-04-07_02:36:02.398566241.txt', '2018-04-07_02:36:02.398566241~2018-04-07_02:51:53.698546766.txt', '2018-04-07_02:51:53.698546766~2018-04-07_03:07:00.248525720.txt', '2018-04-07_03:07:00.248525720~2018-04-07_03:23:07.518503381.txt', '2018-04-07_03:23:07.518503381~2018-04-07_03:38:21.038486955.txt', '2018-04-07_03:38:21.038486955~2018-04-07_03:54:12.858461862.txt', '2018-04-07_03:54:12.858461862~2018-04-07_04:09:33.798446888.txt', '2018-04-07_04:09:33.798446888~2018-04-07_04:24:41.418423834.txt', '2018-04-07_04:24:41.418423834~2018-04-07_04:39:48.788404454.txt', '2018-04-07_04:39:48.788404454~2018-04-07_04:55:33.038382167.txt', '2018-04-07_04:55:33.038382167~2018-04-07_05:11:04.578374282.txt', '2018-04-07_05:11:04.578374282~2018-04-07_05:26:27.198342157.txt', '2018-04-07_05:26:27.198342157~2018-04-07_05:43:18.328318581.txt', '2018-04-07_05:43:18.328318581~2018-04-07_05:59:26.278297788.txt', '2018-04-07_05:59:26.278297788~2018-04-07_06:14:36.128277413.txt', '2018-04-07_06:14:36.128277413~2018-04-07_06:29:43.938255177.txt', '2018-04-07_06:29:43.938255177~2018-04-07_06:44:51.448236248.txt', '2018-04-07_06:44:51.448236248~2018-04-07_07:00:00.038221478.txt', '2018-04-07_07:00:00.038221478~2018-04-07_07:15:05.688195960.txt', '2018-04-07_07:15:05.688195960~2018-04-07_07:30:14.418177837.txt', 
'2018-04-07_07:30:14.418177837~2018-04-07_07:45:21.878154598.txt', '2018-04-07_07:45:21.878154598~2018-04-07_08:00:29.578138194.txt', '2018-04-07_08:00:29.578138194~2018-04-07_08:15:57.788117378.txt', '2018-04-07_08:15:57.788117378~2018-04-07_08:32:10.258101102.txt', '2018-04-07_08:32:10.258101102~2018-04-07_08:47:18.038077804.txt', '2018-04-07_08:47:18.038077804~2018-04-07_09:03:01.168052225.txt', '2018-04-07_09:03:01.168052225~2018-04-07_09:19:07.038030612.txt', '2018-04-07_09:19:07.038030612~2018-04-07_09:35:00.558007579.txt', '2018-04-07_09:35:00.558007579~2018-04-07_09:50:24.827995819.txt', '2018-04-07_09:50:24.827995819~2018-04-07_10:06:03.197964940.txt', '2018-04-07_10:06:03.197964940~2018-04-07_10:21:34.747944381.txt', '2018-04-07_10:21:34.747944381~2018-04-07_10:36:44.657924040.txt', '2018-04-07_10:36:44.657924040~2018-04-07_10:51:55.367905007.txt', '2018-04-07_10:51:55.367905007~2018-04-07_11:07:01.987882382.txt', '2018-04-07_11:07:01.987882382~2018-04-07_11:22:08.247863262.txt', '2018-04-07_11:22:08.247863262~2018-04-07_11:37:19.117845222.txt', '2018-04-07_11:37:19.117845222~2018-04-07_11:52:25.757824807.txt', '2018-04-07_11:52:25.757824807~2018-04-07_12:07:33.167805575.txt', '2018-04-07_12:07:33.167805575~2018-04-07_12:22:37.237783089.txt', '2018-04-07_12:22:37.237783089~2018-04-07_12:37:47.867767543.txt', '2018-04-07_12:37:47.867767543~2018-04-07_12:52:55.687743440.txt', '2018-04-07_12:52:55.687743440~2018-04-07_13:08:03.297721130.txt', '2018-04-07_13:08:03.297721130~2018-04-07_13:23:07.587702371.txt', '2018-04-07_13:23:07.587702371~2018-04-07_13:38:18.347682404.txt', '2018-04-07_13:38:18.347682404~2018-04-07_13:53:24.767661961.txt', '2018-04-07_13:53:24.767661961~2018-04-07_14:08:33.897638680.txt', '2018-04-07_14:08:33.897638680~2018-04-07_14:24:26.557620008.txt', '2018-04-07_14:24:26.557620008~2018-04-07_14:39:48.517612046.txt', '2018-04-07_14:39:48.517612046~2018-04-07_14:55:00.017576301.txt', 
'2018-04-07_14:55:00.017576301~2018-04-07_15:10:05.317560451.txt', '2018-04-07_15:10:05.317560451~2018-04-07_15:26:12.217536522.txt', '2018-04-07_15:26:12.217536522~2018-04-07_15:41:21.207514200.txt', '2018-04-07_15:41:21.207514200~2018-04-07_15:56:28.657494782.txt', '2018-04-07_15:56:28.657494782~2018-04-07_16:12:33.137472696.txt', '2018-04-07_16:12:33.137472696~2018-04-07_16:27:43.727457066.txt', '2018-04-07_16:27:43.727457066~2018-04-07_16:42:51.697434133.txt', '2018-04-07_16:42:51.697434133~2018-04-07_16:57:58.897414026.txt', '2018-04-07_16:57:58.897414026~2018-04-07_17:12:59.447391588.txt', '2018-04-07_17:12:59.447391588~2018-04-07_17:28:15.007384019.txt', '2018-04-07_17:28:15.007384019~2018-04-07_17:43:22.657350334.txt', '2018-04-07_17:43:22.657350334~2018-04-07_17:59:27.637330849.txt', '2018-04-07_17:59:27.637330849~2018-04-07_18:14:38.467310967.txt', '2018-04-07_18:14:38.467310967~2018-04-07_18:29:44.957289774.txt', '2018-04-07_18:29:44.957289774~2018-04-07_18:45:53.477266719.txt', '2018-04-07_18:45:53.477266719~2018-04-07_19:01:01.807254480.txt', '2018-04-07_19:01:01.807254480~2018-04-07_19:16:09.237229592.txt', '2018-04-07_19:16:09.237229592~2018-04-07_19:31:16.917209075.txt', '2018-04-07_19:31:16.917209075~2018-04-07_19:46:24.177188164.txt', '2018-04-07_19:46:24.177188164~2018-04-07_20:01:32.217166390.txt', '2018-04-07_20:01:32.217166390~2018-04-07_20:16:39.747146568.txt', '2018-04-07_20:16:39.747146568~2018-04-07_20:31:40.247126462.txt', '2018-04-07_20:31:40.247126462~2018-04-07_20:47:54.987106242.txt', '2018-04-07_20:47:54.987106242~2018-04-07_21:03:02.457083343.txt', '2018-04-07_21:03:02.457083343~2018-04-07_21:18:08.817071709.txt', '2018-04-07_21:18:08.817071709~2018-04-07_21:33:17.447051923.txt', '2018-04-07_21:33:17.447051923~2018-04-07_21:48:25.107022873.txt', '2018-04-07_21:48:25.107022873~2018-04-07_22:03:32.547001547.txt', '2018-04-07_22:03:32.547001547~2018-04-07_22:18:38.656982046.txt', 
'2018-04-07_22:18:38.656982046~2018-04-07_22:33:47.286964532.txt', '2018-04-07_22:33:47.286964532~2018-04-07_22:48:54.756943468.txt', '2018-04-07_22:48:54.756943468~2018-04-07_23:03:54.806921896.txt', '2018-04-07_23:03:54.806921896~2018-04-07_23:20:00.056902847.txt', '2018-04-07_23:20:00.056902847~2018-04-07_23:35:19.036879610.txt', '2018-04-07_23:35:19.036879610~2018-04-07_23:50:19.096860042.txt']
2023-09-13 12:21:02 - INFO - Anomaly score: 2.978702838425254e+56
2023-09-13 12:21:02 - INFO - tn: 0
2023-09-13 12:21:02 - INFO - fp: 175
2023-09-13 12:21:02 - INFO - fn: 0
2023-09-13 12:21:02 - INFO - tp: 4
2023-09-13 12:21:02 - INFO - precision: 0.0223463687150838
2023-09-13 12:21:02 - INFO - recall: 1.0
2023-09-13 12:21:02 - INFO - fscore: 0.04371584699453552
2023-09-13 12:21:02 - INFO - accuracy: 0.0223463687150838
2023-09-13 12:21:02 - INFO - auc_val: 0.5

Without the rareness score (anomalousness only), the metrics are the same and every time window is flagged:

2023-09-13 12:29:34 - INFO - Anomalous queue: ['2018-04-06_00:00:00.017083676~2018-04-06_00:15:50.177068871.txt', '2018-04-06_00:15:50.177068871~2018-04-06_00:30:58.967045191.txt', '2018-04-06_00:30:58.967045191~2018-04-06_00:47:07.167023337.txt', '2018-04-06_00:47:07.167023337~2018-04-06_01:02:14.677010991.txt', '2018-04-06_01:02:14.677010991~2018-04-06_01:17:19.056982719.txt', '2018-04-06_01:17:19.056982719~2018-04-06_01:32:29.456961759.txt', '2018-04-06_01:32:29.456961759~2018-04-06_01:47:34.436947077.txt', '2018-04-06_01:47:34.436947077~2018-04-06_02:02:43.106920556.txt', '2018-04-06_02:02:43.106920556~2018-04-06_02:17:48.476901849.txt', '2018-04-06_02:17:48.476901849~2018-04-06_02:33:48.406880632.txt', '2018-04-06_02:33:48.406880632~2018-04-06_02:49:07.976859935.txt', '2018-04-06_02:49:07.976859935~2018-04-06_03:05:04.836837911.txt', '2018-04-06_03:05:04.836837911~2018-04-06_03:21:23.736815129.txt', '2018-04-06_03:21:23.736815129~2018-04-06_03:37:30.946794613.txt', '2018-04-06_03:37:30.946794613~2018-04-06_03:52:38.836775414.txt', '2018-04-06_03:52:38.836775414~2018-04-06_04:07:44.916766067.txt', '2018-04-06_04:07:44.916766067~2018-04-06_04:22:56.346731444.txt', '2018-04-06_04:22:56.346731444~2018-04-06_04:38:03.996711152.txt', '2018-04-06_04:38:03.996711152~2018-04-06_04:54:10.956696707.txt', '2018-04-06_04:54:10.956696707~2018-04-06_05:09:19.686673819.txt', '2018-04-06_05:09:19.686673819~2018-04-06_05:24:21.256650907.txt', '2018-04-06_05:24:21.256650907~2018-04-06_05:40:34.436635353.txt', '2018-04-06_05:40:34.436635353~2018-04-06_05:56:42.656608208.txt', '2018-04-06_05:56:42.656608208~2018-04-06_06:11:49.026587588.txt', '2018-04-06_06:11:49.026587588~2018-04-06_06:26:59.116568845.txt', '2018-04-06_06:26:59.116568845~2018-04-06_06:42:05.616549747.txt', '2018-04-06_06:42:05.616549747~2018-04-06_06:57:14.796527090.txt', '2018-04-06_06:57:14.796527090~2018-04-06_07:12:21.486507686.txt', '2018-04-06_07:12:21.486507686~2018-04-06_07:28:29.926483734.txt', 
'2018-04-06_07:28:29.926483734~2018-04-06_07:43:36.186466909.txt', '2018-04-06_07:43:36.186466909~2018-04-06_07:58:45.906442356.txt', '2018-04-06_07:58:45.906442356~2018-04-06_08:14:50.596423582.txt', '2018-04-06_08:14:50.596423582~2018-04-06_08:30:01.506400452.txt', '2018-04-06_08:30:01.506400452~2018-04-06_08:45:17.736385465.txt', '2018-04-06_08:45:17.736385465~2018-04-06_09:01:17.176359680.txt', '2018-04-06_09:01:17.176359680~2018-04-06_09:16:25.066340315.txt', '2018-04-06_09:16:25.066340315~2018-04-06_09:31:25.146320510.txt', '2018-04-06_09:31:25.146320510~2018-04-06_09:46:40.966296805.txt', '2018-04-06_09:46:40.966296805~2018-04-06_10:01:47.476279293.txt', '2018-04-06_10:01:47.476279293~2018-04-06_10:17:55.186257332.txt', '2018-04-06_10:17:55.186257332~2018-04-06_10:33:00.136241644.txt', '2018-04-06_10:33:00.136241644~2018-04-06_10:48:11.796216358.txt', '2018-04-06_10:48:11.796216358~2018-04-06_11:03:19.756210028.txt', '2018-04-06_11:03:19.756210028~2018-04-06_11:18:26.126177915.txt', '2018-04-06_11:18:26.126177915~2018-04-06_11:33:35.116170745.txt', '2018-04-06_11:33:35.116170745~2018-04-06_11:48:42.606135188.txt', '2018-04-06_11:48:42.606135188~2018-04-06_12:03:50.186115455.txt', '2018-04-06_12:03:50.186115455~2018-04-06_14:01:32.489584227.txt', '2018-04-06_14:01:32.489584227~2018-04-06_14:16:39.379560570.txt', '2018-04-06_14:16:39.379560570~2018-04-06_14:31:41.149538564.txt', '2018-04-06_14:31:41.149538564~2018-04-06_14:46:47.869522833.txt', '2018-04-06_14:46:47.869522833~2018-04-06_15:02:04.169502159.txt', '2018-04-06_15:02:04.169502159~2018-04-06_15:17:11.749480935.txt', '2018-04-06_15:17:11.749480935~2018-04-06_15:32:19.019459138.txt', '2018-04-06_15:32:19.019459138~2018-04-06_15:47:25.559439120.txt', '2018-04-06_15:47:25.559439120~2018-04-06_16:02:51.629418719.txt', '2018-04-06_16:02:51.629418719~2018-04-06_16:18:33.979401357.txt', '2018-04-06_16:18:33.979401357~2018-04-06_16:34:47.879379108.txt', 
'2018-04-06_16:34:47.879379108~2018-04-06_16:50:08.479356930.txt', '2018-04-06_16:50:08.479356930~2018-04-06_17:05:29.239333653.txt', '2018-04-06_17:05:29.239333653~2018-04-06_17:20:51.639321683.txt', '2018-04-06_17:20:51.639321683~2018-04-06_17:35:54.449302990.txt', '2018-04-06_17:35:54.449302990~2018-04-06_17:51:58.069270063.txt', '2018-04-06_17:51:58.069270063~2018-04-06_18:08:36.779250443.txt', '2018-04-06_18:08:36.779250443~2018-04-06_18:24:17.369229503.txt', '2018-04-06_18:24:17.369229503~2018-04-06_18:40:52.729207049.txt', '2018-04-06_18:40:52.729207049~2018-04-06_18:55:59.999193978.txt', '2018-04-06_18:55:59.999193978~2018-04-06_19:11:08.459167047.txt', '2018-04-06_19:11:08.459167047~2018-04-06_19:26:14.719145309.txt', '2018-04-06_19:26:14.719145309~2018-04-06_19:41:23.549124701.txt', '2018-04-06_19:41:23.549124701~2018-04-06_19:57:28.259104051.txt', '2018-04-06_19:57:28.259104051~2018-04-06_20:13:39.769084097.txt', '2018-04-06_20:13:39.769084097~2018-04-06_20:28:47.019060830.txt', '2018-04-06_20:28:47.019060830~2018-04-06_20:43:57.369040719.txt', '2018-04-06_20:43:57.369040719~2018-04-06_20:59:02.029030474.txt', '2018-04-06_20:59:02.029030474~2018-04-06_21:15:00.619010803.txt', '2018-04-06_21:15:00.619010803~2018-04-06_21:30:00.778978760.txt', '2018-04-06_21:30:00.778978760~2018-04-06_21:45:25.218957773.txt', '2018-04-06_21:45:25.218957773~2018-04-06_22:01:32.928937945.txt', '2018-04-06_22:01:32.928937945~2018-04-06_22:17:39.688914711.txt', '2018-04-06_22:17:39.688914711~2018-04-06_22:32:48.168897833.txt', '2018-04-06_22:32:48.168897833~2018-04-06_22:48:06.838874079.txt', '2018-04-06_22:48:06.838874079~2018-04-06_23:03:09.188854937.txt', '2018-04-06_23:03:09.188854937~2018-04-06_23:18:10.168840981.txt', '2018-04-06_23:18:10.168840981~2018-04-06_23:33:15.408814807.txt', '2018-04-06_23:33:15.408814807~2018-04-06_23:49:23.948792776.txt']
2023-09-13 12:29:34 - INFO - Anomaly score: 6.024110317589681e+51
2023-09-13 12:29:34 - INFO - Anomalous queue: ['2018-04-07_00:00:00.008778912~2018-04-07_00:15:00.638758012.txt', '2018-04-07_00:15:00.638758012~2018-04-07_00:30:00.678739107.txt', '2018-04-07_00:30:00.678739107~2018-04-07_00:45:48.008718917.txt', '2018-04-07_00:45:48.008718917~2018-04-07_01:01:00.048699594.txt', '2018-04-07_01:01:00.048699594~2018-04-07_01:17:08.118674396.txt', '2018-04-07_01:17:08.118674396~2018-04-07_01:32:15.738656469.txt', '2018-04-07_01:32:15.738656469~2018-04-07_01:48:06.728632339.txt', '2018-04-07_01:48:06.728632339~2018-04-07_02:04:30.508619128.txt', '2018-04-07_02:04:30.508619128~2018-04-07_02:20:37.378590706.txt', '2018-04-07_02:20:37.378590706~2018-04-07_02:36:02.398566241.txt', '2018-04-07_02:36:02.398566241~2018-04-07_02:51:53.698546766.txt', '2018-04-07_02:51:53.698546766~2018-04-07_03:07:00.248525720.txt', '2018-04-07_03:07:00.248525720~2018-04-07_03:23:07.518503381.txt', '2018-04-07_03:23:07.518503381~2018-04-07_03:38:21.038486955.txt', '2018-04-07_03:38:21.038486955~2018-04-07_03:54:12.858461862.txt', '2018-04-07_03:54:12.858461862~2018-04-07_04:09:33.798446888.txt', '2018-04-07_04:09:33.798446888~2018-04-07_04:24:41.418423834.txt', '2018-04-07_04:24:41.418423834~2018-04-07_04:39:48.788404454.txt', '2018-04-07_04:39:48.788404454~2018-04-07_04:55:33.038382167.txt', '2018-04-07_04:55:33.038382167~2018-04-07_05:11:04.578374282.txt', '2018-04-07_05:11:04.578374282~2018-04-07_05:26:27.198342157.txt', '2018-04-07_05:26:27.198342157~2018-04-07_05:43:18.328318581.txt', '2018-04-07_05:43:18.328318581~2018-04-07_05:59:26.278297788.txt', '2018-04-07_05:59:26.278297788~2018-04-07_06:14:36.128277413.txt', '2018-04-07_06:14:36.128277413~2018-04-07_06:29:43.938255177.txt', '2018-04-07_06:29:43.938255177~2018-04-07_06:44:51.448236248.txt', '2018-04-07_06:44:51.448236248~2018-04-07_07:00:00.038221478.txt', '2018-04-07_07:00:00.038221478~2018-04-07_07:15:05.688195960.txt', '2018-04-07_07:15:05.688195960~2018-04-07_07:30:14.418177837.txt', 
'2018-04-07_07:30:14.418177837~2018-04-07_07:45:21.878154598.txt', '2018-04-07_07:45:21.878154598~2018-04-07_08:00:29.578138194.txt', '2018-04-07_08:00:29.578138194~2018-04-07_08:15:57.788117378.txt', '2018-04-07_08:15:57.788117378~2018-04-07_08:32:10.258101102.txt', '2018-04-07_08:32:10.258101102~2018-04-07_08:47:18.038077804.txt', '2018-04-07_08:47:18.038077804~2018-04-07_09:03:01.168052225.txt', '2018-04-07_09:03:01.168052225~2018-04-07_09:19:07.038030612.txt', '2018-04-07_09:19:07.038030612~2018-04-07_09:35:00.558007579.txt', '2018-04-07_09:35:00.558007579~2018-04-07_09:50:24.827995819.txt', '2018-04-07_09:50:24.827995819~2018-04-07_10:06:03.197964940.txt', '2018-04-07_10:06:03.197964940~2018-04-07_10:21:34.747944381.txt', '2018-04-07_10:21:34.747944381~2018-04-07_10:36:44.657924040.txt', '2018-04-07_10:36:44.657924040~2018-04-07_10:51:55.367905007.txt', '2018-04-07_10:51:55.367905007~2018-04-07_11:07:01.987882382.txt', '2018-04-07_11:07:01.987882382~2018-04-07_11:22:08.247863262.txt', '2018-04-07_11:22:08.247863262~2018-04-07_11:37:19.117845222.txt', '2018-04-07_11:37:19.117845222~2018-04-07_11:52:25.757824807.txt', '2018-04-07_11:52:25.757824807~2018-04-07_12:07:33.167805575.txt', '2018-04-07_12:07:33.167805575~2018-04-07_12:22:37.237783089.txt', '2018-04-07_12:22:37.237783089~2018-04-07_12:37:47.867767543.txt', '2018-04-07_12:37:47.867767543~2018-04-07_12:52:55.687743440.txt', '2018-04-07_12:52:55.687743440~2018-04-07_13:08:03.297721130.txt', '2018-04-07_13:08:03.297721130~2018-04-07_13:23:07.587702371.txt', '2018-04-07_13:23:07.587702371~2018-04-07_13:38:18.347682404.txt', '2018-04-07_13:38:18.347682404~2018-04-07_13:53:24.767661961.txt', '2018-04-07_13:53:24.767661961~2018-04-07_14:08:33.897638680.txt', '2018-04-07_14:08:33.897638680~2018-04-07_14:24:26.557620008.txt', '2018-04-07_14:24:26.557620008~2018-04-07_14:39:48.517612046.txt', '2018-04-07_14:39:48.517612046~2018-04-07_14:55:00.017576301.txt', 
'2018-04-07_14:55:00.017576301~2018-04-07_15:10:05.317560451.txt', '2018-04-07_15:10:05.317560451~2018-04-07_15:26:12.217536522.txt', '2018-04-07_15:26:12.217536522~2018-04-07_15:41:21.207514200.txt', '2018-04-07_15:41:21.207514200~2018-04-07_15:56:28.657494782.txt', '2018-04-07_15:56:28.657494782~2018-04-07_16:12:33.137472696.txt', '2018-04-07_16:12:33.137472696~2018-04-07_16:27:43.727457066.txt', '2018-04-07_16:27:43.727457066~2018-04-07_16:42:51.697434133.txt', '2018-04-07_16:42:51.697434133~2018-04-07_16:57:58.897414026.txt', '2018-04-07_16:57:58.897414026~2018-04-07_17:12:59.447391588.txt', '2018-04-07_17:12:59.447391588~2018-04-07_17:28:15.007384019.txt', '2018-04-07_17:28:15.007384019~2018-04-07_17:43:22.657350334.txt', '2018-04-07_17:43:22.657350334~2018-04-07_17:59:27.637330849.txt', '2018-04-07_17:59:27.637330849~2018-04-07_18:14:38.467310967.txt', '2018-04-07_18:14:38.467310967~2018-04-07_18:29:44.957289774.txt', '2018-04-07_18:29:44.957289774~2018-04-07_18:45:53.477266719.txt', '2018-04-07_18:45:53.477266719~2018-04-07_19:01:01.807254480.txt', '2018-04-07_19:01:01.807254480~2018-04-07_19:16:09.237229592.txt', '2018-04-07_19:16:09.237229592~2018-04-07_19:31:16.917209075.txt', '2018-04-07_19:31:16.917209075~2018-04-07_19:46:24.177188164.txt', '2018-04-07_19:46:24.177188164~2018-04-07_20:01:32.217166390.txt', '2018-04-07_20:01:32.217166390~2018-04-07_20:16:39.747146568.txt', '2018-04-07_20:16:39.747146568~2018-04-07_20:31:40.247126462.txt', '2018-04-07_20:31:40.247126462~2018-04-07_20:47:54.987106242.txt', '2018-04-07_20:47:54.987106242~2018-04-07_21:03:02.457083343.txt', '2018-04-07_21:03:02.457083343~2018-04-07_21:18:08.817071709.txt', '2018-04-07_21:18:08.817071709~2018-04-07_21:33:17.447051923.txt', '2018-04-07_21:33:17.447051923~2018-04-07_21:48:25.107022873.txt', '2018-04-07_21:48:25.107022873~2018-04-07_22:03:32.547001547.txt', '2018-04-07_22:03:32.547001547~2018-04-07_22:18:38.656982046.txt', 
'2018-04-07_22:18:38.656982046~2018-04-07_22:33:47.286964532.txt', '2018-04-07_22:33:47.286964532~2018-04-07_22:48:54.756943468.txt', '2018-04-07_22:48:54.756943468~2018-04-07_23:03:54.806921896.txt', '2018-04-07_23:03:54.806921896~2018-04-07_23:20:00.056902847.txt', '2018-04-07_23:20:00.056902847~2018-04-07_23:35:19.036879610.txt', '2018-04-07_23:35:19.036879610~2018-04-07_23:50:19.096860042.txt']
2023-09-13 12:29:34 - INFO - Anomaly score: 2.978702838425254e+56
2023-09-13 12:29:34 - INFO - tn: 0
2023-09-13 12:29:34 - INFO - fp: 175
2023-09-13 12:29:34 - INFO - fn: 0
2023-09-13 12:29:34 - INFO - tp: 4
2023-09-13 12:29:34 - INFO - precision: 0.0223463687150838
2023-09-13 12:29:34 - INFO - recall: 1.0
2023-09-13 12:29:34 - INFO - fscore: 0.04371584699453552
2023-09-13 12:29:34 - INFO - accuracy: 0.0223463687150838
2023-09-13 12:29:34 - INFO - auc_val: 0.5
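The precision, recall, F-score, and accuracy values in the logs above can be re-derived from the reported confusion-matrix counts using the standard definitions. The helper name metrics is my own and does not appear in the repository. The auc_val values are not recomputed here because they depend on per-window scores that the logs do not contain.

```python
def metrics(tn, fp, fn, tp):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fscore = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    return {
        "precision": precision,
        "recall": recall,
        "fscore": fscore,
        "accuracy": accuracy,
    }

# Counts taken from the logs: keyword filter on (threshold 20) vs. filter off.
with_filter = metrics(tn=169, fp=6, fn=0, tp=4)
without_filter = metrics(tn=0, fp=175, fn=0, tp=4)

for label, result in [("with filter", with_filter), ("without filter", without_filter)]:
    print(label, result)
```

Recall is 1.0 in both runs, so the whole difference comes from false positives: 6 of 175 benign time windows with the filter versus all 175 without it.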

The function in question:

def is_include_key_word(s):
    # The following common nodes don't exist in the training/validation data, but
    # will have the influences to the construction of anomalous queue (i.e. noise).
    # These nodes frequently exist in the testing data but don't contribute much to
    # the detection (including temporary files or files with random name).
    # Assume the IDF can keep being updated with the new time windows, these
    # common nodes can be filtered out.
    keywords = [
        'netflow',
        '/home/george/Drafts',
        'usr',
        'proc',
        'var',
        'cadet',
        '/var/log/debug.log',
        '/var/log/cron',
        '/home/charles/Drafts',
        '/etc/ssl/cert.pem',
        '/tmp/.31.3022e',
    ]
    flag = False
    for i in keywords:
        if i in s:
            flag = True
    return flag

https://github.com/ProvenanceAnalytics/kairos/blame/0e0b633beb46a1117c0a6d63be5d2481b59ac0dc/DARPA/CADETS_E3/anomalous_queue_construction.py#L93
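Because the function tests `i in s` on strings, each keyword is matched as a substring anywhere in the node name. The sketch below shows how broadly the short keywords match. The keyword list is copied from the function above; the helper matched_keywords and the example node names are hypothetical and only serve as illustration.

```python
# Keyword list copied from the quoted is_include_key_word function.
KEYWORDS = [
    'netflow', '/home/george/Drafts', 'usr', 'proc', 'var', 'cadet',
    '/var/log/debug.log', '/var/log/cron', '/home/charles/Drafts',
    '/etc/ssl/cert.pem', '/tmp/.31.3022e',
]

def matched_keywords(node_name):
    """Return every keyword that occurs as a substring of node_name."""
    return [keyword for keyword in KEYWORDS if keyword in node_name]

# Hypothetical node names, not taken from the dataset.
examples = [
    '/usr/bin/python3',
    '/var/log/cron',
    '/home/admin/process_list.txt',
    '/tmp/upload.bin',
]
for name in examples:
    print(f'{name!r}: {matched_keywords(name)}')
```

Two things follow from plain substring matching. First, 'proc' also matches unrelated names such as 'process_list.txt'. Second, the entries '/var/log/debug.log' and '/var/log/cron' are redundant, since any name containing them already contains 'var'.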

wlynn00 commented 1 year ago

I have the same confusion. It seems that the graph learning module does not play a significant role; rather, the noise filter in the rareness section plays the crucial role.