Open YYM0093 opened 11 months ago
This may be due to the machine running out of memory. To limit memory usage, you can reduce the parallelization of the computation with the "nbProcess" parameter. The computation will be slower, so you can monitor memory usage while it runs and increase the value accordingly.
Example of code using "nbProcess":
datTran = DatasetTransformation(folder, "BML.transform", "GraphFeatures")
datTran.setParams({
    "global": {
        "Name": "WeightedGraphFeatures",
        "Period": 1,
        "nbProcess": 1
    }
})
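For the memory monitoring mentioned above, a minimal sketch using only the Python standard library could look like the following. It is not part of BML: `peak_memory_mib` is a hypothetical helper, and it only sees child processes that have already finished and been waited for, so treat the numbers as a rough guide.

```python
import resource
import sys


def peak_memory_mib():
    """Return (own, children) peak resident memory in MiB.

    `children` is the peak of the largest child process that has
    already terminated and been waited for.
    """
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    own = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    children = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return own / divisor, children / divisor


own_mib, children_mib = peak_memory_mib()
print(f"peak RSS: this process {own_mib:.0f} MiB, largest child {children_mib:.0f} MiB")
```

Calling it after a short trial run with "nbProcess": 1 gives an idea of how much memory one worker needs, which can then guide how far the value can be raised.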
Thank you for your valuable advice. I tried your method and it ran normally. It was a bit slow, but definitely worth it to get the results I wanted!
I'm sorry to bother you again. I added "nbProcess": 1 as you suggested, and after running overnight the program reports the following error messages (it continues to run and is only 1/24th complete). Is this a multi-threading related error, or are 'number_of_cliques' and 'node_clique_number' missing?
Process Process-22:1315:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/jovyan/BML/BML/transform/graph.py", line 144, in runTransforms
data[index] = self.transforms(index, G)
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 436, in transforms
results = self.computeFeatures(G, features_nx, features_nk)
File "/home/jovyan/BML/BML/transform/graph_features.py", line 187, in computeFeatures
return(NodesFeatures.computeFeatures(self, G, features_nx, features_nk))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 422, in computeFeatures
results.update(computeFeaturesParallelized(features_nx, self.params["nbProcessFeatures"], self.logFiles, self.params["verbose"]))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 322, in computeFeaturesParallelized
r_copy[k] = results[k].copy()
File "<string>", line 2, in __getitem__
File "/opt/conda/lib/python3.9/multiprocessing/managers.py", line 825, in _callmethod
raise convert_to_error(kind, result)
KeyError: 'number_of_cliques'
Process Process-22:1317:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/jovyan/BML/BML/transform/graph.py", line 144, in runTransforms
data[index] = self.transforms(index, G)
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 436, in transforms
results = self.computeFeatures(G, features_nx, features_nk)
File "/home/jovyan/BML/BML/transform/graph_features.py", line 187, in computeFeatures
return(NodesFeatures.computeFeatures(self, G, features_nx, features_nk))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 422, in computeFeatures
results.update(computeFeaturesParallelized(features_nx, self.params["nbProcessFeatures"], self.logFiles, self.params["verbose"]))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 322, in computeFeaturesParallelized
r_copy[k] = results[k].copy()
File "<string>", line 2, in __getitem__
File "/opt/conda/lib/python3.9/multiprocessing/managers.py", line 825, in _callmethod
raise convert_to_error(kind, result)
KeyError: 'node_clique_number'
Process Process-22:1319:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/jovyan/BML/BML/transform/graph.py", line 144, in runTransforms
data[index] = self.transforms(index, G)
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 436, in transforms
results = self.computeFeatures(G, features_nx, features_nk)
File "/home/jovyan/BML/BML/transform/graph_features.py", line 187, in computeFeatures
return(NodesFeatures.computeFeatures(self, G, features_nx, features_nk))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 422, in computeFeatures
results.update(computeFeaturesParallelized(features_nx, self.params["nbProcessFeatures"], self.logFiles, self.params["verbose"]))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 322, in computeFeaturesParallelized
r_copy[k] = results[k].copy()
File "<string>", line 2, in __getitem__
File "/opt/conda/lib/python3.9/multiprocessing/managers.py", line 825, in _callmethod
raise convert_to_error(kind, result)
KeyError: 'node_clique_number'
Process Process-22:1653:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/jovyan/BML/BML/transform/graph.py", line 144, in runTransforms
data[index] = self.transforms(index, G)
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 436, in transforms
results = self.computeFeatures(G, features_nx, features_nk)
File "/home/jovyan/BML/BML/transform/graph_features.py", line 187, in computeFeatures
return(NodesFeatures.computeFeatures(self, G, features_nx, features_nk))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 422, in computeFeatures
results.update(computeFeaturesParallelized(features_nx, self.params["nbProcessFeatures"], self.logFiles, self.params["verbose"]))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 322, in computeFeaturesParallelized
r_copy[k] = results[k].copy()
File "<string>", line 2, in __getitem__
File "/opt/conda/lib/python3.9/multiprocessing/managers.py", line 825, in _callmethod
raise convert_to_error(kind, result)
KeyError: 'node_clique_number'
Process Process-22:1655:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/jovyan/BML/BML/transform/graph.py", line 144, in runTransforms
data[index] = self.transforms(index, G)
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 436, in transforms
results = self.computeFeatures(G, features_nx, features_nk)
File "/home/jovyan/BML/BML/transform/graph_features.py", line 187, in computeFeatures
return(NodesFeatures.computeFeatures(self, G, features_nx, features_nk))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 422, in computeFeatures
results.update(computeFeaturesParallelized(features_nx, self.params["nbProcessFeatures"], self.logFiles, self.params["verbose"]))
File "/home/jovyan/BML/BML/transform/nodes_features.py", line 322, in computeFeaturesParallelized
r_copy[k] = results[k].copy()
File "<string>", line 2, in __getitem__
File "/opt/conda/lib/python3.9/multiprocessing/managers.py", line 825, in _callmethod
raise convert_to_error(kind, result)
KeyError: 'node_clique_number'
Process Process-22:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/conda/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/jovyan/BML/BML/transform/dataset_transformation.py", line 31, in transformSample
transform(transformation, primingFile, dataFile, params=params, outFolder=outputfolder, logFiles=logFiles)
File "/home/jovyan/BML/BML/transform/base_transform.py", line 243, in transform
transform.execute()
File "/home/jovyan/BML/BML/transform/base_transform.py", line 174, in execute
self.compute()
File "/home/jovyan/BML/BML/transform/base_transform.py", line 215, in compute
BaseTransform.compute(self)
File "/home/jovyan/BML/BML/transform/base_transform.py", line 106, in compute
self.computeSnapshot(t, routes, updatesParsed)
File "/home/jovyan/BML/BML/transform/graph.py", line 126, in computeSnapshot
self.pq.addProcess(target=self.runTransforms, args=(self.data, i, self.data[i]))
File "<string>", line 2, in __getitem__
File "/opt/conda/lib/python3.9/multiprocessing/managers.py", line 825, in _callmethod
raise convert_to_error(kind, result)
KeyError: 827
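Every traceback above ends in managers.py, where a lookup on a shared dictionary fails. That means the key was never stored in the shared dictionary; why it was not stored is not visible from the log. The sketch below reproduces the same kind of KeyError under the assumption that a worker stopped before writing its result (for example because it crashed or was killed). It is not BML code: `worker`, `collect`, and the "degree" key are hypothetical.

```python
import multiprocessing as mp


def worker(shared, key, fail):
    """Store a result under `key`, unless the worker stops early."""
    if fail:
        return  # stands in for a worker that died before storing its result
    shared[key] = {"value": 1}


def collect(keys, failing):
    """Run one worker per key and report which keys never arrived."""
    # "fork" keeps the sketch runnable without a __main__ guard (POSIX only).
    ctx = mp.get_context("fork")
    with ctx.Manager() as manager:
        shared = manager.dict()
        procs = [ctx.Process(target=worker, args=(shared, k, k in failing)) for k in keys]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        collected, missing = {}, []
        for k in keys:
            try:
                collected[k] = shared[k].copy()  # same access pattern as in the traceback
            except KeyError:
                missing.append(k)
        return collected, missing


collected, missing = collect(["degree", "number_of_cliques"], failing={"number_of_cliques"})
print("missing results:", missing)
```

Catching the KeyError as done here only makes the missing results visible. It does not explain why the worker failed to deliver them.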
Sorry for the delayed answer.
In fact, nbProcess sets the number of graphs that are processed in parallel. However, the computation of the features on a single graph is still parallelized. The nbProcessFeatures parameter can be used to control that. But it shouldn't impact the memory usage...
How much memory do you have?
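As an illustration of the two levels of parallelism, a parameter set limiting both might look like the sketch below. The name "nbProcessFeatures" appears in the traceback (self.params["nbProcessFeatures"]), but placing it under "global" next to "nbProcess" is an assumption, as is the idea that the two levels multiply.

```python
# Hypothetical parameter set: putting "nbProcessFeatures" under "global"
# mirrors "nbProcess" and is an assumption, not confirmed BML usage.
transform_params = {
    "global": {
        "Name": "WeightedGraphFeatures",
        "Period": 1,
        "nbProcess": 1,          # graphs processed in parallel
        "nbProcessFeatures": 1,  # feature computations run in parallel on one graph
    }
}

# Rough upper bound on concurrent workers if the two levels multiply (assumption).
g = transform_params["global"]
max_workers = g["nbProcess"] * g["nbProcessFeatures"]
print("at most", max_workers, "feature worker(s) at a time")

# With BML installed, the dictionary would be passed as in the earlier example:
# datTran.setParams(transform_params)
```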
Sorry, I've only just seen your reply. After I switched to a computer with 60GiB of RAM, the problem was solved, and I've now collected my target dataset without any problems. Thank you very much for your reply and for BML!
Hello, I have successfully collected the data I want and transformed it into statistical features. My aim is to replicate the models from several papers on BGP anomaly detection. They collected the statistical features shown in the figure, but the features I obtained with BML do not seem to contain all of them (e.g. Number of IGP packets, Number of EGP packets, Number of incomplete packets). I checked the BML source code and they do not seem to be collected there either. Can you provide more details on the features collected by BML? I'm new to BGP and may not understand the feature abbreviations in BML well. I apologize for that.
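For context, the three features named here refer to the BGP ORIGIN path attribute, which RFC 4271 defines with the values IGP (0), EGP (1) and INCOMPLETE (2). A minimal sketch of counting them over already-parsed updates could look like this. The record layout (`type`, `origin`) and `count_origins` are hypothetical and do not reflect BML's internal format.

```python
from collections import Counter

# ORIGIN attribute codes defined in RFC 4271
ORIGIN_NAMES = {0: "IGP", 1: "EGP", 2: "INCOMPLETE"}


def count_origins(updates):
    """Count announcements per ORIGIN value in a list of parsed updates."""
    counts = Counter({name: 0 for name in ORIGIN_NAMES.values()})
    for update in updates:
        if update["type"] != "A":
            continue  # only announcements are counted; withdrawals are skipped
        counts[ORIGIN_NAMES[update["origin"]]] += 1
    return dict(counts)


# Hypothetical sample: "A" = announcement, "W" = withdrawal
sample = [
    {"type": "A", "origin": 0},
    {"type": "A", "origin": 0},
    {"type": "A", "origin": 2},
    {"type": "W"},
]
print(count_origins(sample))
```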
Hi, I am having a small problem collecting and transforming data using BML. I modified the code based on your example and it collects the data successfully, but when I run the graph feature transformation it reports the following error, and the .json file ends up empty. Here is a screenshot of my code and the error reported.
"#################
# Data collection

folder = "single_data/TTNet/"
dataset = Dataset(folder)

dataset.setParams({
    "PrimingPeriod": 10*60, # 10 hours of priming data
    "IpVersion": [4], # only IPv4 routes
    "Collectors": ["rrc04", "rrc05"],
    "UseRibsPriming": True
})

dataset.setPeriodsOfInterests([
    {
        "name": "TTNet",
        "label": "anomaly",
        "start_time": utils.getTimestamp(2004, 12, 24, 9, 20, 0) - 6030,
        "end_time": utils.getTimestamp(2004, 12, 24, 9, 20, 0) + 6030,
    },
    {
        "name": "TTNet",
        "label": "no_anomaly_1",
        "start_time": utils.getTimestamp(2004, 12, 24, 9, 20, 0) - 6030 - 243600,
        "end_time": utils.getTimestamp(2004, 12, 24, 9, 20, 0) - 6030,
    },
    {
        "name": "TTNet",
        "label": "no_anomaly_2",
        "start_time": utils.getTimestamp(2004, 12, 24, 9, 20, 0) + 6030,
        "end_time": utils.getTimestamp(2004, 12, 24, 9, 20, 0) + 6030 + 243600,
    },
])

# run the data collection
utils.runJobs(dataset.getJobs(), folder+"collect_jobs", nbProcess=3)

# features extraction every 2 minutes
datTran = DatasetTransformation(folder, "BML.transform", "GraphFeatures")
datTran.setParams({
    "global": {
        "Name": "WeightedGraphFeatures",
        "Period": 1,
    }
})

# run the data transformation
utils.runJobs(datTran.getJobs(), folder+"transform_jobs")
"
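The three periods of interest in this post follow one pattern: an anomaly window centred on the event, plus one normal window directly before and one directly after it. A small sketch that builds the same list from a single event time is shown below. `build_periods` is a hypothetical helper, not part of BML. The offsets are copied as they appear in the post, and the use of UTC epoch seconds for the event time is an assumption.

```python
from datetime import datetime, timezone


def build_periods(name, event_ts, half_window, margin):
    """Return an anomaly window around event_ts and a normal window on each side."""
    start, end = event_ts - half_window, event_ts + half_window
    return [
        {"name": name, "label": "anomaly", "start_time": start, "end_time": end},
        {"name": name, "label": "no_anomaly_1", "start_time": start - margin, "end_time": start},
        {"name": name, "label": "no_anomaly_2", "start_time": end, "end_time": end + margin},
    ]


# Assumption: the event time is expressed as UTC epoch seconds.
event = int(datetime(2004, 12, 24, 9, 20, 0, tzinfo=timezone.utc).timestamp())
periods = build_periods("TTNet", event, half_window=6030, margin=243600)
for p in periods:
    print(p["label"], p["end_time"] - p["start_time"], "seconds")
```

Building the list this way keeps the three windows contiguous by construction, which is easy to get wrong when the offsets are typed out by hand.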