C++ implementation of
The old implementation is in another branch, OldImplementation. It should be considered archived and is unlikely to receive feature updates.
If you use Windows:
cd to the project root MIDAS/
cmake -DCMAKE_BUILD_TYPE=Release -GNinja -S . -B build/release
cmake --build build/release --target Demo
cd to MIDAS/build/release/
.\Demo.exe
If you use Linux/macOS:
cd to the project root MIDAS/
cmake -DCMAKE_BUILD_TYPE=Release -S . -B build/release
cmake --build build/release --target Demo
cd to MIDAS/build/release/
./Demo
The demo runs on MIDAS/data/DARPA/darpa_processed.csv, which has 4.5M records, with the filtering core (MIDAS-F).
The scores will be exported to MIDAS/temp/Score.txt; a higher score means more anomalous.
All file paths are absolute and "hardcoded" by CMake, but running the executable by double-clicking it is NOT recommended.
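As a rough illustration of how the exported scores might be inspected, the sketch below reads scores from a stream and returns the indices of the k highest ones. It assumes the score file holds one floating-point score per line, in record order; the helper names `readScores` and `topK` are hypothetical and not part of this repository.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <numeric>
#include <sstream>
#include <vector>

// Read whitespace-separated floating-point scores (assumed: one per line).
std::vector<double> readScores(std::istream& in) {
    std::vector<double> scores;
    double value = 0.0;
    while (in >> value) scores.push_back(value);
    return scores;
}

// Return the indices of the k highest scores, highest first.
// Higher score = more anomalous, so these are the most suspicious records.
std::vector<std::size_t> topK(const std::vector<double>& scores, std::size_t k) {
    k = std::min(k, scores.size());
    assert(k <= scores.size());
    std::vector<std::size_t> order(scores.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          return scores[a] > scores[b];
                      });
    order.resize(k);
    return order;
}
```

To use it on the demo output, open MIDAS/temp/Score.txt with a `std::ifstream` and pass it to `readScores`. The order of records with exactly equal scores is unspecified.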
Core
Demo (if experimental ROC-AUC impl)
Demo (if sklearn ROC-AUC impl)
MIDAS/util/EvaluateScore.py
pandas: I/O
scikit-learn: Compute ROC-AUC
Experiment
Other python utility scripts
pandas
scikit-learn
sklearn ROC-AUC Implementation
In MIDAS/example/Demo.cpp:
Comment out section "Evaluate scores (experimental)".
Uncomment sections "Write output scores" and "Evaluate scores".
The cores' parameters are arguments of their constructors, which are at MIDAS/example/Demo.cpp:67-69.
Cores are instantiated at MIDAS/example/Demo.cpp:67-69; uncomment the one you want to use.
Demo.cpp
You need to prepare three files:
pathMeta, e.g. MIDAS/data/DARPA/darpa_shape.txt: contains N, the number of records in the dataset.
pathData, e.g. MIDAS/data/DARPA/darpa_processed.csv: has shape [N,3].
pathGroundTruth, e.g. MIDAS/data/DARPA/darpa_ground_truth.csv: has shape [N,1].
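To make the three-file layout concrete, here is a minimal sketch of loading such files and checking that their shapes agree. It assumes the data and ground-truth files are header-less, comma-separated, and integer-valued, and that the three data columns are source, destination, and timestamp. The `Record` struct and the `readMeta`, `readData`, `readLabels`, and `shapesMatch` helpers are hypothetical names, not the loaders used in Demo.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One row of the [N,3] data file. The column meaning is an assumption.
struct Record {
    int source;
    int destination;
    int timestamp;
};

// Meta file: a single integer N, the number of records.
std::size_t readMeta(std::istream& in) {
    std::size_t n = 0;
    if (!(in >> n)) throw std::runtime_error("meta file does not contain N");
    return n;
}

// Data file: up to n lines, each holding three comma-separated integers.
std::vector<Record> readData(std::istream& in, std::size_t n) {
    std::vector<Record> records;
    records.reserve(n);
    std::string line;
    while (records.size() < n && std::getline(in, line)) {
        std::istringstream fields(line);
        Record r{};
        char comma1 = 0;
        char comma2 = 0;
        if (!(fields >> r.source >> comma1 >> r.destination >> comma2 >> r.timestamp) ||
            comma1 != ',' || comma2 != ',') {
            throw std::runtime_error("malformed data line: " + line);
        }
        records.push_back(r);
    }
    assert(records.size() <= n);
    return records;
}

// Ground-truth file: up to n integers, one label per record.
std::vector<int> readLabels(std::istream& in, std::size_t n) {
    std::vector<int> labels;
    labels.reserve(n);
    int value = 0;
    while (labels.size() < n && in >> value) labels.push_back(value);
    return labels;
}

// True when both files contain exactly the N rows promised by the meta file.
bool shapesMatch(std::size_t n, const std::vector<Record>& records,
                 const std::vector<int>& labels) {
    return records.size() == n && labels.size() == n;
}
```

Running `shapesMatch` before scoring catches the common mistake of a meta file whose N no longer matches a regenerated dataset.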
Include MIDAS/src/NormalCore.hpp, MIDAS/src/RelationalCore.hpp or MIDAS/src/FilteringCore.hpp.
Call operator() on individual data records; it returns the anomaly score for the input record.
example/
Experiment.cpp
The code we used for experiments.
It will try to use Intel TBB or OpenMP for parallelization.
You should comment out all but one runner function call in main(), as most results are exported to MIDAS/temp/Experiment.csv together with many intermediate files.
Reproducible.cpp
Similar to Demo.cpp, but with all random parameters hardcoded, so it always produces the same result.
It lets us and other developers test whether implementations in other languages produce acceptable results.
util/
DeleteTempFile.py, EvaluateScore.py and ReproduceROC.py will show their usage and a short description when executed without any argument.
AUROC.hpp
Experimental ROC-AUC implementation in C++11. More info at this repo.
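For readers unfamiliar with the metric, the following is a minimal sketch of one standard way to compute ROC-AUC, using the rank-sum (Mann-Whitney U) formulation with average ranks for tied scores. It is not the code in AUROC.hpp, and the function name `rocAuc` is hypothetical. It assumes labels are 1 for anomalous records and 0 for normal ones.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// ROC-AUC via the Mann-Whitney U statistic.
// Assumed label convention: 1 = anomalous (positive), 0 = normal (negative).
// Returns -1.0 when only one class is present, since AUC is undefined then.
double rocAuc(const std::vector<double>& scores, const std::vector<int>& labels) {
    assert(scores.size() == labels.size());
    const std::size_t n = scores.size();

    // Sort record indices by ascending score.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&scores](std::size_t a, std::size_t b) { return scores[a] < scores[b]; });

    double positiveRankSum = 0.0;
    std::size_t positives = 0;
    for (std::size_t i = 0; i < n;) {
        // Find the run [i, j) of tied scores; they all share the average rank.
        std::size_t j = i;
        while (j < n && scores[order[j]] == scores[order[i]]) ++j;
        const double averageRank =
            (static_cast<double>(i + 1) + static_cast<double>(j)) / 2.0;
        for (std::size_t k = i; k < j; ++k) {
            if (labels[order[k]] == 1) {
                positiveRankSum += averageRank;
                ++positives;
            }
        }
        i = j;
    }

    const std::size_t negatives = n - positives;
    if (positives == 0 || negatives == 0) return -1.0;

    const double p = static_cast<double>(positives);
    const double q = static_cast<double>(negatives);
    // U = rank sum of positives minus its minimum possible value; AUC = U / (p * q).
    return (positiveRankSum - p * (p + 1.0) / 2.0) / (p * q);
}
```

The result equals the probability that a randomly chosen anomalous record scores higher than a randomly chosen normal one, with ties counted as one half. Sorting dominates the cost, so the sketch runs in O(N log N).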
PreprocessData.py
The code to process the raw dataset into an easy-to-read format.
Datasets are always assumed to be in a folder in MIDAS/data/.
It can process the following dataset(s):
DARPA/darpa_original.csv -> DARPA/darpa_processed.csv, DARPA/darpa_ground_truth.csv, DARPA/darpa_shape.txt
If you use this code for your research, please consider citing our TKDD and AAAI papers.
@article{bhatia2022realtime,
author = {Bhatia, Siddharth and Liu, Rui and Hooi, Bryan and Yoon, Minji and Shin, Kijung and Faloutsos, Christos},
title = {Real-Time Anomaly Detection in Edge Streams},
year = {2022},
issue_date = {August 2022},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {16},
number = {4},
issn = {1556-4681},
url = {https://doi.org/10.1145/3494564},
doi = {10.1145/3494564},
journal = {ACM Trans. Knowl. Discov. Data},
month = {jan},
articleno = {75},
numpages = {22}
}
@inproceedings{bhatia2020midas,
title={MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams},
author={Siddharth Bhatia and Bryan Hooi and Minji Yoon and Kijung Shin and Christos Faloutsos},
booktitle={AAAI Conference on Artificial Intelligence (AAAI)},
year={2020}
}