tuhh-softsec / code2DFD

Tool for the automatic extraction of dataflow diagrams from source code of microservices

Code2DFD

Code2DFD can automatically extract dataflow diagrams (DFDs) that are enriched with security-relevant annotations from the source code of microservice applications. It is structured as a framework, where the technology-specific extractors in technology_specific_extractors/ are executed and detect evidence for DFD items in the code. They use some general functionality from core/.

The tool and underlying approach are presented in a publication in the Journal of Systems and Software (JSS). You can find the paper on arXiv or the publisher's website. If you use the tool in a scientific context, please cite as:

@article{Code2DFD23,
  title = {Automatic Extraction of Security-Rich Dataflow Diagrams for Microservice Applications written in Java},
  journal = {Journal of Systems and Software},
  volume = {202},
  pages = {111722},
  year = {2023},
  issn = {0164-1212},
  doi = {https://doi.org/10.1016/j.jss.2023.111722},
  author = {Simon Schneider and Riccardo Scandariato},
  keywords = {Dataflow diagram, Automatic extraction, Security, Microservices, Architecture reconstruction, Feature detection}
}
1. Installation and configuration

Before running the tool, install Python 3.x and the packages listed in requirements.txt. The path to the application to be analysed can be set in the config/config.ini file or passed as a parameter (see 2.). That file already lists a number of repositories, for each of which a manually created DFD exists. To analyse one of them, un-comment its path and make sure all other paths are commented out with a ";".
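As an illustration of the ";" comment convention, the following minimal sketch uses Python's standard configparser module, which by default ignores lines that start with ";". The section name "Repository", the key "path", and the repository names are hypothetical placeholders, not the tool's documented config schema.

```python
import configparser

# Hypothetical config content: the section name "Repository", the key "path",
# and the repository names are illustrative assumptions only.
EXAMPLE_CONFIG = """
[Repository]
;path = example-org/first-application
path = example-org/second-application
;path = example-org/third-application
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)

# Lines starting with ";" are treated as comments, so exactly one path remains.
active_path = parser["Repository"]["path"]
print(active_path)
```

With configparser's default strict mode, leaving two such lines un-commented in the same section raises a DuplicateOptionError, which is one reason to keep exactly one path active.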

2. Running the tool

To start the tool via the terminal using the config file, simply enter python3 code2DFD.py --config_path PATH_TO_CONFIG in a command line opened in the root directory. For example, python3 code2DFD.py --config_path config/config.ini for the example config in this repository.

The config file needs to specify the following sections and parameters:

These parameters can also be provided on the command line; see python3 code2DFD.py --help for exact usage.

If both a config file and CLI arguments are provided, the CLI arguments take precedence.
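The precedence rule can be sketched roughly as follows. This is not the tool's actual implementation. The option names --repo_path and --output_path and the section name are hypothetical, chosen only to show how values from a config file can be overridden by explicitly given CLI arguments.

```python
import argparse
import configparser


def resolve_settings(config_text, cli_args):
    """Merge config-file values with CLI arguments; CLI values win."""
    config = configparser.ConfigParser()
    config.read_string(config_text)

    # Start from the config file, flattening all sections into one dict.
    settings = {}
    for section in config.sections():
        settings.update(config.items(section))

    # Hypothetical CLI options; None means "not given on the command line".
    parser = argparse.ArgumentParser()
    parser.add_argument("--repo_path", default=None)
    parser.add_argument("--output_path", default=None)
    args = parser.parse_args(cli_args)

    # Only explicitly provided CLI arguments override config values.
    for key, value in vars(args).items():
        if value is not None:
            settings[key] = value
    return settings


example_config = "[Repository]\nrepo_path = from/config\noutput_path = out\n"
print(resolve_settings(example_config, ["--repo_path", "from/cli"]))
```

In this sketch repo_path comes from the command line while output_path falls back to the config file, since it was not passed as an argument.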

2.1 RESTful service

To run the tool as a RESTful API service, run python3 flask_code2DFD.py.

This starts a Flask server. You can trigger a DFD extraction by sending a request to localhost:5001/dfd with the parameter url and, optionally, commit.

Currently only GitHub URLs are supported this way.
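A request to the service might be constructed as in the sketch below. It assumes that url and commit are passed as query-string parameters of a GET request, which the text above does not state explicitly. The helper name build_dfd_request_url and the repository URL are hypothetical.

```python
from urllib.parse import urlencode


def build_dfd_request_url(repo_url, commit=None, base="http://localhost:5001/dfd"):
    """Build the request URL for the /dfd endpoint (assumed GET with query parameters)."""
    params = {"url": repo_url}
    # The commit parameter is optional, so include it only when given.
    if commit is not None:
        params["commit"] = commit
    return f"{base}?{urlencode(params)}"


# Hypothetical GitHub repository URL, used only for illustration.
request_url = build_dfd_request_url("https://github.com/example-org/example-app")
print(request_url)
```

With the Flask server running, the resulting URL could then be opened with any HTTP client, for example urllib.request.urlopen(request_url).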

3. Output

The tool puts the analysis output for a project PROJECT into code2DFD_output/PROJECT. It creates multiple outputs: