memgraph / mage

MAGE - Memgraph Advanced Graph Extensions :crystal_ball:
Apache License 2.0



This open-source repository contains all available user-defined graph analytics modules and procedures that extend the Cypher query language, written by the team behind Memgraph and its users. You can find and contribute implementations of various algorithms in multiple programming languages, all runnable inside Memgraph. This project aims to give everyone the tools they need to tackle the most challenging graph problems.

Introduction to query modules with MAGE

Memgraph introduces the concept of query modules: user-defined procedures that extend the Cypher query language. These procedures are grouped into modules that can be loaded into Memgraph. The official Memgraph documentation explains how to run them. On startup, Memgraph automatically attempts to load query modules from all *.so and *.py files it finds in the default directory, which is set with the --query-modules-directory flag.
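To make the discovery rule concrete, here is a minimal Python sketch that lists the files in a directory that match the two extensions mentioned above. It is only an illustration of the rule, not Memgraph's actual loader: the helper name candidate_query_modules is hypothetical, and the sketch assumes a non-recursive scan.

```python
from pathlib import Path
from typing import List

# Extensions named in the paragraph above. The helper below is hypothetical
# and only mimics the discovery rule; it is not part of Memgraph.
MODULE_SUFFIXES = {".so", ".py"}


def candidate_query_modules(directory: str) -> List[str]:
    """Return the sorted names of files in `directory` ending in .so or .py."""
    root = Path(directory)
    if not root.is_dir():
        return []
    return sorted(
        entry.name
        for entry in root.iterdir()  # assumption: top level only, no recursion
        if entry.is_file() and entry.suffix in MODULE_SUFFIXES
    )


if __name__ == "__main__":
    # Path used in the build-from-source section of this README; replace it
    # with the value of your --query-modules-directory flag.
    for name in candidate_query_modules("/usr/lib/memgraph/query_modules"):
        print(name)
```

Running the script against your modules directory is a quick way to see which files Memgraph would be expected to pick up on the next start.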

Further reading

If you want more info about MAGE, check out the official MAGE Documentation.

Algorithm proposition

If you would like to propose an algorithm, please fill in the survey on mage.memgraph.com.

Community

Make sure to check out the Memgraph community and join us on a survey of streaming graph algorithms! Drop us a message on the channels below:

Follow @memgraphmage, Discourse forum, Discord, Memgraph GitHub, Memgraph YouTube

Overview

Memgraph compatibility

With changes in Memgraph API, MAGE started to track version numbers. The table below lists the compatibility of MAGE with Memgraph versions.

MAGE version    Memgraph version
>= 1.11.9       >= 2.11.0
>= 1.5.1        >= 2.5.1
>= 1.5          >= 2.5.0
>= 1.4          >= 2.4.0
>= 1.0          >= 2.0.0
^0              >= 1.4.0 <= 1.6.1
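
The table can also be read as a lookup: given an installed Memgraph version, find the newest MAGE line it supports. The sketch below encodes that reading in Python. The function names and the "newest matching row wins" interpretation are assumptions made for illustration; the version numbers are copied from the table.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]

# (minimum Memgraph version, MAGE constraint), newest first, copied from the table.
COMPATIBILITY = [
    ((2, 11, 0), ">= 1.11.9"),
    ((2, 5, 1), ">= 1.5.1"),
    ((2, 5, 0), ">= 1.5"),
    ((2, 4, 0), ">= 1.4"),
    ((2, 0, 0), ">= 1.0"),
]
# The oldest row is a closed Memgraph range rather than a lower bound.
LEGACY_RANGE = ((1, 4, 0), (1, 6, 1), "^0")


def parse_version(text: str) -> Version:
    """Turn '2.5' or '2.5.1' into a 3-tuple of ints, padding with zeros."""
    parts = [int(piece) for piece in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def mage_constraint_for(memgraph_version: str) -> Optional[str]:
    """Return the MAGE constraint of the newest table row that matches, or None."""
    version = parse_version(memgraph_version)
    for minimum, constraint in COMPATIBILITY:
        if version >= minimum:
            return constraint
    low, high, constraint = LEGACY_RANGE
    if low <= version <= high:
        return constraint
    return None  # version not covered by the table


if __name__ == "__main__":
    for memgraph in ("2.11.0", "2.5.0", "1.5.0", "1.7.0"):
        print(memgraph, "->", mage_constraint_for(memgraph))
```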

How to install MAGE?

There are two options to install MAGE.

1) For the Docker installation, you only need Docker installed.

2) To build from source, you will need to install a few things first. Jump to section #2 to check for installation details.

After the installation, you will be ready to query Memgraph and use MAGE modules. Make sure to have one of the querying platforms installed as well.

1. Use MAGE with Docker

a) Get MAGE from Docker Hub

1. This command downloads the image from Docker Hub and runs Memgraph preloaded with MAGE modules:

docker run -p 7687:7687 -p 7444:7444 memgraph/memgraph-mage

b) Install MAGE with Docker build of the repository

0. a) Make sure that you have cloned the MAGE GitHub repository and that your terminal is positioned inside the repository:

git clone --recurse-submodules https://github.com/memgraph/mage.git && cd mage

0. b) Download Memgraph from our official download site into your cloned MAGE repository. Set ${MEMGRAPH_VERSION} to the latest release of Memgraph, and ${ARCHITECTURE} to your system architecture (amd64 or arm64):


curl -L "https://download.memgraph.com/memgraph/v${MEMGRAPH_VERSION}/debian-11/memgraph_${MEMGRAPH_VERSION}-1_${ARCHITECTURE}.deb" > memgraph-${ARCHITECTURE}.deb

or this one if you are on arm64:

curl -L "https://download.memgraph.com/memgraph/v${MEMGRAPH_VERSION}/debian-11-aarch64/memgraph_${MEMGRAPH_VERSION}-1_arm64.deb" > memgraph-arm64.deb

1. To build the MAGE image, run the following command, setting ${architecture} to your system architecture (amd64 or arm64):

DOCKER_BUILDKIT=1 docker buildx build \
--tag memgraph-mage:prod \
--target prod \
--platform linux/${architecture} \
--file Dockerfile.release \
--load .

This will build any new algorithm added to MAGE, and load it inside Memgraph.

2. Start the container with the following command and enjoy Memgraph with MAGE:

docker run --rm -p 7687:7687 -p 7444:7444 --name mage memgraph-mage:prod

NOTE: If you made any changes while the MAGE Docker container was running, you will need to stop it and rebuild the whole image. Alternatively, you can copy the mage directory into the Docker container and rebuild from there. To learn more about development with MAGE and Docker, visit the documentation.
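
Once a container from either option is running, a quick way to confirm that the published ports are reachable is a plain TCP connection attempt. The Python sketch below does only that: an open port shows that something is listening, not that Memgraph is ready to answer queries. The helper name port_is_open is hypothetical; the port numbers are the ones published in the docker run commands above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports published with -p in the docker run commands above.
    for port in (7687, 7444):
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```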

2. Install MAGE on a Linux distribution from source

Note: This step is more suitable for local development.

Prerequisites

Since Memgraph needs to load MAGE's modules, a setup script is provided to help you. It builds the modules so that Memgraph can load them on startup.

Before you start, don't forget to clone MAGE with the --recurse-submodules flag:

git clone --recurse-submodules https://github.com/memgraph/mage.git && cd mage

Run the following command to install Rust and Python dependencies:

curl https://sh.rustup.rs -sSf | sh -s -- -y \
&& export PATH="${HOME}/.cargo/bin:${PATH}" \
&& python3 -m pip install -r python/requirements.txt \
&& python3 -m pip install -r python/tests/requirements.txt \
&& python3 -m pip install torch-sparse torch-cluster torch-spline-conv torch-geometric torch-scatter -f https://data.pyg.org/whl/torch-1.12.0+cu102.html

Now you can run the following command to compile and copy the query modules to the /usr/lib/memgraph/query_modules path:

python3 setup build -p /usr/lib/memgraph/query_modules

It will generate a mage/dist directory and copy the modules to the /usr/lib/memgraph/query_modules directory.

Note that query modules are loaded into Memgraph on startup, so if your instance was already running, you will need to execute the following query in one of the querying platforms to load them: CALL mg.load_all();

Running MAGE

This is an example of running the PageRank algorithm on a simple graph. You can find more details on the documentation page.

// Create the graph from the image below

CALL pagerank.get()
YIELD node, rank;
[Figures: graph input and MAGE output]
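
For readers who want to see what a PageRank computation does, here is a small textbook power-iteration sketch in Python. It is not the C++ implementation shipped in MAGE and is not guaranteed to produce the same numbers as pagerank.get(). The function name, the damping factor of 0.85, and the stopping criterion are assumptions chosen for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def pagerank(
    edges: Iterable[Edge],
    damping: float = 0.85,      # assumed value, common in textbooks
    max_iterations: int = 100,  # assumed iteration cap
    tolerance: float = 1e-9,    # assumed convergence threshold
) -> Dict[Hashable, float]:
    """Power-iteration PageRank over a directed edge list."""
    edge_list: List[Edge] = list(edges)
    nodes = list(dict.fromkeys(n for edge in edge_list for n in edge))
    if not nodes:
        return {}
    out_neighbors: Dict[Hashable, List[Hashable]] = {n: [] for n in nodes}
    for source, target in edge_list:
        out_neighbors[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(max_iterations):
        # Rank held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_neighbors[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for node in nodes:
            targets = out_neighbors[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        change = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if change < tolerance:
            break
    return rank


if __name__ == "__main__":
    example = [("a", "c"), ("b", "c"), ("c", "a")]
    for node, score in sorted(pagerank(example).items()):
        print(node, round(score, 4))
```

The ranks always sum to 1, and nodes with more incoming rank, such as c in the example, score higher.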

MAGE Spells

Algorithms Lang Description
betweenness_centrality C++ The betweenness centrality of a node is defined as the number of all-pairs shortest paths that pass through the node divided by the total number of all-pairs shortest paths in the graph. The algorithm has O(nm) time complexity.
betweenness_centrality_online C++ The betweenness centrality of a node is defined as the number of all-pairs shortest paths that pass through the node divided by the total number of all-pairs shortest paths in the graph. The algorithm has O(nm) time complexity.
biconnected_components C++ An algorithm for calculating maximal biconnected subgraphs. A biconnected subgraph is a subgraph that remains connected if any single vertex is removed.
bipartite_matching C++ An algorithm for calculating maximum bipartite matching, where a matching is a set of edges chosen in such a way that no two edges share an endpoint.
bridges C++ A bridge is an edge whose deletion increases the number of connected components. The goal of this algorithm is to detect edges that are bridges in a graph.
community_detection C++ The Louvain method for community detection is a greedy method for finding communities with maximum modularity in a graph. Runs in O(nlogn) time.
community_detection_online C++ A dynamic community detection algorithm suitable for large-scale graphs based upon label propagation. Runs in O(m) time and has O(mn) space complexity.
cycles C++ Algorithm for detecting cycles on graphs
cugraph CUDA Collection of NVIDIA GPU-powered algorithms integrated in Memgraph. Includes centrality measures, link analysis, and graph clustering.
distance_calculator Python Module for finding the geographical distance between two points defined with 'lng' and 'lat' coordinates.
export_util Python A module for exporting the graph database in different formats (JSON).
graph_analyzer Python This Graph Analyzer query module offers insights about the stored graph or a subgraph.
graph_coloring Python An algorithm for assigning labels to the graph elements subject to certain constraints. In this form, it is a way of coloring the graph vertices such that no two adjacent vertices are of the same color.
graph_util C++ A module with common graph utility procedures in day-to-day operations with graphs.
igraph Python A module that provides igraph integration with Memgraph and implements igraph algorithms
import_util Python A module for importing data generated by the export_util().
json_util Python A module for loading JSON from a local file or remote address.
katz_centrality C++ Katz centrality is a centrality measurement that outputs a node's influence based on the number of walks that reach it, with longer walks weighted less.
katz_centrality_online C++ Online implementation of the Katz centrality. Outputs the approximate result for Katz centrality while maintaining the order of rankings.
kmeans Python An algorithm for clustering given data.
leiden_community_detection C++ The Leiden method for community detection is an improvement on the Louvain method, designed to find communities with maximum modularity in a graph while addressing issues of disconnected communities. Runs in O(L m) time, where L is the number of iterations of the algorithm.
link_prediction_gnn Python A module for predicting links in graphs using graph neural networks.
llm_util Python A module that contains procedures describing graphs in a format best suited for large language models (LLMs).
max_flow Python An algorithm for calculating maximum flow through a graph using capacity scaling
meta_util Python A module that contains procedures describing graphs on a meta-level.
node_classification_with_gnn Python A graph neural network-based node classification module.
node2vec Python An algorithm for calculating node embeddings from static graphs.
node2vec_online Python An algorithm for calculating node embeddings as new edges arrive
node_similarity Python A module that contains similarity measures for calculating the similarity between two nodes.
nxalg Python A module that provides NetworkX integration with Memgraph and implements many NetworkX algorithms
pagerank C++ An algorithm that yields the influence measurement based on the recursive information about the connected nodes' influence.
pagerank_online C++ A dynamic algorithm made for calculating PageRank in a graph streaming scenario.
rust_example Rust Example of a basic module with input parameters forwarding, made in Rust.
set_cover Python An algorithm for finding the minimum-cost subcollection of sets that covers all elements of a universe.
set_property C++ Utility module to help dynamically set properties on nodes and relationships.
temporal_graph_networks Python GNN temporal graph algorithm to predict links or do node classification.
tsp Python An algorithm for finding the shortest possible route that visits each vertex exactly once.
union_find Python A module with an algorithm that enables the user to check whether the given nodes belong to the same connected component.
uuid_generator C++ A module that generates a new universally unique identifier (UUID).
vrp Python An algorithm for finding the shortest route possible between the central depot and places to be visited. The problem can be solved with multiple vehicles that represent a visiting fleet.
weakly_connected_components C++ A module that finds weakly connected components in a graph.
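
As an illustration of one entry in the table, the union_find row describes checking whether given nodes belong to the same connected component. A common way to answer that question is a disjoint-set structure, sketched below in Python. This is a generic textbook version with path compression and union by size, not the code of the MAGE module, and the class and method names are hypothetical.

```python
from typing import Dict, Hashable


class DisjointSet:
    """Disjoint-set (union-find) with path compression and union by size."""

    def __init__(self) -> None:
        self.parent: Dict[Hashable, Hashable] = {}
        self.size: Dict[Hashable, int] = {}

    def find(self, item: Hashable) -> Hashable:
        """Return the representative of the set containing `item`."""
        if item not in self.parent:
            # Unknown items start as their own singleton set.
            self.parent[item] = item
            self.size[item] = 1
            return item
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node directly at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a: Hashable, b: Hashable) -> None:
        """Merge the sets containing `a` and `b`."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return
        # Union by size: attach the smaller tree under the larger one.
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]

    def connected(self, a: Hashable, b: Hashable) -> bool:
        """Return True if `a` and `b` are in the same set."""
        return self.find(a) == self.find(b)


if __name__ == "__main__":
    components = DisjointSet()
    for source, target in [(1, 2), (2, 3), (4, 5)]:
        components.union(source, target)
    print(components.connected(1, 3))
    print(components.connected(1, 4))
```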

Advanced configuration

Testing MAGE

To test that everything is built, loaded, and working correctly, you can run the Python test scripts below. Make sure that the Memgraph instance with MAGE is up and running.

# Running unit tests for C++ and Python
python3 test_unit

# Running end-to-end tests
python3 test_e2e

To run only specific end-to-end tests, add the -k argument with a substring that refers to the algorithm to be tested. For a module named <query_module>, run python3 test_e2e -k <query_module>.

# Running specific end-to-end tests
python3 test_e2e -k weakly_connected_components

Contributing

We encourage everyone to contribute with their own algorithm implementations and ideas. If you want to contribute or report a bug, please take a look at the contributions guide.

Code of Conduct

Everyone participating in this project is governed by the Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to tech@memgraph.com.

Feedback

Your feedback is always welcome and valuable to us. Please don't hesitate to post on our Community Forum.