BoChenGroup / PyDPM

A Python Library for Deep Probabilistic Models
Apache License 2.0


[![GitHub](https://img.shields.io/github/license/awslabs/gluon-ts.svg?style=flat-square)](./LICENSE) [![PyPI](https://img.shields.io/badge/pypi-v5.1.0-blue.svg)](https://pypi.org/project/pydpm/) ![Python version](https://img.shields.io/badge/python-3.7%20%7C%203.8%20%7C%203.9-blue.svg) [![Documentation Status](https://readthedocs.org/projects/quantus/badge/?version=latest)](https://dustone-mu.github.io/) [![Stars][stars-image]][stars-url] [![Downloads](https://pepy.tech/badge/pydpm)](https://pepy.tech/project/pydpm) [![Contributing][contributing-image]][contributing-url]

PyDPM is a Python library focused on constructing Deep Probabilistic Models (DPMs). It provides efficient distribution sampling functions on GPU and includes implementations of popular existing DPMs.

Documentation | [Paper [Arxiv]]() | Tutorials | Benchmarks | Examples |

News

:fire:A new version that does not depend on PyCUDA has been released.

:fire:A wealth of professional learning materials on Deep Generative Models is available from Ermon's group at Stanford University (CS236, Fall 2021).

:fire:A tutorial on DPMs has been uploaded by Prof. Wilker Aziz (University of Amsterdam).

Install

The current version of PyDPM can be installed on either Windows or Linux from PyPI.

$ pip install pydpm

On Windows, we recommend installing Visual Studio 2019 as the compiler, together with the CUDA 11.5 toolkit. On Linux, we recommend installing the latest version of the CUDA toolkit.

The environment used for testing has been released so that our results can be reproduced easily.

$ conda env create -f enviroment.yaml
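After installation, a quick check that the package is visible to the interpreter can save some debugging time. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and the list of packages checked (`numpy`, `torch`) is an assumption based on the examples in this README, not an official dependency list.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Package names other than pydpm are assumptions drawn from the examples below.
    for name in ("pydpm", "numpy", "torch"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If `pydpm` is reported as missing, the `pip install` step most likely ran in a different Python environment.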

Overview

The PyDPM framework is organized into four modules: Sampler, Model, Evaluation, and Example.

1) The Sampler module includes the basic Distribution Sampler and the more sophisticated Model Sampler, which together cover the sampling requirements of the DPMs on either CPU or GPU.
2) The Model module contains a wide variety of classical and popular DPMs, which can be called directly as Python APIs.
3) The Evaluation module provides a DataLoader sub-module to process data samples in various forms, such as images, text, and graphs, and a Metric sub-module to comprehensively evaluate the DPMs after training.
4) The Example module provides, for each DPM in the Model module, a corresponding code demo with a detailed explanation in the official docs.

The workflow of applying PyDPM to downstream tasks can be split into four steps:

1) Device deployment: PyDPM can run on either CPU or GPU.
2) Training and testing mechanisms: Gibbs sampling, back-propagation, or both, implemented by pydpm.sampler and PyTorch respectively.
3) Model categories: Bayesian Probabilistic Models, Deep-Learning Probabilistic Models, and Hybrid Probabilistic Models.
4) Applications: Natural Language Processing (NLP), Graph Neural Networks (GNN), Recommendation Systems (RS), and others.

Model List

The Model module in PyDPM includes a wide variety of popular DPMs, which fall roughly into three categories: Bayesian Probabilistic Models, Deep-Learning Probabilistic Models, and Hybrid Probabilistic Models.

Bayesian Probabilistic Models

| Probabilistic Model Name | Abbreviation | Paper Link |
| --- | --- | --- |
| Latent Dirichlet Allocation | LDA | Blei et al., 2003 |
| Poisson Factor Analysis | PFA | Zhou et al., 2012 |
| Poisson Gamma Belief Network | PGBN | Zhou et al., 2015 |
| Convolutional Poisson Factor Analysis | CPFA | Wang et al., 2019 |
| Convolutional Poisson Gamma Belief Network | CPGBN | Wang et al., 2019 |
| Factor Analysis | FA | |
| Gaussian Mixture Model | GMM | |
| Poisson Gamma Dynamical Systems | PGDS | Zhou et al., 2016 |
| Deep Poisson Gamma Dynamical Systems | DPGDS | Guo et al., 2018 |
| Dirichlet Belief Networks | DirBN | Zhao et al., 2018 |
| Deep Poisson Factor Analysis | DPFA | Gan et al., 2015 |
| Word Embeddings Deep Topic Model | WEDTM | Zhao et al., 2018 |
| Multimodal Poisson Gamma Belief Network | MPGBN | Wang et al., 2018 |
| Graph Poisson Gamma Belief Network | GPGBN | Wang et al., 2020 |

Deep-Learning Probabilistic Models

| Probabilistic Model Name | Abbreviation | Paper Link |
| --- | --- | --- |
| Restricted Boltzmann Machines | RBM | Hinton et al., 2010 |
| Variational Autoencoder | VAE | Kingma et al., 2014 |
| Generative Adversarial Network | GAN | Goodfellow et al., 2014 |
| Density estimation using Real NVP | RealNVP (2d) | Dinh et al., 2017 |
| Denoising Diffusion Probabilistic Models | DDPM | Ho et al., 2020 |
| Density estimation using Real NVP | RealNVP (image) | Dinh et al., 2018 |
| Conditional Variational Autoencoder | CVAE | Sohn et al., 2015 |
| Deep Convolutional Generative Adversarial Networks | DCGAN | Radford et al., 2016 |
| Wasserstein Generative Adversarial Networks | WGAN | Arjovsky et al., 2017 |
| Information Maximizing Generative Adversarial Nets | InfoGAN | Xi Chen et al., 2016 |

Hybrid Probabilistic Models

| Probabilistic Model Name | Abbreviation | Paper Link |
| --- | --- | --- |
| Weibull Hybrid Autoencoding Inference | WHAI | Zhang et al., 2018 |
| Weibull Graph Attention Autoencoder | WGAAE | Wang et al., 2020 |
| Recurrent Gamma Belief Network | rGBN | Guo et al., 2020 |
| Multimodal Weibull Variational Autoencoder | MWVAE | Wang et al., 2020 |
| Sawtooth Embedding Topic Model | SawETM | Duan et al., 2021 |
| TopicNet | TopicNet | Duan et al., 2021 |
| Deep Coupling Embedding Topic Model | dc-ETM | Li et al., 2022 |
| Topic Taxonomy Mining with Hyperbolic Embedding | HyperMiner | Xu et al., 2022 |
| Knowledge Graph Embedding Topic Model | KG-ETM | Wang et al., 2022 |
| Variational Edge Partition Model | VEPM | He et al., 2022 |
| Generative Text Convolutional Neural Network | GTCNN | Wang et al., 2022 |

Deep Probabilistic Models planned to be built

:fire:Suggestions of classical or novel Deep Probabilistic Models for us to add are welcome.

| Probabilistic Model Name | Abbreviation | Paper Link |
| --- | --- | --- |
| Nouveau Variational Autoencoder | NVAE | Vahdat et al., 2020 |
| flow-based Variational Autoencoder | f-VAE | Su et al., 2018 |
| Score-Based Generative Models | SGM | Bortoli et al., 2022 |
| Poisson Flow Generative Models | PFGM | Xu et al., 2022 |
| Stable Diffusion | LDM | Rombach et al., 2022 |
| Denoising Diffusion Implicit Models | DDIM | Song et al., 2022 |
| Vector Quantized Diffusion | VQ-Diffusion | Tang et al., 2023 |
| Vector Quantized Variational Autoencoder | VQ-VAE | Aaron van den Oord et al., 2017 |
| Conditional Generative Adversarial Nets | cGAN | Mirza et al., 2014 |
| Information Maximizing Variational Autoencoders | InfoVAE | Zhao et al., 2017 |
| Generative Flow | Glow | Kingma et al., 2018 |
| Structured Denoising Diffusion Models in Discrete State-Spaces | D3PM | Austin et al., 2021 |

Usage

Example: a few lines of code to quickly construct and evaluate a 3-layer Bayesian model, PGBN, on GPU.

from pydpm.model import PGBN
from pydpm.metric import ACC

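# train_data, test_data, train_label and test_label are assumed to be loaded beforehand

# create the model and deploy it on gpu or cpu
model = PGBN([128, 64, 32], device='gpu')
model.initial(train_data)
# fit the model on the training set
train_local_params = model.train(train_data, iter_all=100)
# infer the local parameters of both splits with the trained model
train_local_params = model.test(train_data, iter_all=100)
test_local_params = model.test(test_data, iter_all=100)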

# evaluate the model with classification accuracy
# the demo accuracy can achieve 0.8549
results = ACC(train_local_params.Theta[0], test_local_params.Theta[0], train_label, test_label, 'SVM')

# save the model after training
model.save()
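The example above takes `train_data` and `test_data` as given. For text, one plausible way to prepare them is a bag-of-words count matrix. The sketch below is only an illustration: the helper `bag_of_words` is hypothetical (not part of PyDPM), the toy documents are made up, and the vocabulary-by-documents (V x N) orientation is an assumption that should be checked against the official docs before use.

```python
import numpy as np


def bag_of_words(docs, vocab=None):
    """Build a V x N count matrix from whitespace-tokenised documents.

    Hypothetical helper for illustration; not part of PyDPM.
    """
    tokenised = [doc.lower().split() for doc in docs]
    if vocab is None:
        # Derive the vocabulary from the documents themselves.
        vocab = sorted({tok for toks in tokenised for tok in toks})
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)), dtype=np.float64)
    for n, toks in enumerate(tokenised):
        for tok in toks:
            i = index.get(tok)
            if i is not None:  # ignore out-of-vocabulary tokens
                counts[i, n] += 1
    return counts, vocab


# Toy corpus; a real experiment needs far more documents than this.
train_docs = ["topic models count words", "deep topic models stack layers"]
test_docs = ["deep models count"]

train_data, vocab = bag_of_words(train_docs)
test_data, _ = bag_of_words(test_docs, vocab)  # reuse the training vocabulary
print(train_data.shape, test_data.shape)
```

Reusing the training vocabulary for the test split keeps the row dimension identical across both matrices, which any model trained on `train_data` will require.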

Example: a few lines of code to quickly deploy the distribution sampler of PyDPM on GPU.

import numpy as np
from pydpm.sampler import Basic_Sampler

sampler = Basic_Sampler('gpu')
# draw 10 gamma samples for each shape parameter, with scale 1
a = sampler.gamma(np.ones(100)*5, 1, times=10)
b = sampler.gamma(np.ones([100, 100])*5, 1, times=10)
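For sanity-checking sampler output on a machine without a GPU, a plain NumPy reference can be handy. The sketch below mirrors the two calls above under stated assumptions: `gamma_reference` is a hypothetical helper, and placing the `times` draws on a trailing axis is an assumption about the output layout rather than documented PyDPM behaviour.

```python
import numpy as np


def gamma_reference(shape, scale=1.0, times=1, seed=None):
    """CPU reference: draw `times` gamma samples per shape parameter.

    Hypothetical helper; the trailing `times` axis is an assumed layout.
    """
    rng = np.random.default_rng(seed)
    shape = np.asarray(shape, dtype=np.float64)
    # Add a trailing axis so each shape parameter broadcasts over `times` draws.
    return rng.gamma(shape[..., None], scale, size=shape.shape + (times,))


a_ref = gamma_reference(np.ones(100) * 5, 1, times=10, seed=0)
b_ref = gamma_reference(np.ones([100, 100]) * 5, 1, times=10, seed=0)
print(a_ref.shape, b_ref.shape)
print("empirical mean:", a_ref.mean())  # should be near shape * scale = 5
```

Comparing summary statistics such as the mean and variance of the two outputs is usually more meaningful than comparing individual draws, since the random streams differ.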

Compare

Compare the distribution sampling efficiency of PyDPM with NumPy:

Compare the distribution sampling efficiency of PyDPM with TensorFlow and PyTorch:

Compare the distribution sampling efficiency of PyDPM with CuPy and PyCUDA (used by PyDPM v1.0):
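A comparison of this kind can be reproduced locally with a small timing harness. The following is a sketch, not the benchmark script used for the results above: `benchmark` is a hypothetical helper, and only the NumPy baseline is timed here so the snippet runs without a GPU.

```python
import time

import numpy as np


def benchmark(samplers, repeats=5):
    """Return the best wall-clock time in seconds for each named callable."""
    timings = {}
    for name, fn in samplers.items():
        fn()  # warm-up call, excluded from the timing
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn()
            best = min(best, time.perf_counter() - start)
        timings[name] = best
    return timings


rng = np.random.default_rng(0)
shape = np.ones((1000, 100)) * 5

# Other samplers can be added to this dict as extra zero-argument callables.
timings = benchmark({"numpy.gamma": lambda: rng.gamma(shape, 1.0)})
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms")
```

When timing a GPU sampler this way, make sure the callable returns only after the result is actually available on the host; otherwise asynchronous kernel launches can make the GPU look faster than it is.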

Contact

License: Apache License Version 2.0

Contact: Chaojie Wang xd_silly@163.com, Xinyang Liu lxy771258012@163.com, Wei Zhao 13279389260@163.com, Bufeng Ge 20009100138@stu.xidian.edu.cn, Jiawen Wu wjw19960807@163.com

Copyright (c), 2020, Chaojie Wang, Wei Zhao, Xinyang Liu, Jiawen Wu, Jie Ren, Yewen Li, Hao Zhang, Bo Chen and Mingyuan Zhou