src-d / awesome-machine-learning-on-source-code

Cool links & research papers related to Machine Learning applied to source code (MLonCode)
Creative Commons Attribution Share Alike 4.0 International

Awesome Machine Learning On Source Code

Notice: This repository is no longer actively maintained. No further updates will be made, and issues and pull requests will not be answered. An actively maintained alternative is available at the ml4code.github.io repository.

A curated list of awesome research papers, datasets and software projects devoted to machine learning and source code. #MLonCode

Contents

Digests

Conferences

Competitions

Papers

Program Synthesis and Induction

Source Code Analysis and Language modeling

Neural Network Architectures and Algorithms

Embeddings in Software Engineering

Program Translation

Code Suggestion and Completion

Program Repair and Bug Detection

APIs and Code Mining

Code Optimization

Topic Modeling

Sentiment Analysis

Code Summarization

Clone Detection

Differentiable Interpreters

Related research

#### AST Differencing

- 12-pages [ClDiff: Generating Concise Linked Code Differences](https://chenbihuan.github.io/paper/ase18-huang-cldiff.pdf) - Kaifeng Huang, Bihuan Chen, Xin Peng, Daihong Zhou, Ying Wang, Yang Liu, Wenyun Zhao, ASE 2018. [Code](https://github.com/FudanSELab/CLDIFF).
- 11-pages [Generating Accurate and Compact Edit Scripts Using Tree Differencing](http://www.xifiggam.eu/wp-content/uploads/2018/08/GeneratingAccurateandCompactEditScriptsusingTreeDifferencing.pdf) - Veit Frick, Thomas Grassauer, Fabian Beck, Martin Pinzger, ICSME 2018.
- 11-pages [Fine-grained and Accurate Source Code Differencing](https://hal.archives-ouvertes.fr/hal-01054552/document) - Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, Martin Monperrus, ASE 2014.

#### Binary Data Modeling

- [Clustering Binary Data with Bernoulli Mixture Models](https://nsgrantham.com/documents/clustering-binary-data.pdf) - Neal S. Grantham.
- [A Family of Blockwise One-Factor Distributions for Modelling High-Dimensional Binary Data](https://arxiv.org/pdf/1511.01343.pdf) - Matthieu Marbac and Mohammed Sedki, Computational Statistics & Data Analysis 2017.
- [BayesBinMix: an R Package for Model Based Clustering of Multivariate Binary Data](https://arxiv.org/pdf/1609.06960.pdf) - Panagiotis Papastamoulis and Magnus Rattray, R Journal 2016.

#### Soft Clustering Using T-mixture Models

- [Robust mixture modelling using the t distribution](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.218.7334&rep=rep1&type=pdf) - D. Peel and G. J. McLachlan, Statistics and Computing 2000.
- [Robust mixture modeling using the skew t distribution](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1030.9865&rep=rep1&type=pdf) - Tsung I. Lin, Jack C. Lee and Wan J. Hsieh, Statistics and Computing 2010.

#### Natural Language Parsing and Comprehension

- 11-pages [A Fast Unified Model for Parsing and Sentence Understanding](https://arxiv.org/abs/1603.06021) - Samuel R. Bowman, Jon Gauthier, Abhinav Rastogi, Raghav Gupta, Christopher D. Manning, Christopher Potts, ACL 2016.

Posts

Talks

Software

Machine Learning

Utilities

Datasets

Credits

Contributions

See CONTRIBUTING.md. TL;DR: create a signed-off pull request.

License

License: CC BY-SA 4.0