src-d / reading-club

Paper reading club at source{d}

Next paper candidates: 31 May #53

Closed by bzz 5 years ago

bzz commented 5 years ago

Next paper candidates

Let's propose papers to study next! All papers mentioned in the comments of this issue will be listed in the next vote.

Last session runner-up(s)

"Graph Attention Networks" (Veličković et al., ICLR 2018):

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
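
For anyone reading along before the session, here is a minimal NumPy sketch of a single masked self-attention head following the update rule from the paper (e_ij = LeakyReLU(aᵀ[Wh_i ‖ Wh_j]), normalized per neighborhood). Names and shapes are illustrative, not the authors' code:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_head(H, A, W, a):
    """One masked self-attention head, following the GAT update rule.

    H: (N, F) input node features
    A: (N, N) binary adjacency used as the attention mask
       (assumed to already include self-loops, as in the paper)
    W: (F, F') shared linear transform
    a: (2 * F',) attention weight vector
    """
    Z = H @ W                                   # project all nodes: (N, F')
    H_out = np.zeros_like(Z)
    for i in range(len(Z)):
        nbrs = np.where(A[i] > 0)[0]            # attend only over i's neighborhood
        # e_ij = LeakyReLU(a^T [W h_i || W h_j])
        e = np.array([leaky_relu(a @ np.concatenate([Z[i], Z[j]])) for j in nbrs])
        alpha = softmax(e)                      # per-neighborhood normalization
        H_out[i] = alpha @ Z[nbrs]              # attention-weighted aggregation
    return H_out
```

The masking is what makes the attention cheap: scores are only computed for existing edges, so no dense N×N attention matrix or matrix inversion is needed.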

bzz commented 5 years ago

"Neural Networks for Modeling Source Code Edits" by GoogleAI, ICLR 2019

Programming languages are emerging as a challenging and interesting domain for machine learning. A core task, which has received significant attention in recent years, is building generative models of source code. However, to our knowledge, previous generative models have always been framed in terms of generating static snapshots of code. In this work, we instead treat source code as a dynamic object and tackle the problem of modeling the edits that software developers make to source code files. This requires extracting intent from previous edits and leveraging it to generate subsequent edits. We develop several neural networks and use synthetic data to test their ability to learn challenging edit patterns that require strong generalization. We then collect and train our models on a large-scale dataset of Google source code, consisting of millions of fine-grained edits from thousands of Python developers. From the modeling perspective, our main conclusion is that a new composition of attentional and pointer network components provides the best overall performance and scalability. From the application perspective, our results provide preliminary evidence of the feasibility of developing tools that learn to predict future edits.

bzz commented 5 years ago

"Graph Matching Networks for Learning the Similarity of Graph Structured Objects" by DeepMind/Google at ICML 2019

This paper addresses the challenging problem of retrieval and matching of graph structured objects, and makes two key contributions. First, we demonstrate how Graph Neural Networks (GNN), which have emerged as an effective model for various supervised prediction problems defined on structured data, can be trained to produce embedding of graphs in vector spaces that enables efficient similarity reasoning. Second, we propose a novel Graph Matching Network model that, given a pair of graphs as input, computes a similarity score between them by jointly reasoning on the pair through a new cross-graph attention-based matching mechanism. We demonstrate the effectiveness of our models on different domains including the challenging problem of control-flow-graph based function similarity search that plays an important role in the detection of vulnerabilities in software systems. The experimental analysis demonstrates that our models are not only able to exploit structure in the context of similarity learning but they can also outperform domain-specific baseline systems that have been carefully hand-engineered for these problems.
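
The cross-graph attention-based matching mechanism fits in a few lines; here is a rough NumPy version using plain dot-product similarity (the paper allows other similarity functions, and the names here are mine):

```python
import numpy as np

def cross_graph_attention(H1, H2):
    """Soft cross-graph matching: each node of one graph attends over
    all nodes of the other, and the difference between a node and its
    attention-weighted 'match' is returned as an extra message.

    H1: (N1, d), H2: (N2, d) node embeddings of the two graphs.
    """
    s = H1 @ H2.T                                # (N1, N2) pairwise similarities
    a12 = np.exp(s - s.max(axis=1, keepdims=True))
    a12 /= a12.sum(axis=1, keepdims=True)        # graph-1 nodes over graph-2 nodes
    sT = s.T
    a21 = np.exp(sT - sT.max(axis=1, keepdims=True))
    a21 /= a21.sum(axis=1, keepdims=True)        # graph-2 nodes over graph-1 nodes
    mu1 = H1 - a12 @ H2                          # mismatch signal for graph 1
    mu2 = H2 - a21 @ H1                          # mismatch signal for graph 2
    return mu1, mu2
```

Intuitively, mu is near zero where a node has a close counterpart in the other graph, so the propagation layers see exactly where the two graphs disagree.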

bzz commented 5 years ago

Another paper on modeling source code edits, also from ICLR:

"Learning to Represent Edits", an MSR paper from ICLR'19:

We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to capture the structure and semantics of edits. We hope that this interesting task and data source will inspire other researchers to work further on this problem.
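
A shape-level sketch of the composition described in the abstract: an "edit encoder" turns an observed (before, after) pair into an edit representation, and a "neural editor" applies it to a new input. The real models are sequence/graph networks; these placeholder functions only illustrate the data flow:

```python
import numpy as np

def edit_encoder(x_before: np.ndarray, x_after: np.ndarray) -> np.ndarray:
    """Compress the edit x_before -> x_after into a fixed-size vector."""
    return x_after.mean(axis=0) - x_before.mean(axis=0)   # placeholder: mean-feature delta

def neural_editor(x_new: np.ndarray, edit_vec: np.ndarray) -> np.ndarray:
    """Apply the encoded edit to a new input."""
    return x_new + edit_vec                               # placeholder: shift every feature

# Encode an edit from one pair, then transfer it to a different input:
before, after = np.random.rand(5, 8), np.random.rand(6, 8)
delta = edit_encoder(before, after)
edited = neural_editor(np.random.rand(4, 8), delta)       # (4, 8)
```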

fyzbt commented 5 years ago

two of them are identical (on source code edits)

bzz commented 5 years ago

> two of them are identical (on source code edits)

Thank you @fyzbt - good catch!

Ofc it was not supposed to be that way: there are two recent papers on learning source code edits, both presented at ICLR, that were meant to be listed! Updated https://github.com/src-d/reading-club/issues/53#issuecomment-494377071 to include the MSR one.

bzz commented 5 years ago

Calling for a vote in the community Slack channel #reading-club.

bzz commented 5 years ago

The poll is up. And https://github.com/src-d/reading-club/issues/61 is for planning further sessions.

bzz commented 5 years ago

Fixed by #62 - the winners are the two source code edits papers.