Statistical Models - Githubissues

coderZMR commented 5 years ago

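Nguyen A T, Hilton M, Codoban M, et al. API code recommendation using statistical learning from fine-grained changes. In: Proceedings of the 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, Seattle, 2016, 511-522. Starting from the premise that, among the many changes made in a project, changes that follow intent patterns recur more often than project-specific changes, the authors propose a new direction: use statistical learning to combine the code context with the repetitiveness of fine-grained code changes for code completion. The coding patterns contained in fine-grained changes need not follow a strict order, and the context at the change site serves as a hint for the code to be recommended.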

coderZMR commented 5 years ago

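Raychev V, Vechev M T, Yahav E. Code completion with statistical language models. In: Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, Edinburgh, 2014, 419-428. The basic idea is to reduce code completion to the natural language processing problem of predicting sentence probabilities. The input is incomplete code with holes, and the holes are filled using the probabilities learned by a language model.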

Fatead commented 5 years ago

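Nguyen T T, Nguyen A T, Nguyen H A, et al. A statistical semantic language model for source code. In: Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE'13, Saint Petersburg, Russian Federation, August 18-26, 2013, 532-542. Existing n-gram-based statistical language models can exploit only the textual information in code. To improve prediction accuracy, the authors propose SLAMC, a statistical language model that also uses the semantic information in code and achieves better prediction accuracy than models that rely on textual information alone.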

Fatead commented 5 years ago

Bruch M, Monperrus M, Mezini M. Learning from Examples to Improve Code Completion Systems. In: Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, Amsterdam, The Netherlands, August 24-28, 2009, 213-222. The authors propose three new approaches to code completion: a frequency-based system, an association-rule-based system, and a KNN-based system. These address the limitation that traditional code completion is based entirely on the programming language.

Fatead commented 5 years ago

Thung F, Wang S, Lo D, et al. Automatic recommendation of API methods from feature requests. In: 2013 IEEE/ACM 28th International Conference on Automated Software Engineering (ASE), Silicon Valley, USA, 2013, 290-300. The authors extract features from the textual description entered by the user and recommend code using both the newly extracted features and stored historical features.

yanqianyu commented 5 years ago

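Bielik P, Raychev V, Vechev M. PHOG: Probabilistic Model for Code. In: Proceedings of the 33rd International Conference on Machine Learning, New York City, 2016, 2933-2942. Proposes PHOG, a probabilistic model for code that extends PCFGs by allowing information from non-leaf nodes of the abstract syntax tree to be used. This captures rich code-relevant context, and the model is applied to completion of JavaScript code.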

wjwen23 commented 5 years ago

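Nguyen A T, Nguyen T N. Graph-based statistical language model for code. In: 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering. IEEE, 2015, 858-868. The paper first uses GrouMiner to turn source code into groums, then extracts all subgraphs and their parent graphs from each groum and records the nodes to be recommended. Recommendation is then performed with statistical learning based on Bayes' formula.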

yanqianyu commented 5 years ago

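Raychev V, Bielik P, Vechev M, et al. Learning programs from noisy data. In: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, St. Petersburg, 2016, 761-774. For datasets of input/output examples that contain noise, proposes a regularized program generator and a dataset sampler. Candidate programs can be generated from a sampled subset of the full dataset, and the sampled subset influences how the candidates are scored.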

yanqianyu commented 5 years ago

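Tu Z P, Su Z D, Devanbu P. On the localness of software. In: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, 2014, 269-280. Given a context, the task is to predict the next token. A cache component is added on top of the n-gram model to capture the local regularities of code, which improves the accuracy of code recommendation.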
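The cache idea can be illustrated with a rough sketch. It assumes a bigram model and a fixed mixing weight `lam`; both are simplifications chosen here for brevity, and the function names and toy token streams are hypothetical rather than taken from the paper.

```python
from collections import Counter, defaultdict


def bigram_counts(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev was never seen."""
    followers = counts.get(prev)
    if not followers:
        return 0.0
    return followers[nxt] / sum(followers.values())


def cached_prob(global_counts, cache_counts, prev, nxt, lam=0.5):
    """Mix the corpus-wide estimate with the estimate from the local cache.

    lam is an assumed fixed weight for the cache component.
    """
    return (1 - lam) * prob(global_counts, prev, nxt) + lam * prob(cache_counts, prev, nxt)


# Toy data: a "global" corpus and the tokens of the file currently being edited.
global_counts = bigram_counts("x = a + b ; y = a + c ; z = a + d ;".split())
cache_counts = bigram_counts("total = total + price ;".split())

p_local = cached_prob(global_counts, cache_counts, "+", "price")
p_global_only = cached_prob(global_counts, cache_counts, "+", "b")
```

In this toy setup the locally used token `price` scores higher after `+` than `b`, even though `price` never appears in the global corpus.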

yanqianyu commented 5 years ago

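Hindle A, Barr E, Su Z D, et al. On the naturalness of software. In: 34th International Conference on Software Engineering, Zurich, 2012, 837-847. Applies n-gram models to software corpora to capture high-level statistical regularities in code, and uses them for code recommendation on Java code.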
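As a minimal illustration of an n-gram model over code tokens, the sketch below trains a bigram model (n = 2) and ranks candidate next tokens. The corpus, the function names, and the absence of smoothing are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for tokens in corpus:
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest(counts, prev, k=3):
    """Return up to k next tokens with their estimated P(next | prev)."""
    followers = counts.get(prev)
    if not followers:
        return []
    total = sum(followers.values())
    return [(tok, n / total) for tok, n in followers.most_common(k)]


# Toy corpus of already-tokenized Java-like statements (made up for illustration).
corpus = [
    ["for", "(", "int", "i", "=", "0", ";"],
    ["for", "(", "int", "j", "=", "0", ";"],
    ["for", "(", "String", "s", ":", "names", ")"],
]
model = train_bigram(corpus)
suggestions = suggest(model, "(")
```

Here `int` is ranked first after `(` because it follows `(` in two of the three training statements.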

Fatead commented 5 years ago

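Pham H V, Vu P M, Nguyen T T. Learning API usages from bytecode: a statistical approach. In: Proceedings of the 38th International Conference on Software Engineering, 2016, 416-427. The authors propose HAPI, which extracts API call sequences from application bytecode and uses hidden Markov models for statistical code recommendation, providing API recommendation for mobile app development.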

Fatead commented 5 years ago

Bruch M, Monperrus M, Mezini M. Learning from Examples to Improve Code Completion Systems. In: Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, Amsterdam, The Netherlands, August 24-28, 2009, 213-222.

Proposes three new approaches to code completion: a frequency-based system, an association-rule-based system, and a KNN-based system. The frequency-based approach recommends the code that occurs most often in the database. The association-rule-based approach recommends method B when methods A and B are associated and A has been called. The KNN-based approach clusters the methods in the code and, when one method in a cluster is called, recommends the other methods in that cluster.
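The first two strategies can be sketched roughly as follows. The usage database, the method names, and the `min_confidence` threshold are hypothetical, and the KNN-based variant is not covered here.

```python
from collections import Counter

# Hypothetical usage database: the methods called on one object in each example.
usages = [
    {"setText", "setFont", "setLayoutData"},
    {"setText", "setLayoutData"},
    {"setText", "setFont"},
    {"addListener", "setText"},
]


def recommend_by_frequency(usages, already_called, k=2):
    """Recommend the k methods seen most often in the database."""
    counts = Counter(m for u in usages for m in u if m not in already_called)
    # Break ties alphabetically so the result is deterministic.
    return sorted(counts, key=lambda m: (-counts[m], m))[:k]


def recommend_by_association(usages, already_called, min_confidence=0.6):
    """Recommend B when the rule A -> B is confident enough and A was called."""
    recommended = set()
    for a in already_called:
        with_a = [u for u in usages if a in u]
        if not with_a:
            continue
        candidates = set().union(*with_a) - already_called
        for b in candidates:
            confidence = sum(b in u for u in with_a) / len(with_a)
            if confidence >= min_confidence:
                recommended.add(b)
    return sorted(recommended)


by_frequency = recommend_by_frequency(usages, set())
by_association = recommend_by_association(usages, {"setFont"})
```

With this toy data, `setText` is recommended after `setFont` because every example that calls `setFont` also calls `setText`, while `setLayoutData` falls below the assumed confidence threshold.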

Fatead commented 5 years ago

Thung F, Wang S, Lo D, et al. Automatic recommendation of API methods from feature requests. In: 2013 IEEE/ACM 28th International Conference on Automated Software Engineering (ASE), Silicon Valley, USA, 2013, 290-300.

Proposes extracting features from the textual description entered by the user, comparing the extracted features against the textual descriptions of many API methods, and using the comparison results to recommend methods that can implement the described feature.
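One simple way to compare a request with API descriptions is bag-of-words cosine similarity, sketched below. This is an assumed stand-in for the paper's feature extraction and ranking, and the API names and descriptions are made up for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical API descriptions; real documentation text would be used in practice.
api_docs = {
    "File.delete": "delete the file or directory",
    "File.mkdir": "create the directory",
    "String.trim": "remove leading and trailing whitespace",
}


def recommend(request, api_docs, k=2):
    """Rank API methods by similarity between the request and their descriptions."""
    query = vectorize(request)
    scored = [(cosine(query, vectorize(doc)), name) for name, doc in api_docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


result = recommend("allow users to delete a directory", api_docs)
```

Methods whose descriptions share no words with the request are filtered out, so `String.trim` is not recommended for this request.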

Fatead commented 5 years ago

Pham H V, Vu P M, Nguyen T T. Learning API usages from bytecode: a statistical approach. In: Proceedings of the 38th International Conference on Software Engineering, 2016, 416-427.

Proposes HAPI, which extracts API call sequences from application bytecode and uses hidden Markov models for statistical code recommendation. One hidden Markov chain represents an API call sequence involving multiple API objects, and API recommendation for mobile app development is achieved through the state transitions of the Markov chain.
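To make the role of state transitions concrete, here is a small sketch that predicts the next API call from a hand-written hidden Markov model using the forward algorithm. The states, the probabilities, and the API names are invented for illustration; in HAPI the model is learned from bytecode rather than written by hand.

```python
# Hypothetical three-state HMM for a stream-like object; all numbers are made up.
states = ["start", "active", "end"]
start = {"start": 1.0, "active": 0.0, "end": 0.0}
trans = {
    "start": {"active": 1.0},
    "active": {"active": 0.6, "end": 0.4},
    "end": {"end": 1.0},
}
emit = {
    "start": {"open": 1.0},
    "active": {"read": 0.75, "write": 0.25},
    "end": {"close": 1.0},
}


def forward(observations, states, start, trans, emit):
    """Forward algorithm: alpha[s] = P(observations so far, current state = s)."""
    alpha = {s: start.get(s, 0.0) * emit[s].get(observations[0], 0.0) for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit[s].get(obs, 0.0) * sum(alpha[p] * trans[p].get(s, 0.0) for p in states)
            for s in states
        }
    return alpha


def next_call_distribution(observations, states, start, trans, emit):
    """P(next API call | calls observed so far), via one more transition step."""
    alpha = forward(observations, states, start, trans, emit)
    total = sum(alpha.values())
    if total == 0:
        return {}  # the observed sequence is impossible under this model
    dist = {}
    for s in states:
        reach = sum(alpha[p] * trans[p].get(s, 0.0) for p in states) / total
        for call, p in emit[s].items():
            dist[call] = dist.get(call, 0.0) + reach * p
    return dist


dist = next_call_distribution(["open", "read"], states, start, trans, emit)
ranked = sorted(dist, key=lambda call: -dist[call])
```

After `open` and `read`, this toy model ranks `read` ahead of `close` and `write` as the next call.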

coderZMR / CodeRecommendSynthesis

Statistical Models #4