tech-srl / code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
https://code2vec.org
MIT License

bias-variance tradeoff #175

Open Avv22 opened 1 year ago

Avv22 commented 1 year ago

Hello Code2Vec Team,

You mention in your code2vec paper that one of the main challenges in your work is how to decompose a program into smaller building blocks that are:

  1. large enough to be meaningful, and
  2. small enough to repeat across programs.

You then describe these two competing requirements as a bias-variance tradeoff. Can you please explain this idea in more detail?
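For readers following along: the decomposition the paper settles on is AST path-contexts, i.e. (leaf token, syntactic path, leaf token) triples. The toy sketch below (my own illustration using Python's `ast` module, not the repository's actual Java path extractor) shows the idea: each pair of AST leaves, joined by the path through their lowest common ancestor, becomes one reusable building block.

```python
# Toy illustration of AST path-contexts (not the actual code2vec extractor).
import ast
import itertools

def leaves_with_paths(node, path=()):
    """Yield (leaf_token, root-to-leaf path of AST node types)."""
    path = path + (type(node).__name__,)
    if isinstance(node, ast.Name):
        yield node.id, path
    elif isinstance(node, ast.Constant):
        yield repr(node.value), path
    for child in ast.iter_child_nodes(node):
        yield from leaves_with_paths(child, path)

def join_at_lca(path_a, path_b):
    # Skip the shared prefix to find the lowest common ancestor,
    # then go up from leaf a ('^') and down to leaf b ('v').
    i = 0
    while i < min(len(path_a), len(path_b)) and path_a[i] == path_b[i]:
        i += 1
    up = "^".join(reversed(path_a[i:]))
    down = "v".join(path_b[i:])
    return up + "^" + path_a[i - 1] + "v" + down

def path_contexts(source):
    """All (token, path, token) triples over pairs of AST leaves."""
    leaves = list(leaves_with_paths(ast.parse(source)))
    return [(tok_a, join_at_lca(pa, pb), tok_b)
            for (tok_a, pa), (tok_b, pb) in itertools.combinations(leaves, 2)]

ctxs = path_contexts("x = y + 1")
# -> [('x', 'Name^AssignvBinOpvName', 'y'),
#     ('x', 'Name^AssignvBinOpvConstant', '1'),
#     ('y', 'Name^BinOpvConstant', '1')]
```

Each triple is small enough to recur across many programs (the path `Name^AssignvBinOpvName` appears in countless assignments) yet large enough to carry syntactic meaning, which is exactly the tension the question is about.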

urialon commented 1 year ago

Hi @Avv22 , Thank you for your interest in our work, and thank you for reading the paper carefully. This is a great question!

The bias-variance trade-off is a general concept in machine learning that expresses the central difficulty of designing features or representations for machine learning models.

Features that are too specific may describe the training data very well but cause overfitting; features that are too general or simple may occur across many examples but be insufficiently expressive.

For more information, see the Wikipedia article for example: https://en.m.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff

Best, Uri