Open vmarkovtsev opened 6 years ago
Good idea!
What are the criteria for labeling something "beginner"?
Maybe papers that require only basic knowledge of ML math and a medium or lower understanding of ML tech?
@marnovo @osanwe It should be friendly to people who have just started exploring MLonCode or do not want to spend much time in order to understand the paper.
That is hard to formalize; whenever someone recommends that we add or remove this label, we carefully consider doing so.
I really like the idea of adding the conference where a paper was published. Not all of the papers are of high quality, and papers published at top-tier conferences are likely to be of better quality than those at lower-tier conferences.
There is another proposal then: remove the papers which are considered not awesome enough since this list is "awesome". There is no goal to catch them all.
@vmarkovtsev agreed it's hard to formalize, but would be helpful to have some yardstick heuristics to standardize the process (and the outcome).
Another idea that could maybe be easier to implement and sounds less judgemental: in the same way that GitHub introduced the "good first issue" label to help beginners find their way when contributing to a project, we could, instead of "beginner", mark papers as "good first read" or "good intro paper".
@bdqnghi agreed with Vadim. This sounds like a different proposal, I'd invite you to open it as a new separate issue so we can discuss it over there. Thanks!
@marnovo "beginner" is exactly a shorter "good first read". The latter is too long and takes up too much space, so @campoy and I decided to name it "beginner".
@vmarkovtsev,
I think, for example, the paper "A Survey of Machine Learning for Big Code and Naturalness" by Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, Charles Sutton is a very nice introduction to MLonCode, but it is a bad idea to mark this paper with the "beginner" label.
Agreed
Why is "beginner" a bad label for the paper?
Beginner doesn't imply it's a bad paper, but that it's a great place for you to begin reading on the topic.
@campoy I believe the point is that the paper is far from "easy", even though it might be very good quality or a good introduction to the topic.
To give a bit more color on what I mentioned previously: as much as I don't think "beginner" is terrible, it seems way more judgemental than "good-first" or "good-intro", for instance.
E.g.: What does it mean, really? Does "beginner" mean you're a beginner in ML, in source code analysis, or in MLonCode… or all of them, any of them? If someone reads a paper marked for "beginners" and is barely able to understand it (as seems to be the case for most readers of the aforementioned paper), how should they feel? Bad?
In the end you have different dimensions to judge a paper on here. E.g.:
All this considering the myriad of profiles of people that come to the repo… so being a good intro to ML on Code doesn't mean a paper is easy, and the other way around. This is why I'd rather have the concept better scoped and defined, so it is more consistent and readers know what to expect; maybe even have more than one label if we eventually need to.
We should rather add a second label, "intro", which is easy to assign, and document what "beginner" means, because even our PM thinks that it tries to judge, while it absolutely does not :)
"beginner" does not take into account quality (bad-quality papers are not part of an awesome list), topic, or suitability (all the papers must be suitable to MLonCode, otherwise we need to delete them). Only the last point holds. And it is by definition very subjective, so until we've got active voting users we will continue assigning "beginner" based on our complex internal feelings and emotional biases.
My complex internal feelings and emotional biases don't care that much about what label we use, tbh. Either "beginner" or "intro" works; I will not push one way or the other.
Deal.
Decision made: change the "beginner" label to "introduction"
Furthermore, if I may, I'd like to add the conferences the papers are submitted to. I like this info because it always gives me a quick insight about the paper's quality/style