Please please add ICLR - Githubissues

emeryberger / CSrankings

A web app for ranking computer science departments according to their research output in selective venues, and for finding active faculty across a wide range of areas.

http://csrankings.org

Other

2.69k stars 3.17k forks source link

Please please add ICLR #4683

Closed wpzdm closed 1 year ago

wpzdm commented 2 years ago

Hi,

CSranknigs is a great tool and helped a lot when finding my PhD. I had already noticed there was no ICLR back then (c. 2017~2018). Now, I'd like to use it for my post-doc finding, but ICLR is still missing! I think this is a major drawback as of 2022, and I prefer http://csmetrics.org/ for my purpose now. As mentioned in the many other issues (e.g., #4442, #3518, #240), ICLR, among Neurips and ICML, is a top conference in AI&ML, and even has a higher h5-index than the other two! I really hope this problem could be fixed for such a great tool...Now, there is a PR (https://github.com/emeryberger/CSrankings/issues/4442#issuecomment-1094374606) for this issue, and I think we should consider it ASAP.

Best, Abel

PS: I also think "Machine learning and data mining should be separate categories" (#240, #238).

jaehong31 commented 2 years ago

Totally agreed.

mbrubake commented 2 years ago

Agreed. According to the 2022 Google Scholar Metrics, ICLR is now 8th overall and out ranks NeurIPS at 9th: https://scholar.google.com/citations?view_op=top_venues Not including ICLR in these rankings makes no sense.

The larger issue here is that field definitions and data updates appear to be all bottlenecked through one person, @emeryberger. If CSRankings is to be sustainable and remain relevant it needs to distribute the responsibility/workload. For instance, delegating data update approvals to a small number of people from relevant regions and creating small "working groups" of people who would collectively add or remove conferences and journals from consideration.

Junjie-Chu commented 2 years ago

ICLR is of course a top conference, it is weird the lack of ICLR in CS ranking.

Ben5000 commented 2 years ago

I'm just repeating here my opinion about the justifications given for the claim that ICLR is a "top" conference that "should" be included on CSRankings.

1) Google Scholar does not provide parameters to conclude whether a conference is considered "top" or not. It is unclear whether ICLR is considered "top" as much as ICML for instance.

2) Citation count depends on the absolute number of people (and papers) in a given field. Thus, as a field is bigger, each paper will have on average bigger number of citation (normally; that is, unless citations are completely random, which they are probably not). Therefore, absolute values of citation count cannot serve as a reliable approximation to relative "prestige" and "importance" and "being top". It does show popularity, as you've correctly mentioned. But popularity is not the same as "prestige" or "importance".

For instance, if a field (e.g., AI) is huge in comparison to another field X, then even the least important and prestigious conference in AI would normally have a bigger citation count compared to the top venue in X. But it will not be reasonable to include it here.

3) CSRankings explicitly stated in its About essay that its own ranking diverges from the supposedly erroneous methodology of ranking based on citation count, due to the fact that this measure is prone to over-represent islands of self-citation cliques, for example.

mbrubake commented 2 years ago

Arguments for ICLR being included:

Based on Google Scholar rankings based on both h5-index and h5-median (https://scholar.google.ca/citations?view_op=top_venues&hl=en&vq=eng) ICLR currently out ranks (3rd) all other included machine learning venues included in CSrankings: NeurIPS (4th) and ICML (6th). H5 isn't a perfect metric and we all know the problems with citations as a metric but you can't argue "different fields" as the Venn diagram of attendees/authors at NeurIPS, ICML and ICLR have nearly 90% overlap. You also can't argue "citation islands" as ICLR papers are widely cited in places like NeurIPS, CVPR, etc.
The community of people that publish at ICLR is basically the same as NeurIPS/ICML. The founders and organizers of ICLR are high profile, well known members of the ML community. Yann LeCun, Yoshua Bengio, among many others.
Prestige and importance are far from objectively measurable criteria. However for almost any reasonable definition of them, ICLR would qualify. It is important in that members of the ML community generally are expected to follow the work published there. Failure to cite or compare against a paper that was published at a recent ICLR would be a problem for a submission at NeurIPS or ICML. It is prestigious in that it is well attended by high profile members of the community and acceptance is competitive with an acceptance rate of around 30%.
It is not as "popular" as NeurIPS (e.g., ~3000 submissions at ICLR vs ~10,000 at NeurIPS) but as you point out "popularity" is not the same as prestige or importance.

The specific arguments aside, the larger issue here is that there is not any obvious or objective standards by which a conference is included nor any well defined process by which one can be added.

Ben5000 commented 2 years ago

A quick rebuttal of the arguments above:

Citation count is explicitly not the parameter by which CSRankings works. It is stated explicitly in its aim of existence. So I am not convinced that any argument involving citations consideration is legitimate here. Thus, the first argument and all arguments above seem irrelevant.
The claim that we cannot argue about "citation islands", because ICLR specifically is not a "citation island" is irrelevant (assuming this statement is correct): because "citation island" is a justification to generally disqualify citations as correlating to "prestige", therefore, we cannot change the general criteria only in one case, that is, ICLR, while for the other fields continue not to account for citations.
The argument that the "Venn diagram of attendees/authors at NeurIPS, ICML and ICLR have nearly 90% overlap" can only serve to strengthen the argument against including ICLR: it would be unreasonable to include both venues given that, for instance, other fields use only one or two top conference, given a fixed community, roughly.
Some other arguments raised here about the "prestige" seems to be opinion-based, and are based on statements that are hard to check; not that I argue they are incorrect. I simply don't know.
Regrading the process by which this website is run, this may be correct, but everyone is welcome as far as I see to open their own branch and design the metric differently. This would be interesting and was already done if I'm not mistaken.

mbrubake commented 2 years ago

Most would likely agree that ML has three top conferences at this point: NeurIPS, ICLR and ICML. This is in line with, for instance, computer vision which also has three: ICCV, ECCV and CVPR. Why KDD (and data mining) is lumped in with NeurIPS and ICML is beyond me. Nothing against KDD, but datamining is generally a distinct field and community these days, perhaps best included with information retrieval.

Yes, some of the my arguments above are based on opinion and hard to check. But you've rejected the easy to check, metric-based arguments. Note that I wasn't arguing that because ICLR has X citations and a Y h5-index it should be included, but rather that because ICLR is now consistently in the same range as others that it should be included or at least a coherent reason should be provided for why it isn't.

It seems at this point the entire selection criteria is based on opinion. Ironic that the goal of CSrankings is to be "entirely metrics-based" and transparent. Yet this is a fundamental aspect and yet appears by your argument to be extremely non-metric based.

I have attended ML conferences and interacted with members of that community for 18 years. I talk with members of the community on a regular basis both in academia and industry. But again, no one has to take my opinion for it and probably shouldn't, that's fine. Setup per-area advisory panels, send out community surveys, define clear metric-based criteria, or something. The lack of a process or any clear criteria is disheartening. I have definitely considered branching it and establishing my own ranking but honestly I just don't want to. The amount of work is high and the payoff for another metric is low. Who knows, maybe I will some day but I hope that it wouldn't have to come to that. I'd rather try to work with others to improve this one...

jaehong31 commented 2 years ago

Why don't you simply vote for CSRanking users to gather their opinions on whether or not they agree that adding the metric of ICLR publications will be valuable and help them to design their metrics according to their tastes? by using twits or some other ways lol.

Ben5000 commented 2 years ago

Regarding voting about which conference should be included here: it seems unreasonable, since this way the most unselective conferences, accepting the largest amount of papers, with the lowest acceptance threshold, would normally be voted in. Because, participants of Every-Day Conference B with 10000 papers a year easily would outvote participants of Extremely-Prestigious Conference A with 200 papers a year.

Note that CSRankings is not perfect, like any other ranking. It is what it is: a ranking of accumulative number of papers appearing in conferences of which list was selected by "leaders in the field" through some sort of "feedback or poll". It precisely comes to fix the problem with other rankings that considered all sort of non-transparent and "citation counts" and "impact factors", which may be even less reliable than asking around "leaders in the field about important conferences".