emeryberger / CSrankings

A web app for ranking computer science departments according to their research output in selective venues, and for finding active faculty across a wide range of areas.
http://csrankings.org

Please please add ICLR #4683

Closed wpzdm closed 1 year ago

wpzdm commented 2 years ago

Hi,

CSrankings is a great tool and helped a lot when I was searching for my PhD position. I had already noticed back then (c. 2017~2018) that ICLR was missing. Now I'd like to use it for my postdoc search, but ICLR is still missing! I think this is a major drawback as of 2022, and for now I prefer http://csmetrics.org/ for my purposes. As mentioned in many other issues (e.g., #4442, #3518, #240), ICLR, along with NeurIPS and ICML, is a top conference in AI & ML, and it even has a higher h5-index than the other two! I really hope this can be fixed for such a great tool... There is now a PR (https://github.com/emeryberger/CSrankings/issues/4442#issuecomment-1094374606) for this issue, and I think we should consider it ASAP.

Best, Abel

PS: I also think "Machine learning and data mining should be separate categories" (#240, #238).

jaehong31 commented 2 years ago

Totally agreed.

mbrubake commented 2 years ago

Agreed. According to the 2022 Google Scholar Metrics, ICLR is now 8th overall and outranks NeurIPS at 9th: https://scholar.google.com/citations?view_op=top_venues. Not including ICLR in these rankings makes no sense.

The larger issue here is that field definitions and data updates appear to be bottlenecked through one person, @emeryberger. If CSRankings is to be sustainable and remain relevant, it needs to distribute the responsibility and workload, for instance by delegating data-update approvals to a small number of people from relevant regions, and by creating small "working groups" of people who would collectively add or remove conferences and journals from consideration.

Junjie-Chu commented 2 years ago

ICLR is of course a top conference; its absence from CSrankings is strange.

Ben5000 commented 2 years ago

I'm just restating here my opinion about the justifications given for the claim that ICLR is a "top" conference that "should" be included on CSRankings.

1) Google Scholar does not provide parameters from which one can conclude whether a conference is "top" or not. It is unclear whether ICLR is considered as "top" as, say, ICML.

2) Citation counts depend on the absolute number of people (and papers) in a given field. Thus, the bigger a field is, the more citations each paper receives on average (normally; that is, unless citations are completely random, which they probably are not). Therefore, absolute citation counts cannot serve as a reliable proxy for relative "prestige", "importance", or being "top". They do show popularity, as you've correctly mentioned. But popularity is not the same as prestige or importance.

For instance, if a field (e.g., AI) is huge in comparison to another field X, then even the least important and prestigious conference in AI would normally have a bigger citation count than the top venue in X. But it would not be reasonable to include it here.
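To make the size effect concrete, here is a minimal Python sketch of a toy model (the citation distribution and all numbers are made up, not real data): two venues whose papers follow an identical per-paper citation distribution end up with very different h-indices purely because one publishes more papers.

```python
import random

def h_index(citation_counts):
    # Largest h such that at least h papers have >= h citations.
    counts = sorted(citation_counts, reverse=True)
    h = 0
    while h < len(counts) and counts[h] >= h + 1:
        h += 1
    return h

def simulate_venue(num_papers, seed):
    # Draw per-paper citation counts from the same heavy-tailed
    # distribution, i.e. identical per-paper "quality" for both venues.
    rng = random.Random(seed)
    return [int(10 * rng.paretovariate(1.5)) for _ in range(num_papers)]

small = simulate_venue(200, seed=1)    # selective venue, 200 papers/year
large = simulate_venue(5000, seed=2)   # mega-venue, 5000 papers/year
print(h_index(small), h_index(large))  # the larger venue wins on h-index alone
```

The larger venue's h-index comes out noticeably higher even though per-paper citations are drawn from the same distribution, which is exactly why raw citation metrics conflate size with prestige.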

3) CSRankings explicitly states in its About page that its ranking deliberately diverges from the (supposedly erroneous) methodology of ranking by citation count, because that measure is prone to over-representing islands of self-citation cliques, for example.

mbrubake commented 2 years ago

Arguments for ICLR being included:

The specific arguments aside, the larger issue here is that there are no obvious or objective standards by which a conference is included, nor any well-defined process by which one can be added.

Ben5000 commented 2 years ago

A quick rebuttal of the arguments above:

mbrubake commented 2 years ago

Most would likely agree that ML has three top conferences at this point: NeurIPS, ICLR, and ICML. This is in line with, for instance, computer vision, which also has three: ICCV, ECCV, and CVPR. Why KDD (and data mining) is lumped in with NeurIPS and ICML is beyond me. Nothing against KDD, but data mining is generally a distinct field and community these days, perhaps best grouped with information retrieval.

Yes, some of my arguments above are based on opinion and are hard to check. But you've rejected the easy-to-check, metric-based arguments. Note that I wasn't arguing that ICLR should be included because it has X citations and a Y h5-index, but rather that, because ICLR is now consistently in the same range as the others, it should be included, or at least a coherent reason should be given for why it isn't.

It seems at this point that the entire selection criterion is based on opinion. It is ironic that the goal of CSrankings is to be "entirely metrics-based" and transparent, yet this fundamental aspect appears, by your own argument, to be decidedly non-metric-based.

I have attended ML conferences and interacted with members of that community for 18 years. I talk with members of the community on a regular basis, both in academia and industry. But again, no one has to take my opinion for it, and probably shouldn't; that's fine. Set up per-area advisory panels, send out community surveys, define clear metric-based criteria (a sketch of what such a rule could look like is below), or something. The lack of a process or any clear criteria is disheartening. I have definitely considered forking the project and establishing my own ranking, but honestly I just don't want to: the amount of work is high and the payoff for yet another ranking is low. Who knows, maybe I will some day, but I hope it won't come to that. I'd rather work with others to improve this one...
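For what it's worth, here is a purely hypothetical Python sketch of what a transparent, metric-based inclusion rule could look like; the rule itself, the 75% threshold, and all the h5 numbers are made up for illustration and are in no way CSrankings policy.

```python
# Hypothetical rule: include a venue in an area if its h5-index is at
# least 75% of the area's current top venue. All h5 values below are
# placeholders, not real Google Scholar data.
AREA_H5 = {
    "machine-learning": {"NeurIPS": 300, "ICLR": 280, "ICML": 250, "VenueX": 90},
}

def included_venues(area, threshold=0.75):
    venues = AREA_H5[area]
    top = max(venues.values())
    return sorted(name for name, h5 in venues.items() if h5 >= threshold * top)

print(included_venues("machine-learning"))  # ['ICLR', 'ICML', 'NeurIPS']
```

Whatever the actual rule ends up being, the point is that it would be written down, checkable by anyone, and applied uniformly.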

jaehong31 commented 2 years ago

Why not simply poll CSrankings users to gather their opinions on whether adding ICLR publications to the metric would be valuable, and let them shape the metrics according to their tastes? Via tweets or some other way, lol.

Ben5000 commented 2 years ago

Regarding voting on which conferences should be included here: it seems unreasonable, since that way the least selective conferences, accepting the largest number of papers with the lowest acceptance threshold, would normally be voted in. Participants of Every-Day Conference B, with 10,000 papers a year, would easily outvote participants of Extremely-Prestigious Conference A, with 200 papers a year.

Note that CSRankings is not perfect; no ranking is. It is what it is: a ranking by cumulative number of papers appearing in a list of conferences selected by "leaders in the field" through some sort of "feedback or poll". It exists precisely to fix the problem with other rankings, which rely on all sorts of non-transparent "citation counts" and "impact factors" that may be even less reliable than asking "leaders in the field" about important conferences.