Distinguishing articles that solve the NCD problem against other related problems

ColinTr commented 2 years ago

After reviewing most of the articles in this repository, I found that some of them do not solve the NCD problem (i.e., given a labeled set of known classes and an unlabeled set of different but related classes, discover the classes in the unlabeled set).

Some examples include:

The Generalized Novel Category Discovery setting: the main difference from NCD is that, at inference, the unlabeled set can contain both known and unknown classes. This includes the following articles:
- "Generalized Category Discovery".
- "Divide and Conquer: Compositional Experts for Generalized Novel Class Discovery".
- "Towards Open-Set Object Detection and Discovery".
- "OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning".
In "Class-incremental Novel Class Discovery", the authors consider a scenario where the labeled data is no longer available after the pre-training stage, and they still care about good performance on the known classes. Roughly the same setup is found in "Novel Class Discovery without Forgetting".
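To make the distinction concrete, here is a rough sketch of how these settings differ in terms of class sets. The function name, its arguments, and the returned strings are hypothetical and only meant as an illustration; this is not code from any of the articles, and the class-incremental case is simplified to "labeled data no longer available".

```python
def describe_setting(known_classes, unlabeled_classes, labeled_data_available=True):
    """Name the setting implied by the class sets (illustrative assumption only).

    known_classes: classes present in the labeled set.
    unlabeled_classes: classes actually present in the unlabeled set.
    labeled_data_available: whether the labeled data can still be used
        while discovering the novel classes.
    """
    known = set(known_classes)
    unlabeled = set(unlabeled_classes)
    novel = unlabeled - known  # classes that would have to be discovered

    if not novel:
        return "no novel classes"
    if not labeled_data_available:
        # Simplification: labeled data is gone after pre-training.
        return "class-incremental NCD"
    if unlabeled & known:
        # The unlabeled set mixes known and unknown classes.
        return "GCD"
    # The unlabeled set contains only classes disjoint from the known ones.
    return "NCD"


print(describe_setting({"cat", "dog"}, {"fox", "wolf"}))
print(describe_setting({"cat", "dog"}, {"dog", "fox"}))
print(describe_setting({"cat", "dog"}, {"fox"}, labeled_data_available=False))
```

The only point of the sketch is that the settings differ in what the unlabeled set may contain and in what data is still accessible, which is the kind of information a tag could capture.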

While all these articles solve interesting problems, I think it would be beneficial to have a clear objective in this repository, so that it does not become a list of articles that tackle Open World Learning problems.

One solution could be to add tags to the articles. Because the field is young and evolving, many works define new settings with different names, and the differences between them can be unclear…

ColinTr commented 2 years ago

Also, "Neural network-based clustering using pairwise constraints" does not solve an NCD problem at all. The authors use the ground-truth class labels to define pseudo-labels, and do not seek to discover novel classes. @JosephKJ what are your thoughts?

JosephKJ commented 2 years ago

Hi @ColinTr, Thank you very much for sharing your analysis.

Kindly allow me to share my thoughts on the same: As you have rightly pointed out, NCD is nascent and young. As we evolve as a community, it is natural for us to relax the assumptions that we initially started off with. For instance, it is illogical to assume that an NCD model would be evaluated only on the novel classes at inference -- this brings in the need for "Generalized Novel Category Discovery". Further, while discovering novel classes, wouldn't it be better to discover classes in the unlabeled pool without access to the labeled data? This motivates "Novel Class Discovery without Forgetting" and "Class-incremental Novel Class Discovery".

"Neural network-based clustering using pairwise constraints" is important because "Learning to cluster in order to transfer across domains and tasks (ICLR 2018)" directly builds on top of this work.

I still feel that tagging each paper would be a very good way to organize the different works. Would you have the bandwidth to submit a pull-request with the same?

Sorry for my slow response. These days are pretty hectic. Thanks in advance, and looking forward to hearing from you.

ColinTr commented 2 years ago

I have created a pull request where I added tags in front of each article, along with a description of the different settings at the bottom of the document.

Also, the entry for the preprint "Demystifying Assumptions in Learning to Discover Novel Classes" links to a different, already existing article. Should it be removed?

Apologies for the slow response as well; this was a lot of work that I did bit by bit.

JosephKJ / Awesome-Novel-Class-Discovery

Distinguishing articles that solve the NCD problem against other related problems #11