allenai / s2_fos

Apache License 2.0

`LABELS` in `s2_fos/__init__.py` is misleading. #14

narayanacharya6 closed this issue 1 year ago

narayanacharya6 commented 1 year ago

According to this field, the model's label space consists of the following labels:

>>> from s2_fos import LABELS
>>> sorted(list(set(LABELS)))
['Art',
 'Biology',
 'Business',
 'Chemistry',
 'Computer science',
 'Economics',
 'Education',
 'Engineering',
 'Environmental science',
 'Geography',
 'Geology',
 'History',
 'Law',
 'Linguistics',
 'Materials science',
 'Mathematics',
 'Medicine',
 'Philosophy',
 'Physics',
 'Political science',
 'Psychology',
 'Sociology']

But the model provided with the repo (loaded as shown in the README) outputs the following labels:

>>> from s2_fos import S2FOS
>>> data_dir = 'data/'
>>> s2ranker = S2FOS(data_dir)
>>> sorted(s2ranker._mlb.classes_)
['Agricultural and Food sciences',
 'Art',
 'Biology',
 'Business',
 'Chemistry',
 'Computer science',
 'Economics',
 'Education',
 'Engineering',
 'Environmental science',
 'Geography',
 'Geology',
 'History',
 'Law',
 'Linguistics',
 'Materials science',
 'Mathematics',
 'Medicine',
 'Philosophy',
 'Physics',
 'Political science',
 'Psychology',
 'Sociology']

Note that the model also outputs the label `Agricultural and Food sciences`, which is missing from `LABELS`.
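
To make the mismatch explicit, here is a minimal sketch that diffs the two label sets. The helper `label_mismatch` is a hypothetical name, not part of `s2_fos`, and the lists are copied by hand from the two outputs above so the snippet runs without the package or the model files.

```python
def label_mismatch(declared, actual):
    """Compare two label collections (hypothetical helper, not part of s2_fos).

    Returns a pair of sorted lists:
    labels the model emits that are not declared, and
    labels that are declared but never emitted by the model.
    """
    declared_set, actual_set = set(declared), set(actual)
    return sorted(actual_set - declared_set), sorted(declared_set - actual_set)


# Copied from the `sorted(list(set(LABELS)))` output above.
declared_labels = [
    "Art", "Biology", "Business", "Chemistry", "Computer science",
    "Economics", "Education", "Engineering", "Environmental science",
    "Geography", "Geology", "History", "Law", "Linguistics",
    "Materials science", "Mathematics", "Medicine", "Philosophy",
    "Physics", "Political science", "Psychology", "Sociology",
]

# Copied from the `sorted(s2ranker._mlb.classes_)` output above:
# the same labels plus one extra class.
model_classes = declared_labels + ["Agricultural and Food sciences"]

missing_from_labels, missing_from_model = label_mismatch(declared_labels, model_classes)
print("Output by the model but not in LABELS:", missing_from_labels)
print("In LABELS but not output by the model:", missing_from_model)
```

With the package installed and the model data downloaded, the same check should work by passing `LABELS` and `s2ranker._mlb.classes_` in place of the hand-copied lists.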

sergeyf commented 1 year ago

Oops, thanks for catching!