castorini / hedwig

PyTorch deep learning models for document classification
Apache License 2.0

Getting impossible predicted labels (all zeroes) from custom data #72

Open wailoktam opened 3 years ago

wailoktam commented 3 years ago

Hi, I created a dataset with the following categories:

classDict = {"text/dokujo-tsushin": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001", "text/it-life-hack": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000010", "text/kaden-channel": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000100", "text/livedoor-homme": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000001000", "text/movie-enter": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000010000", "text/peachy": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000100000", "text/smax": "000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000", "text/sports-watch": "000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000", "text/topic-news": "000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000"}

I made sure (using grep) that the train, dev and test TSV files contain only these strings of zeros and ones. However, every predicted label comes out as all zeros. Can you tell me what is likely to cause this strange result? Thanks.
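For reference, the grep check described above could also be done with a short script along these lines. This is only a sketch: it assumes each line of the TSV file is tab-separated with the label string in the first column, and `check_label_column` is a hypothetical helper written for this post, not part of hedwig.

```python
from collections import Counter


def check_label_column(path, allowed_labels, label_col=0):
    """Count the label strings in a TSV file and list suspicious lines.

    Assumption: fields are tab-separated and the label sits in column
    `label_col` (0 by default). Adjust if the files are laid out differently.
    """
    counts = Counter()
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            fields = line.rstrip("\n").split("\t")
            if len(fields) <= label_col:
                problems.append((line_no, "missing label column"))
                continue
            label = fields[label_col]
            counts[label] += 1
            if set(label) - {"0", "1"}:
                # Something other than 0/1 slipped into the label field.
                problems.append((line_no, "non-binary characters"))
            elif label.count("1") != 1:
                # All zeros, or more than one class set.
                problems.append((line_no, "not one-hot"))
            elif label not in allowed_labels:
                problems.append((line_no, "label not in allowed set"))
    return counts, problems
```

It could be called as `counts, problems = check_label_column("train.tsv", set(classDict.values()))`, where `train.tsv` stands in for the actual file name. An empty `problems` list means every line carries one of the expected labels, `counts` shows how the examples are distributed over the categories, and `{len(label) for label in counts}` gives the label widths found in the file.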