YujiaBao / Distributional-Signatures

"Few-shot Text Classification with Distributional Signatures" ICLR 2020
https://arxiv.org/abs/1908.06039
MIT License

Bert+Proto issue #36

Closed ShawnTing1 closed 2 years ago

ShawnTing1 commented 2 years ago

I ran 5-way 1-shot and 5-way 5-shot classification on HuffPost and FewRel using BERT with classifier=proto. However, the results I obtained are quite different from those in Table 2. Is this a known problem? If possible, could you share the commands for the two datasets when classifier=proto and bert=true? I would like to know what went wrong to cause this result.

```
Parameters:
    AUXILIARY=[] BERT=True BERT_CACHE_DIR=~/.pytorch_pretrained_bert/
    CLASSIFIER=proto CLIP_GRAD=None CNN_FILTER_SIZES=[3, 4, 5]
    CNN_NUM_FILTERS=50 CUDA=0 DATA_PATH=data/huffpost_bert_uncase.json
    DATASET=huffpost DROPOUT=0.1 EMBEDDING=cnn FINETUNE_EBD=False
    FINETUNE_EPISODES=10 FINETUNE_LOSS_TYPE=softmax FINETUNE_MAXEPOCHS=5000
    FINETUNE_SPLIT=0.8 INDUCT_ATT_DIM=64 INDUCT_HIDDEN_DIM=100 INDUCT_ITER=3
    INDUCT_RNN_DIM=128 LR=0.001 LRD2_NUM_ITERS=5 MAML=False MODE=train
    N_TEST_CLASS=16 N_TRAIN_CLASS=20 N_VAL_CLASS=5 N_WORKERS=10 NOTQDM=False
    PATIENCE=20 PRETRAINED_BERT=bert-base-uncased PROTO_HIDDEN=[300, 300]
    QUERY=25 RESULT_PATH= SAVE=False SEED=330 SHOT=1 SNAPSHOT=
    TEST_EPISODES=1000 TRAIN_EPISODES=100 TRAIN_EPOCHS=1000 VAL_EPISODES=100
    WAY=5 WORD_VECTOR=wiki.en.vec WV_PATH=./
```
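For reference, the parameter dump above corresponds to a launch command roughly like the following. The flag names are inferred by lowercasing the printed parameter keys; they are an assumption, so check the argparse definitions in the repo's `src/main.py` for the exact spelling.

```shell
python src/main.py \
    --cuda 0 \
    --way 5 \
    --shot 1 \
    --query 25 \
    --mode train \
    --embedding cnn \
    --classifier proto \
    --dataset huffpost \
    --data_path data/huffpost_bert_uncase.json \
    --n_train_class 20 \
    --n_val_class 5 \
    --n_test_class 16 \
    --bert \
    --pretrained_bert bert-base-uncased
```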

```
(Credit: Maija Haavisto)                        /
                             _,.------....___,.' ',.-.
                          ,-'          _,.--'        |
                        ,'         _.-'              .
                       /   ,     ,'                   `
                      .   /     /                     ``.
                      |  |     .                       \.\
            ____      |___._.  |       __               \ `.
          .'    `---''       ``'-.--''`  \               .  \
         .  ,            __               `              |   .
         `,'         ,-''  .               \             |    L
        ,'          '    _.'                -._          /    |
       ,`-.    ,'.   `--'                      >.      ,'     |
      . .'\'   `-'       __    ,  ,-.         /  `.__.-      ,'
      ||:, .           ,'  ;  /  / \ `        `.    .      .'/
      j|:D  \          `--'  ' ,'_  . .         `.__, \   , /
     / L:_  |                 .  '' :_;                `.'.'
     .    '''                  ''''''                    V
      `.                                 .    `.   _,..  `
        `,_   .    .                _,-'/    .. `,'   __  `
         ) \`._        ___....----''  ,'   .'  \ |   '  \  .
        /   `. '`-.--''         _,' ,'     `---' |    `./  |
       .   _  `'''--.._____..--'   ,             '         |
       | .' `. `-.                /-.           /          ,
       | `._.'    `,_            ;  /         ,'          .
      .'          /| `-.        . ,'         ,           ,
      '-.__ __ _,','    '`-..___;-...__   ,.'\ ____.___.'
      `'^--'..'   '-`-^-''--    `-^-'`.'''''''`.,^.`.--' mh
```

```
22/09/30 08:49:09: Loading data
22/09/30 08:49:09: Class balance: {19: 900, 4: 900, 5: 900, 8: 900, 1: 900, 13: 900, 31: 900, 16: 900, 36: 900, 39: 900, 14: 900, 11: 900, 23: 900, 17: 900, 7: 900, 21: 900, 26: 900, 12: 900, 18: 900, 37: 900, 6: 900, 22: 900, 40: 900, 15: 900, 29: 900, 10: 900, 35: 900, 38: 900, 9: 900, 25: 900, 30: 900, 20: 900, 3: 900, 27: 900, 24: 900, 34: 900, 33: 900, 32: 900, 0: 900, 2: 900, 28: 900}
22/09/30 08:49:09: Avg len: 13.077235772357724
22/09/30 08:49:09: Loading word vectors
22/09/30 08:49:15: Total num. of words: 9376, word vector dimension: 300
22/09/30 08:49:15: Num. of out-of-vocabulary words (they are initialized to zeros): 1586
22/09/30 08:49:15: #train 18000, #val 4500, #test 14400
22/09/30 08:49:18, Building embedding
22/09/30 08:49:18, Loading pretrained bert
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias']
```

YujiaBao commented 2 years ago

Your command looks right to me. We believe the issue comes from the fact that the tokenization has changed. See https://github.com/YujiaBao/Distributional-Signatures/issues/32

ShawnTing1 commented 2 years ago

Thank you very much for your quick reply; I see now what caused the problem.