hiroki13 / span-based-srl


!!! Important --- Elmo Model and F1 score #9

Open MrSingh-bytes opened 2 years ago

MrSingh-bytes commented 2 years ago

Hello Everyone,

  1. I don't have the CoNLL-2005 data, so I used the CoNLL-formatted OntoNotes 5.0 data as CoNLL-2012, and training completed. However, I only get an F1 score of around 46% with the "Senna" embeddings, which is very low compared to the results reported in the paper. I am using a MacBook Air M1.

    - EPOCH-1       BEST VALID   9.29%
    - EPOCH-2       BEST VALID  13.50%
    - EPOCH-3       BEST VALID  22.67%
    - EPOCH-6       BEST VALID  25.98%
    - EPOCH-8       BEST VALID  30.08%
    - EPOCH-10      BEST VALID  33.26%
    - EPOCH-11      BEST VALID  37.21%
    - EPOCH-15      BEST VALID  38.16%
    - EPOCH-20      BEST VALID  42.09%
    - EPOCH-36      BEST VALID  43.60%
    - EPOCH-41      BEST VALID  43.83%
    - EPOCH-46      BEST VALID  45.48%
    - EPOCH-69      BEST VALID  45.73%
    - EPOCH-75      BEST VALID  45.89%
    - EPOCH-99      BEST VALID  46.15%

Could you please let me know why I am getting such a low score?

  2. I also don't understand how to obtain the ELMo embeddings. I downloaded the following files:

wget https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json

wget https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5

But how do I get the files for "--train_elmo_emb path/to/elmo.conll2005.train.hdf5 --dev_elmo_emb path/to/elmo.conll2005.dev.hdf5"? I only have the options and weights files. What is the process for producing elmo.conll2005.train.hdf5 and elmo.conll2005.dev.hdf5?
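As a possible first step toward those files, here is a minimal sketch that pulls the tokenized sentences out of CoNLL-formatted lines and writes them one sentence per line. The function names `conll_to_sentences` and `write_sentences` are hypothetical, and the default `word_column=3` assumes the CoNLL-2012 column layout (document ID, part number, word number, word); CoNLL-2005 files would need a different index. Whether the resulting embeddings match what this repository expects is also an assumption to check against its README.

```python
def conll_to_sentences(lines, word_column=3):
    """Group CoNLL rows into sentences (lists of tokens).

    word_column=3 is an assumption based on the CoNLL-2012 layout;
    adjust it for other formats.
    """
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        # Blank lines and document markers end the current sentence.
        if not line or line.startswith(("#begin document", "#end document")):
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(line.split()[word_column])
    if current:
        sentences.append(current)
    return sentences


def write_sentences(sentences, out_path):
    """Write one whitespace-tokenized sentence per line."""
    with open(out_path, "w", encoding="utf-8") as out:
        for tokens in sentences:
            out.write(" ".join(tokens) + "\n")
```

A one-sentence-per-line text file like this is the kind of input that ELMo embedding tools usually take. Older AllenNLP releases (before 1.0) shipped an `allennlp elmo` command that reads such a file together with the options and weights files and writes an HDF5 file; treat that as a pointer to verify, not as the confirmed procedure for this repository.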

  3. Do we need to merge all the training files in the CoNLL-formatted OntoNotes 5.0 data? They are in *.gold_conll format. For example:

.train
└── data
    └── english
        └── annotations
            ├── bc
            │   ├── cctv
            │   │   └── 00 ------> All files in this folder ; should I merge all files in these folder?
            │   ├── cnn
            │   │   └── 00
            │   ├── msnbc
            │   │   └── 00 ------> All files in this folder ; should I merge all files in these folder?
            │   ├── p2.5_a2e
            │   │   └── 00
            │   ├── p2.5_c2e
            │   │   └── 00
            │   └── phoenix
            │       └── 00
            ├── bn
            │   ├── abc
            │   │   └── 00
            │   ├── cnn
            │   │   ├── 00
            │   │   ├── 01
            │   │   ├── 02
            │   │   ├── 03
            │   │   └── 04
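If merging does turn out to be necessary, a minimal sketch of concatenating every file under a directory tree like the one above could look like this. The function name `merge_gold_conll` and the output file name are hypothetical, and the `*.gold_conll` pattern is an assumption about how the files are named; it is not confirmed that the training script expects a single merged file.

```python
from pathlib import Path


def merge_gold_conll(root, out_path, pattern="*.gold_conll"):
    """Concatenate all files under root that match pattern into out_path.

    Returns the number of files merged. Files are visited in sorted
    path order so the result is reproducible.
    """
    files = sorted(Path(root).rglob(pattern))
    with open(out_path, "w", encoding="utf-8") as out:
        for path in files:
            text = path.read_text(encoding="utf-8")
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")
            # Blank line so sentences from adjacent files do not run together.
            out.write("\n")
    return len(files)


# Example call (paths are placeholders, not taken from the repository):
# merge_gold_conll("train/data/english/annotations", "train.merged.conll")
```

Walking the tree recursively avoids having to list each `00`, `01`, ... folder by hand, and the extra blank line between files keeps sentence boundaries intact in the merged output.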

Thanks in Advance !!! 👍