conda create -n sdnet python=3.8.5
conda activate sdnet
bash env.sh
Part of the pretraining data is in the pretrain_data folder; the file includes 200k instances.
Dataset is in the data folder:
data/DATASET/
├── test.json
├── kshot.json/full.json
└── mapping.json
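Before running anything, it can help to confirm that a dataset folder matches the layout above. The following is a minimal sketch, not part of the released code; the helper name missing_files and its train_file parameter are hypothetical.

```python
from pathlib import Path

def missing_files(dataset_dir, train_file="kshot.json"):
    """Return the expected file names that are absent from dataset_dir.

    train_file is either "kshot.json" or "full.json", as in the layout above.
    """
    expected = ["test.json", train_file, "mapping.json"]
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).is_file()]

# Replace DATASET with the actual dataset folder name.
print(missing_files("data/DATASET"))
```

An empty list means all three files are present.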
Instance format: Each instance is a Dict containing tokens and entity fields, in which tokens is the list of tokens and entity is the list of entity mentions.
{
"tokens": [token1,token2,...],
"entity": [
{"text":mention1, "type": type1, "offset":[startindex1,endindex1]},
{"text":mention2, "type": type2, "offset":[startindex2,endindex2]},
...
]
},
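The instance format can be checked with a small validator. This is a sketch under stated assumptions, not the repository's own loader: it assumes entity is a flat list of mention dicts (as the description says) and that offsets are integer token indices. Whether the end index is inclusive or exclusive is not stated here, so only a loose range check is made. The function name check_instance and the example sentence are made up for illustration.

```python
def check_instance(instance):
    """Return a list of problems found in one instance dict (empty if none)."""
    problems = []
    tokens = instance.get("tokens")
    if not isinstance(tokens, list):
        problems.append("'tokens' must be a list")
        tokens = []
    entities = instance.get("entity")
    if not isinstance(entities, list):
        problems.append("'entity' must be a list")
        entities = []
    for i, mention in enumerate(entities):
        if not isinstance(mention, dict):
            problems.append(f"entity[{i}] must be a dict")
            continue
        missing = {"text", "type", "offset"} - mention.keys()
        if missing:
            problems.append(f"entity[{i}] is missing {sorted(missing)}")
            continue
        offset = mention["offset"]
        is_pair = isinstance(offset, list) and len(offset) == 2
        if not (is_pair and all(isinstance(v, int) for v in offset)):
            problems.append(f"entity[{i}] offset must be [start, end] integers")
            continue
        start, end = offset
        # Assumption: offsets index into `tokens`. The end convention
        # (inclusive vs. exclusive) is unknown, so both are accepted.
        if not (0 <= start <= end <= len(tokens)):
            problems.append(f"entity[{i}] offset {offset} is out of range")
    return problems

# Illustrative instance; the tokens and types are invented.
example = {
    "tokens": ["Barack", "Obama", "visited", "Paris", "."],
    "entity": [
        {"text": "Barack Obama", "type": "person", "offset": [0, 2]},
        {"text": "Paris", "type": "location", "offset": [3, 4]},
    ],
}
print(check_instance(example))
```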
kshot.json/full.json: the data file for k-shot fine-tuning. Each line is a Dict containing support and target_label fields, in which support is the list of instances in the support set (the full training set in full.json) and target_label is the list of target novel entity types.
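Since each line of kshot.json/full.json is one Dict, the file can be read line by line. The sketch below assumes each line is a standalone JSON object; the function name read_episodes and the demo record are hypothetical and not taken from the released code.

```python
import json
import os
import tempfile

def read_episodes(path):
    """Yield one dict per non-empty line of a kshot.json / full.json file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            episode = json.loads(line)
            # Both fields are named in the format description above.
            if "support" not in episode or "target_label" not in episode:
                raise ValueError("each line needs 'support' and 'target_label'")
            yield episode

# Demo on a throwaway file; the record below is invented for illustration.
record = {
    "support": [
        {
            "tokens": ["Alice", "left"],
            "entity": [{"text": "Alice", "type": "person", "offset": [0, 1]}],
        }
    ],
    "target_label": ["person"],
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "kshot.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    episodes = list(read_episodes(path))

print(len(episodes), episodes[0]["target_label"])
```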
mapping.json: a Dict whose keys are label names and whose values are the mapping words for each label (commonly the label name itself).
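For a new dataset, a starting mapping can be generated from the label names. This is only a sketch: it assumes each value is a list of words, which the description above does not confirm, so compare against an existing mapping.json before relying on it. The function name default_mapping and the example labels are hypothetical.

```python
import json

def default_mapping(labels):
    """Map each label to its own name, with underscores turned into spaces."""
    mapping = {}
    for label in labels:
        # Assumption: the value is a list of mapping words for the label.
        mapping[label] = [label.replace("_", " ")]
    return mapping

# Illustrative label names only.
mapping = default_mapping(["person", "location", "work_of_art"])
print(json.dumps(mapping, indent=2))
```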
The pretrained SDNet (sdnet.th) should be placed in the sdnetpretrain folder.
You can download the pretrained SDNet from this link or line.
To fine-tune and predict, run:
python main.py -dataset DATASET -K 5 -sdnet -cuda DEVICE
The predicted result is saved in tmp/dataset/...
To evaluate, just add -evalue:
python main.py -dataset DATASET -K 5 -sdnet -cuda DEVICE -evalue
The code is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License for Noncommercial use only. Any commercial use should get formal permission first.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.