Tencent / NeuralNLP-NeuralClassifier

An Open-source Neural Hierarchical Multi-label Text Classification Toolkit

Fix HMCN missing in evaluation #108

Closed MagiaSN closed 2 years ago

MagiaSN commented 2 years ago

Fix #107

alisea47 commented 2 years ago

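Hi, there still seem to be some problems with the code. Below are my observations and the issues I ran into; I would appreciate your guidance.

  1. The conf/train.hmcn.json file needs to be modified.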
    
    "data": {
      "train_json_files": [
        "data/rcv1_merged.hierar.json"   // should be changed to "data/rcv1_train.hierar.json"
      ],
      "validate_json_files": [
        "data/rcv1_test.hierar.json"     // should be changed to "data/rcv1_dev.hierar.json"
      ],
      "test_json_files": [
        "data/rcv1_test.hierar.json"
      ],

……

    "eval": {
      "text_file": "data/rcv1_test.json",   // should be changed to "data/rcv1_test.hierar.json"
      "threshold": 0.5,
      "dir": "eval_dir",
      "batch_size": 1024,
      "is_flat": true,
      "top_k": 100,
      "model_dir": "checkpoint_dir_rcv1/TextCNN_best"   // should be changed to "checkpoint_dir_rcv1/HMCN_best"
    },

  2. After making the changes above and training the HMCN model, I ran into the following error during evaluation:

    Traceback (most recent call last):
      File "eval.py", line 114, in <module>
        eval(config)
      File "eval.py", line 91, in eval
        result = torch.sigmoid(logits).cpu().tolist()
    TypeError: sigmoid(): argument 'input' (position 1) must be Tensor, not tuple


My environment runs Python 3.6.

Thanks in advance for your reply~
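
The `TypeError` above indicates that the model's forward pass returned a tuple rather than a single tensor. A minimal sketch of one way to guard against this is shown here. It assumes that a multi-output model places the logits to be scored last in the tuple, which this thread does not confirm; `final_logits` is a hypothetical helper and not part of the toolkit.

```python
def final_logits(model_output):
    """Pick the value to pass to sigmoid when a model may return several outputs.

    Assumption (not confirmed by the thread): when the forward pass returns a
    tuple, its last element holds the logits that should be scored.
    """
    if isinstance(model_output, tuple):
        return model_output[-1]
    return model_output


# Stand-in values; in eval.py these would be torch tensors.
single_output = [0.2, -1.3]
multi_output = ([0.1, 0.4], [0.3, -0.2], [0.2, 0.1])

print(final_logits(single_output))  # [0.2, -1.3]
print(final_logits(multi_output))   # [0.2, 0.1]
```

In the evaluation loop this would wrap the model output before scoring, for example `torch.sigmoid(final_logits(logits)).cpu().tolist()`.
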
alisea47 commented 2 years ago

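I have one more question: should "hierarchical" in conf/train.hmcn.json be changed to true? HMCN performs hierarchical multi-label classification, so it looks like it should be, yet the README says false?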

"task_info": {
    "label_type": "multi_label",
    "hierarchical": false,
    "hierar_taxonomy": "data/rcv1.taxonomy",
    "hierar_penalty": 1e-5
  }
MagiaSN commented 2 years ago

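@alisea47 The HMCN evaluation problem is a bug. Hierarchical classification tasks need "hierarchical": true. The current examples do contain some confusing settings; we will fix them all together.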
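
The configuration points raised in this thread could be checked mechanically before training. The sketch below is illustrative only: `check_hmcn_config` is a hypothetical helper, the top-level `model_name` key and the `<model_name>_best` checkpoint naming are assumptions inferred from the snippets above, and the function works on an already-parsed dictionary rather than on the file itself.

```python
import os


def check_hmcn_config(conf):
    """Return a list of warnings for the config issues discussed in this thread.

    `conf` is a plain dict. The `model_name` key and the "<model_name>_best"
    checkpoint naming are assumptions, not confirmed toolkit behavior.
    """
    warnings = []
    model_name = conf.get("model_name", "")
    data = conf.get("data", {})
    task = conf.get("task_info", {})
    evaluation = conf.get("eval", {})

    # Validation data should not reuse the test files.
    shared = set(data.get("validate_json_files", [])) & set(data.get("test_json_files", []))
    if shared:
        warnings.append("validate and test share files: %s" % sorted(shared))

    # eval.model_dir should point at a checkpoint of the model being trained.
    checkpoint = os.path.basename(evaluation.get("model_dir", ""))
    if model_name and not checkpoint.startswith(model_name):
        warnings.append("eval.model_dir %r does not match model %r" % (checkpoint, model_name))

    # Hierarchical classification needs the hierarchical flag switched on.
    if model_name == "HMCN" and not task.get("hierarchical", False):
        warnings.append("task_info.hierarchical should be true for HMCN")

    return warnings


# Example mirroring the problematic settings quoted in this thread.
example = {
    "model_name": "HMCN",
    "task_info": {"label_type": "multi_label", "hierarchical": False},
    "data": {
        "validate_json_files": ["data/rcv1_test.hierar.json"],
        "test_json_files": ["data/rcv1_test.hierar.json"],
    },
    "eval": {"model_dir": "checkpoint_dir_rcv1/TextCNN_best"},
}

for message in check_hmcn_config(example):
    print(message)
```

Returning a list of messages instead of raising keeps the check advisory, so it can report every mismatch in one pass.
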