Tencent / NeuralNLP-NeuralClassifier

An Open-source Neural Hierarchical Multi-label Text Classification Toolkit

f-score equals 0 #105

Closed zengzhixian1995 closed 2 years ago

zengzhixian1995 commented 2 years ago

Using the config below:

```json
{
  "task_info": { "label_type": "multi_label", "hierarchical": false, "hierar_taxonomy": "data/rcv1.taxonomy", "hierar_penalty": 0.000001 },
  "device": "cuda",
  "model_name": "TextRCNN",
  "checkpoint_dir": "checkpoint_dir_mic3_flat",
  "model_dir": "trained_model_mic3_flat",
  "data": { "train_json_files": ["data/mic3_train.json"], "validate_json_files": ["data/mic3_test.json"], "test_json_files": ["data/mic3_test.json"], "generate_dict_using_json_files": true, "generate_dict_using_all_json_files": true, "generate_dict_using_pretrained_embedding": false, "generate_hierarchy_label": true, "dict_dir": "dict_mic3_flat", "num_worker": 4 },
  "feature": { "feature_names": ["token"], "min_token_count": 2, "min_char_count": 2, "token_ngram": 0, "min_token_ngram_count": 0, "min_keyword_count": 0, "min_topic_count": 2, "max_token_dict_size": 1000000, "max_char_dict_size": 150000, "max_token_ngram_dict_size": 10000000, "max_keyword_dict_size": 100, "max_topic_dict_size": 100, "max_token_len": 512, "max_char_len": 1024, "max_char_len_per_token": 4, "token_pretrained_file": "", "keyword_pretrained_file": "" },
  "train": { "batch_size": 64, "start_epoch": 1, "num_epochs": 50, "num_epochs_static_embedding": 0, "decay_steps": 1000, "decay_rate": 1.0, "clip_gradients": 100.0, "l2_lambda": 0.0, "loss_type": "BCEWithLogitsLoss", "sampler": "fixed", "num_sampled": 5, "visible_device_list": "0", "hidden_layer_dropout": 0.5 },
  "embedding": { "type": "embedding", "dimension": 64, "region_embedding_type": "context_word", "region_size": 5, "initializer": "uniform", "fan_mode": "FAN_IN", "uniform_bound": 0.25, "random_stddev": 0.01, "dropout": 0.0 },
  "optimizer": { "optimizer_type": "Adam", "learning_rate": 0.008, "adadelta_decay_rate": 0.95, "adadelta_epsilon": 1e-08 },
  "TextCNN": { "kernel_sizes": [2, 3, 4], "num_kernels": 100, "top_k_max_pooling": 1 },
  "TextRNN": { "hidden_dimension": 64, "rnn_type": "GRU", "num_layers": 1, "doc_embedding_type": "Attention", "attention_dimension": 16, "bidirectional": true },
  "DRNN": { "hidden_dimension": 5, "window_size": 3, "rnn_type": "GRU", "bidirectional": true, "cell_hidden_dropout": 0.1 },
  "eval": { "text_file": "data/mic3_test.json", "threshold": 0.5, "dir": "eval_dir", "batch_size": 1024, "is_flat": true, "top_k": 100, "model_dir": "checkpoint_dir_mic3_flat/TextRCNN_best" },
  "TextVDCNN": { "vdcnn_depth": 9, "top_k_max_pooling": 8 },
  "DPCNN": { "kernel_size": 3, "pooling_stride": 2, "num_kernels": 16, "blocks": 2 },
  "TextRCNN": { "kernel_sizes": [2, 3, 4], "num_kernels": 100, "top_k_max_pooling": 1, "hidden_dimension": 64, "rnn_type": "GRU", "num_layers": 1, "bidirectional": true },
  "Transformer": { "d_inner": 128, "d_k": 32, "d_v": 32, "n_head": 4, "n_layers": 1, "dropout": 0.1, "use_star": true },
  "AttentiveConvNet": { "attention_type": "bilinear", "margin_size": 3, "type": "advanced", "hidden_size": 64 },
  "HMCN": { "hierarchical_depth": [0, 384, 384], "global2local": [0, 19, 370] },
  "log": { "logger_file": "log_test_mic3_flat", "log_level": "warn" }
}
```

After running 8 epochs, the f-score drops to 0:

```
Use dataset to generate dict.
Size of doc_label dict is 311
Size of doc_token dict is 547597
Size of doc_char dict is 2022
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 0
Size of doc_topic dict is 0
Shrink dict over.
Size of doc_label dict is 311
Size of doc_token dict is 547597
Size of doc_char dict is 2022
Size of doc_token_ngram dict is 0
Size of doc_keyword dict is 0
Size of doc_topic dict is 0
Train performance at epoch 1 is precision: 0.812763, recall: 0.544614, fscore: 0.652202, macro-fscore: 0.233602, right: 879927, predict: 1082637, standard: 1615689. Loss is: 0.082759.
Validate performance at epoch 1 is precision: 0.792956, recall: 0.522135, fscore: 0.629660, macro-fscore: 0.210286, right: 48651, predict: 61354, standard: 93177. Loss is: 0.088675.
test performance at epoch 1 is precision: 0.792956, recall: 0.522135, fscore: 0.629660, macro-fscore: 0.210286, right: 48651, predict: 61354, standard: 93177. Loss is: 0.088675.
Epoch 1 cost time: 370 second
Train performance at epoch 2 is precision: 0.847037, recall: 0.605841, fscore: 0.706418, macro-fscore: 0.314665, right: 978850, predict: 1155616, standard: 1615689. Loss is: 0.073577.
Validate performance at epoch 2 is precision: 0.816315, recall: 0.576494, fscore: 0.675758, macro-fscore: 0.275915, right: 53716, predict: 65803, standard: 93177. Loss is: 0.081481.
test performance at epoch 2 is precision: 0.816315, recall: 0.576494, fscore: 0.675758, macro-fscore: 0.275915, right: 53716, predict: 65803, standard: 93177. Loss is: 0.081481.
Epoch 2 cost time: 370 second
Train performance at epoch 3 is precision: 0.848328, recall: 0.678276, fscore: 0.753831, macro-fscore: 0.408692, right: 1095883, predict: 1291815, standard: 1615689. Loss is: 0.065184.
Validate performance at epoch 3 is precision: 0.803087, recall: 0.633708, fscore: 0.708414, macro-fscore: 0.332396, right: 59047, predict: 73525, standard: 93177. Loss is: 0.075041.
test performance at epoch 3 is precision: 0.803087, recall: 0.633708, fscore: 0.708414, macro-fscore: 0.332396, right: 59047, predict: 73525, standard: 93177. Loss is: 0.075041.
Epoch 3 cost time: 371 second
Train performance at epoch 4 is precision: 0.872000, recall: 0.717715, fscore: 0.787371, macro-fscore: 0.469594, right: 1159605, predict: 1329823, standard: 1615689. Loss is: 0.059711.
Validate performance at epoch 4 is precision: 0.821359, recall: 0.654561, fscore: 0.728535, macro-fscore: 0.371412, right: 60990, predict: 74255, standard: 93177. Loss is: 0.072544.
test performance at epoch 4 is precision: 0.821359, recall: 0.654561, fscore: 0.728535, macro-fscore: 0.371412, right: 60990, predict: 74255, standard: 93177. Loss is: 0.072544.
Epoch 4 cost time: 377 second
Train performance at epoch 5 is precision: 0.885457, recall: 0.753004, fscore: 0.813877, macro-fscore: 0.519633, right: 1216620, predict: 1374002, standard: 1615689. Loss is: 0.049501.
Validate performance at epoch 5 is precision: 0.822503, recall: 0.674319, fscore: 0.741076, macro-fscore: 0.389255, right: 62831, predict: 76390, standard: 93177. Loss is: 0.063937.
test performance at epoch 5 is precision: 0.822503, recall: 0.674319, fscore: 0.741076, macro-fscore: 0.389255, right: 62831, predict: 76390, standard: 93177. Loss is: 0.063937.
Epoch 5 cost time: 374 second
Train performance at epoch 6 is precision: 0.894424, recall: 0.783785, fscore: 0.835457, macro-fscore: 0.566551, right: 1266352, predict: 1415829, standard: 1615689. Loss is: 0.046750.
Validate performance at epoch 6 is precision: 0.820941, recall: 0.693540, fscore: 0.751882, macro-fscore: 0.414829, right: 64622, predict: 78717, standard: 93177. Loss is: 0.062976.
test performance at epoch 6 is precision: 0.820941, recall: 0.693540, fscore: 0.751882, macro-fscore: 0.414829, right: 64622, predict: 78717, standard: 93177. Loss is: 0.062976.
Epoch 6 cost time: 376 second
Train performance at epoch 7 is precision: 0.895922, recall: 0.806055, fscore: 0.848616, macro-fscore: 0.611395, right: 1302334, predict: 1453625, standard: 1615689. Loss is: 0.043302.
Validate performance at epoch 7 is precision: 0.817573, recall: 0.704444, fscore: 0.756804, macro-fscore: 0.433294, right: 65638, predict: 80284, standard: 93177. Loss is: 0.061460.
test performance at epoch 7 is precision: 0.817573, recall: 0.704444, fscore: 0.756804, macro-fscore: 0.433294, right: 65638, predict: 80284, standard: 93177. Loss is: 0.061460.
Epoch 7 cost time: 368 second
Train performance at epoch 8 is precision: 0.912939, recall: 0.823504, fscore: 0.865918, macro-fscore: 0.645184, right: 1330527, predict: 1457411, standard: 1615689. Loss is: 0.037199.
Validate performance at epoch 8 is precision: 0.827381, recall: 0.710401, fscore: 0.764442, macro-fscore: 0.441673, right: 66193, predict: 80003, standard: 93177. Loss is: 0.057151.
test performance at epoch 8 is precision: 0.827381, recall: 0.710401, fscore: 0.764442, macro-fscore: 0.441673, right: 66193, predict: 80003, standard: 93177. Loss is: 0.057151.
Epoch 8 cost time: 369 second
Train performance at epoch 9 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 1615689. Loss is: nan.
Validate performance at epoch 9 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 93177. Loss is: nan.
test performance at epoch 9 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 93177. Loss is: nan.
Epoch 9 cost time: 328 second
Train performance at epoch 10 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 1615689. Loss is: nan.
Validate performance at epoch 10 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 93177. Loss is: nan.
test performance at epoch 10 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 93177. Loss is: nan.
Epoch 10 cost time: 334 second
Train performance at epoch 11 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 1615689. Loss is: nan.
Validate performance at epoch 11 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 93177. Loss is: nan.
test performance at epoch 11 is precision: 0.000000, recall: 0.000000, fscore: 0.000000, macro-fscore: 0.000000, right: 0, predict: 0, standard: 93177. Loss is: nan.
```
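The abrupt jump from a healthy f-score at epoch 8 to all-zero predictions with `Loss is: nan` at epoch 9 suggests the weights were corrupted by a non-finite update. One quick sanity check is to scan the training data for malformed samples. A minimal sketch, assuming the toolkit's JSON-lines data format with `doc_label` and `doc_token` fields (the field names are an assumption; adjust to your actual schema):

```python
import json

def check_sample(line):
    """Return a problem description for one JSON-lines sample, or None if it looks OK."""
    try:
        doc = json.loads(line)
    except json.JSONDecodeError:
        return "not valid JSON"
    # Field names follow the assumed NeuralClassifier sample schema.
    if not doc.get("doc_label"):
        return "empty or missing doc_label"
    if not doc.get("doc_token"):
        return "empty or missing doc_token"
    return None

def find_invalid_samples(path):
    """Scan a JSON-lines training file and collect (line_number, problem) pairs."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, 1):
            problem = check_sample(line)
            if problem:
                bad.append((i, problem))
    return bad
```

Running `find_invalid_samples("data/mic3_train.json")` would list any lines the loader could choke on; an empty result rules the data out as the cause.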

coderbyr commented 2 years ago

The log shows the loss becoming NaN. A few things to check: (1) train a simpler classification model such as TextCNN and see whether the same problem occurs; (2) check whether the training data contains invalid samples; (3) check whether the gradients are exploding, and tighten gradient clipping; (4) use a smaller learning rate.
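Points (3) and (4) can be sketched as a guarded PyTorch training step. This is a minimal illustration, not the toolkit's actual training loop: `model`, `optimizer`, and `loss_fn` stand in for whatever the config builds, and `max_grad_norm=5.0` is an assumed bound (the config's `clip_gradients: 100.0` is far too loose to stop an explosion):

```python
import torch

def train_step(model, optimizer, loss_fn, batch, target, max_grad_norm=5.0):
    """One update with a NaN guard and global-norm gradient clipping."""
    optimizer.zero_grad()
    loss = loss_fn(model(batch), target)
    # Skip the update if the loss is already non-finite, so a single bad
    # batch cannot poison the weights for all later epochs.
    if not torch.isfinite(loss):
        return None
    loss.backward()
    # Clip by global norm before stepping; a bound around 1-5 is typical.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```

Combined with dropping the Adam learning rate from 0.008 to something like 0.001, this usually stops the kind of divergence seen at epoch 9.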