lovecambi / qebrain

machine translation and quality estimation
BSD 2-Clause "Simplified" License

How to make it work on a classification task #7

Closed hittle2015 closed 5 years ago

hittle2015 commented 5 years ago

If I want to work on a classification task instead of a regression one, which part of the loss function should I change? I assume the relevant part starts at line 1341 in the expert_model.py file, but I am not sure.

lovecambi commented 5 years ago

I think you should first modify the code starting from this line: https://github.com/lovecambi/qebrain/blob/0d818f7f0d699cd8f487a057385a44a94d84426a/qe_model.py#L1319

# sent_logits = tf.layers.dense(sent_fea, 1)
# sent_logits = tf.squeeze(sent_logits)
sent_logits = tf.layers.dense(sent_fea, num_of_classes)

Next, modify the code starting from this line: https://github.com/lovecambi/qebrain/blob/0d818f7f0d699cd8f487a057385a44a94d84426a/qe_model.py#L1344

# sent_pred = tf.sigmoid(sent_logits)
# sent_loss += tf.reduce_mean(tf.square(sent_label - sent_pred))
sent_loss += tf.nn.softmax_cross_entropy_with_logits_v2(label=self.sent_label, logits=sent_logits)

hittle2015 commented 5 years ago

Hi, thanks a lot. I assume the last line should be "sent_loss += tf.nn.softmax_cross_entropy_with_logits_v2(labels=self.label, logits=sent_logits)".

However, I got the following error:

  File "qe_model.py", line 2193, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/yuyuan/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "qe_model.py", line 2186, in main
    run_main(FLAGS, default_hparams, train, inference)
  File "qe_model.py", line 2181, in run_main
    train_fn(hparams, target_session=target_session)
  File "qe_model.py", line 1715, in train
    train_model = create_train_model(model_creator, hparams, scope)
  File "qe_model.py", line 751, in create_train_model
    scope=scope)
  File "qe_model.py", line 1153, in __init__
    colocate_gradients_with_ops=True)
  File "/home/yuyuan/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/optimizers.py", line 155, in optimize_loss
    contrib_framework.assert_scalar(loss)
  File "/home/yuyuan/anaconda3/envs/tf/lib/python3.6/site-packages/tensorflow/python/ops/check_ops.py", line 1264, in assert_scalar
    % (tensor.name, shape))
ValueError: Expected scalar shape for transformerpredictor/add_2:0, saw shape: (1,).


So my question is: do I need to preprocess the labels into one-hot format, or can I leave them as a single column? I guess the error might be related to the loss function or to my class labels. Thanks in advance.
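
One likely cause of the "Expected scalar shape" error is that a softmax cross-entropy op returns one loss value per example (a 1-D result), while optimize_loss asserts that the loss is a scalar. The sketch below reproduces that shape distinction in plain Python rather than TensorFlow; the function name softmax_cross_entropy and the sample numbers are hypothetical and do not come from qe_model.py.

```python
import math

def softmax_cross_entropy(logits, one_hot_labels):
    """Return one cross-entropy value per example (a list, not a scalar)."""
    losses = []
    for row, target in zip(logits, one_hot_labels):
        # Numerically stable log-sum-exp of the logits in this row.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        # Cross-entropy against the one-hot target distribution.
        losses.append(-sum(t * (x - log_z) for x, t in zip(row, target)))
    return losses

# Hypothetical batch: 2 examples, 3 classes.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
labels = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

per_example = softmax_cross_entropy(logits, labels)  # length-2 list
scalar_loss = sum(per_example) / len(per_example)    # mean over the batch
```

In the TensorFlow 1.x code from this thread, the analogous step would presumably be to wrap the cross-entropy call in tf.reduce_mean before adding it to sent_loss. That is an assumption and has not been checked against the repository.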

lovecambi commented 5 years ago

Please refer to https://stackoverflow.com/questions/37312421/whats-the-difference-between-sparse-softmax-cross-entropy-with-logits-and-softm
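
The distinction the linked question is about can be sketched in plain Python: a "sparse" cross-entropy takes an integer class index per example (a single label column), while the non-sparse form takes a one-hot vector over the classes, and both give the same value for the same example. The helper names below (log_softmax, dense_xent, sparse_xent, to_one_hot) are hypothetical and only illustrate the math, not the TensorFlow implementation.

```python
import math

def log_softmax(row):
    """Log-probabilities for one example's logits (numerically stable)."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def dense_xent(row, one_hot):
    """Cross-entropy with a one-hot (or soft) label vector."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(row)))

def sparse_xent(row, class_index):
    """Cross-entropy with an integer class index."""
    return -log_softmax(row)[class_index]

def to_one_hot(class_index, num_classes):
    """Convert an integer label into a one-hot vector."""
    return [1.0 if i == class_index else 0.0 for i in range(num_classes)]

# Hypothetical example: 3 classes, true class is index 2.
row = [0.3, -1.2, 2.5]
label = 2

dense_loss = dense_xent(row, to_one_hot(label, len(row)))
sparse_loss = sparse_xent(row, label)
```

Under this view, a single column of integer class labels matches the sparse form, and one-hot encoding is only needed for the non-sparse form used in the snippet earlier in the thread.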