zihangdai / xlnet

XLNet: Generalized Autoregressive Pretraining for Language Understanding
Apache License 2.0

Text Classifier Prediction Problem #224

Open MissMcFly opened 5 years ago

MissMcFly commented 5 years ago

I fine-tuned a text classification model from the pre-trained XLNet model and got the corresponding ckpt file. (Screenshot attached: QQ图片20190909105641)

Then I tried to run prediction with this text classification model, but it always fails with an error. The prediction command and the error output are as follows.

Script:
python3 chinese_classifier.py \
  --do_predict=True \
  --eval_split=test \
  --task_name=inre \
  --data_dir="/home/luban/IR/data/IR/" \
  --output_dir="/home/luban/IR/output2/tfrecords/" \
  --model_dir="/home/luban/IR/output2/finetunedModel/" \
  --spiece_model_file="/home/luban/IR/sentencePiec/spm.model" \
  --model_config_path="xlnet/modelCkpt/config.json" \
  --init_checkpoint="xlnet/modelCkpt/model.ckpt" \
  --predict_dir="/home/luban/IR/predict/IR/" \
  --predict_ckpt="/home/luban/IR/output2/finetunedModel/model.ckpt-3000" \
  --max_seq_length=128 \
  --predict_batch_size=16 \
  --num_hosts=1 \
  --num_core_per_host=1 \
  --learning_rate=2e-5 \
  --train_steps=3000 \
  --warmup_steps=500 \
  --save_steps=3000 \
  --iterations=500 \
  --dropout=0.05

Results:
INFO:tensorflow:Single device mode.
I0909 11:01:45.503506 139996976785152 tf_logging.py:115] Single device mode.
INFO:tensorflow:Using config: {'_model_dir': '/home/luban/IR/output3/finetunedModel/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 3000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true , '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f53893cd1d0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=500, num_shards=1, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None), '_cluster': None}
I0909 11:01:47.359045 139996976785152 tf_logging.py:115] Using config: {'_model_dir': '/home/luban/IR/output3/finetunedModel/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 3000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true , '_keep_checkpoint_max': 0, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f53893cd1d0>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=500, num_shards=1, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None), '_cluster': None}
WARNING:tensorflow:Estimator's model_fn (<function get_model_fn.<locals>.model_fn at 0x7f538c010730>) includes params argument, but params are not passed to Estimator.
W0909 11:01:47.359888 139996976785152 tf_logging.py:125] Estimator's model_fn (<function get_model_fn.<locals>.model_fn at 0x7f538c010730>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Num of eval samples: 5000
I0909 11:01:47.390489 139996976785152 tf_logging.py:115] Num of eval samples: 5000
INFO:tensorflow:Do not overwrite tfrecord /home/luban/IR/output3/tfrecords/0.model.len-128.test.predict.tf_record exists.
I0909 11:01:47.390708 139996976785152 tf_logging.py:115] Do not overwrite tfrecord /home/luban/IR/output3/tfrecords/0.model.len-128.test.predict.tf_record exists.
INFO:tensorflow:Input tfrecord file /home/luban/IR/output3/tfrecords/0.model.len-128.test.predict.tf_record
I0909 11:01:47.390810 139996976785152 tf_logging.py:115] Input tfrecord file /home/luban/IR/output3/tfrecords/0.model.len-128.test.predict.tf_record
WARNING:tensorflow:From chinese_classifier.py:562: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.experimental.map_and_batch(...).
W0909 11:01:47.412828 139996976785152 tf_logging.py:125] From chinese_classifier.py:562: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.experimental.map_and_batch(...).
INFO:tensorflow:Calling model_fn.
I0909 11:01:47.428736 139996976785152 tf_logging.py:115] Calling model_fn.
INFO:tensorflow:memory input None
I0909 11:01:47.440355 139996976785152 tf_logging.py:115] memory input None
INFO:tensorflow:Use float type <dtype: 'float32'>
I0909 11:01:47.440568 139996976785152 tf_logging.py:115] Use float type <dtype: 'float32'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 1551, in zeros
    output = _constant_if_small(zero, shape, dtype, name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 1508, in _constant_if_small
    if np.prod(shape) < 1000:
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2585, in prod
    initial=initial)
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 83, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py", line 869, in binary_op_wrapper
    y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1050, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 282, in _dimension_tensor_conversion_function
    raise ValueError("Cannot convert an unknown Dimension to a Tensor: %s" % d)
ValueError: Cannot convert an unknown Dimension to a Tensor: ?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "chinese_classifier.py", line 914, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "chinese_classifier.py", line 883, in main
    checkpoint_path=FLAGS.predict_ckpt)):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 577, in predict
    features, None, model_fn_lib.ModeKeys.PREDICT, self.config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py", line 1195, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "chinese_classifier.py", line 581, in model_fn
    FLAGS, features, n_class, is_training)
  File "/home/luban/IR/chineseClassifier/function_builder.py", line 155, in get_classification_loss
    input_mask=inp_mask)
  File "/home/luban/IR/chineseClassifier/xlnet.py", line 222, in __init__
    ) = modeling.transformer_xl(**tfm_args)
  File "/home/luban/IR/chineseClassifier/modeling.py", line 500, in transformer_xl
    dtype=tf_float)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 1560, in zeros
    shape = ops.convert_to_tensor(shape, dtype=dtypes.int32)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1050, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1146, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 971, in _autopacking_conversion_function
    return _autopacking_helper(v, dtype, name or "packed")
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 922, in _autopacking_helper
    constant_op.constant(elem, dtype=dtype, name=str(i)))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
    value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/tensor_util.py", line 443, in make_tensor_proto
    nparray = np.array(values, dtype=np_dt)
TypeError: __int__ returned non-int (type NoneType)

What should I do? Thank you!

pauseman commented 4 years ago

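Two steps:
1. Add "while len(eval_examples) % FLAGS.predict_batch_size != 0: eval_examples.append(PaddingInputExample())" after "if FLAGS.do_predict:", so that the number of prediction examples is a multiple of the batch size.
2. Set "drop_remainder=True" in the do_predict part of the script, so that every prediction batch is a full batch.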
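To illustrate the padding step on its own, here is a minimal sketch in plain Python. It is not taken from chinese_classifier.py or run_classifier.py: the PaddingInputExample class below is only a hypothetical stand-in for the class of the same name mentioned above, and pad_to_batch_multiple and count_real are made-up helper names. The numbers (5000 examples, batch size 16) come from the log and command in the original post.

```python
class PaddingInputExample:
    """Hypothetical stand-in for the placeholder example class named in this thread."""


def pad_to_batch_multiple(examples, batch_size):
    """Return a copy of `examples` padded to a multiple of `batch_size`."""
    padded = list(examples)
    while len(padded) % batch_size != 0:
        padded.append(PaddingInputExample())
    return padded


def count_real(examples):
    """Count the examples that are not padding placeholders."""
    return sum(1 for ex in examples if not isinstance(ex, PaddingInputExample))


# 5000 eval samples and a batch size of 16, as in the original post.
eval_examples = ["example-%d" % i for i in range(5000)]
predict_batch_size = 16

padded_examples = pad_to_batch_multiple(eval_examples, predict_batch_size)

# Split into batches; after padding, no batch is short, so dropping the
# remainder would not discard any real example.
batches = [
    padded_examples[i:i + predict_batch_size]
    for i in range(0, len(padded_examples), predict_batch_size)
]

print(len(padded_examples), len(batches), count_real(padded_examples))
```

Since 5000 is not divisible by 16, the last batch would otherwise hold only 8 examples. After padding, every batch is full, and the padded entries can be skipped when the predictions are written out.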

abdullahkhilji commented 4 years ago

Traceback (most recent call last):
  File "run_classifier.py", line 857, in <module>
    tf.app.run()
  File "/home/abdullahkhilji/miniconda3/envs/xlnet/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "run_classifier.py", line 644, in main
    while len(eval_examples) % FLAGS.predict_batch_size!=0:
UnboundLocalError: local variable 'eval_examples' referenced before assignment