Franck-Dernoncourt / NeuroNER

Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
http://neuroner.com
MIT License
1.69k stars 476 forks

No CRF and gradient clipping produces an error #50

Open fsonntag opened 7 years ago

fsonntag commented 7 years ago

When setting use_crf = False and turning on gradient clipping, the following error is thrown:

  File "/Users/Felix/Developer/NeuroNER/src/main.py", line 250, in <module>
    main()
  File "/Users/Felix/Developer/NeuroNER/src/main.py", line 245, in main
    nn = NeuroNER(**arguments)
  File "/Users/Felix/Developer/NeuroNER/src/neuroner.py", line 280, in __init__
    model = EntityLSTM(dataset, parameters)
  File "/Users/Felix/Developer/NeuroNER/src/entity_lstm.py", line 214, in __init__
    self.define_training_procedure(parameters)
  File "/Users/Felix/Developer/NeuroNER/src/entity_lstm.py", line 233, in define_training_procedure
    for grad, var in grads_and_vars]
  File "/Users/Felix/Developer/NeuroNER/src/entity_lstm.py", line 233, in <listcomp>
    for grad, var in grads_and_vars]
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/ops/clip_ops.py", line 55, in clip_by_value
    t = ops.convert_to_tensor(t, name="t")
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 639, in convert_to_tensor
    as_ref=False)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 704, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 113, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/constant_op.py", line 102, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/usr/local/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 360, in make_tensor_proto
    raise ValueError("None values not supported.")
ValueError: None values not supported.

This happens because the gradient for the CRF layer in grads_and_vars is None when the CRF is not used, and tf.clip_by_value cannot convert None to a tensor. A possible workaround is to change the gradient clipping line to:

grads_and_vars = [(tf.clip_by_value(grad, -parameters['gradient_clipping_value'], parameters['gradient_clipping_value']), var)
                  for grad, var in grads_and_vars if grad is not None]

Nevertheless, I'm not sure whether that's a valid workaround or whether it will break the model somehow...
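To make the effect of the filter concrete, here is a minimal, framework-free sketch. Plain floats stand in for gradient tensors, `clip_by_value` is a hypothetical stand-in for `tf.clip_by_value` that rejects None the same way, and the variable names are made up for illustration. It only shows that pairs with a None gradient are dropped before clipping. It does not confirm how NeuroNER's optimizer treats the dropped variables.

```python
# Framework-free sketch: floats stand in for gradient tensors, and
# clip_by_value is a hypothetical stand-in for tf.clip_by_value.

def clip_by_value(value, low, high):
    """Clamp a scalar to [low, high]; reject None, as in the traceback."""
    if value is None:
        raise ValueError("None values not supported.")
    return max(low, min(high, value))


def clip_gradients(grads_and_vars, clip_value):
    """Clip every available gradient and drop pairs whose gradient is None."""
    return [(clip_by_value(grad, -clip_value, clip_value), var)
            for grad, var in grads_and_vars
            if grad is not None]


# Hypothetical variable names; the None entry mimics the unused CRF variable.
grads_and_vars = [(7.5, "lstm/weights"),
                  (None, "crf/transitions"),
                  (-0.2, "output/bias")]

clipped = clip_gradients(grads_and_vars, clip_value=5.0)
print(clipped)
```

As far as I know, the TensorFlow 1.x optimizer's apply_gradients already skips pairs whose gradient is None, so dropping them beforehand should leave training unchanged for the remaining variables. That is an assumption worth checking against the installed TensorFlow version.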

azirikly commented 6 years ago

I was wondering whether you received confirmation from Franck that this is a valid workaround. Thank you for sharing!

fsonntag commented 6 years ago

Nope, sorry.

lovychen commented 6 years ago

Hello, I also have this problem. How can it be solved?