UKPLab / emnlp2017-bilstm-cnn-crf

BiLSTM-CNN-CRF architecture for sequence tagging
Apache License 2.0

How to use this on domain specific English language dataset #12

Open duesXMachine opened 6 years ago

duesXMachine commented 6 years ago

I have been trying to develop an NER model for a domain-specific English dataset. How do I disable the German embeddings?

nreimers commented 6 years ago

Hi @duesXMachine, the code has a variable 'embeddingsPath' where you specify the path to the embeddings file. In Train_Chunking.py it is on line 39 and specifies 'levy_deps.words' as the path, which are the dependency-based embeddings by Levy et al.

You can specify any embeddings file that is in a suitable text format (one embedding per line; each line starts with the word, followed by the floating-point values of the embedding).

duesXMachine commented 6 years ago

Thanks @nreimers for this quick answer.

duesXMachine commented 6 years ago

Hey @nreimers, can I use the word2vec format for embeddings? I am thinking of creating word embeddings in binary or text format using Gensim.

nreimers commented 6 years ago

Hi @duesXMachine, the binary format does not work. It must be in a text format like the embeddings from Levy et al. (https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/):

word1 0.34 0.41 0.71
word2 0.12 0.34 0.33
...

One word per line, with the word and the individual dimensions of the embedding separated by a single whitespace.

Once you have trained embeddings with Gensim, you can easily store them in this format and then use them for the BiLSTM-CRF architecture.
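The text format described above can be produced with a few lines of plain Python. This is a minimal sketch, not code from this repository: `write_text_embeddings` and `read_text_embeddings` are hypothetical helper names, and the input is assumed to be a dict that maps each word to its vector. Gensim's `save_word2vec_format(path, binary=False)` writes a similar file, but by default it starts with a header line holding the vocabulary size and the dimensionality, which may need to be removed first.

```python
import os
import tempfile


def write_text_embeddings(vectors, path):
    """Write one embedding per line: the word, then its values, space-separated."""
    with open(path, "w", encoding="utf-8") as out:
        for word, values in vectors.items():
            out.write(word + " " + " ".join(repr(float(v)) for v in values) + "\n")


def read_text_embeddings(path):
    """Read the file back into a dict of word -> list of floats."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank lines
            vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors


# Toy vectors taken from the example above.
toy = {"word1": [0.34, 0.41, 0.71], "word2": [0.12, 0.34, 0.33]}
path = os.path.join(tempfile.mkdtemp(), "embeddings.txt")
write_text_embeddings(toy, path)

with open(path, encoding="utf-8") as handle:
    print(handle.read())
```

Reading the file back with `read_text_embeddings(path)` is a quick way to confirm that every line has the same number of dimensions before pointing 'embeddingsPath' at it.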

duesXMachine commented 6 years ago

I generated embeddings using Gensim in text format and set 'charEmbeddings': None. But while running RunModel.py I am getting this error:

/home/deusxmachine/.local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "RunModel.py", line 22, in <module>
    lstmModel.loadModel(modelPath)
  File "/home/deusxmachine/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py", line 582, in loadModel
    self.maxCharLen = int(f.attrs['maxCharLen'])
ValueError: invalid literal for int() with base 10: 'None'

nreimers commented 6 years ago

You can try and change line 581 to:

if 'maxCharLen' in f.attrs and f.attrs['maxCharLen'] is not None:

duesXMachine commented 6 years ago

I tried that. Actually, f.attrs['maxCharLen'] returns 'None', not None; it is a string. Even if I handle that with: if 'maxCharLen' in f.attrs and f.attrs['maxCharLen'] != 'None':

I get another error:

Using TensorFlow backend.
/home/deusxmachine/.local/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
> /home/deusxmachine/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py(578)loadModel()
-> mappings = json.loads(f.attrs['mappings'])
(Pdb) n
> /home/deusxmachine/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py(579)loadModel()
-> if 'additionalFeatures' in f.attrs:
(Pdb) n
> /home/deusxmachine/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py(580)loadModel()
-> self.additionalFeatures = json.loads(f.attrs['additionalFeatures'])
(Pdb) n
> /home/deusxmachine/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py(582)loadModel()
-> if 'maxCharLen' in f.attrs and f.attrs['maxCharLen'] != 'None':
(Pdb) n
> /home/deusxmachine/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py(585)loadModel()
-> self.model = model
(Pdb) c
Traceback (most recent call last):
  File "RunModel.py", line 22, in <module>
    lstmModel.loadModel(modelPath)
  File "/home/deusxmachine/.local/lib/python2.7/site-packages/nltk/tokenize/__init__.py", line 93, in sent_tokenize
    tokenizer = load('tokenizers/punkt/{0}.pickle'.format(language))
  File "/home/deusxmachine/.local/lib/python2.7/site-packages/nltk/data.py", line 808, in load
    opened_resource = _open(resource_url)
  File "/home/deusxmachine/.local/lib/python2.7/site-packages/nltk/data.py", line 926, in _open
    return find(path_, path + ['']).open()
  File "/home/deusxmachine/.local/lib/python2.7/site-packages/nltk/data.py", line 648, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource u'tokenizers/punkt/english.pickle' not found.  Please
  use the NLTK Downloader to obtain the resource:  >>>
  nltk.download()
  Searched in:
    - '/home/deusxmachine/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - u''
**********************************************************************

nreimers commented 6 years ago

As the error message says, you must download the NLTK punkt tokenizer models, which include the English pickle.

Run:

python -m nltk.downloader -d /home/deusxmachine/nltk_data punkt

duesXMachine commented 6 years ago

Yes, my bad, I did not read it properly. Thanks @nreimers. And please change that line to: if 'maxCharLen' in f.attrs and f.attrs['maxCharLen'] != 'None':
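The check discussed above can be wrapped in a small helper. This is only a sketch under assumptions: `parse_optional_int` is a hypothetical name, and a plain dict stands in for `f.attrs` so that the example runs without h5py. It treats a missing attribute, the Python value None, and the text 'None' (as str or bytes) the same way.

```python
def parse_optional_int(value):
    """Return an int, or None if the stored attribute is missing or the text 'None'."""
    if value is None:
        return None
    if isinstance(value, bytes):  # attributes may come back as bytes
        value = value.decode("utf-8")
    text = str(value).strip()
    if text == "" or text == "None":
        return None
    return int(text)


# Stand-in for f.attrs: a plain dict, so the sketch runs without h5py.
attrs = {"maxCharLen": "None"}
max_char_len = parse_optional_int(attrs.get("maxCharLen"))
print(max_char_len)
```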

duesXMachine commented 6 years ago

Hey @nreimers, I enabled charEmbeddings, but for some reason it takes too much memory and ends up crashing my server. I am using 4 vCores and 15 GB RAM + 10 GB swap memory.

nreimers commented 6 years ago

Hi @duesXMachine How much RAM is used without charEmbeddings?

When charEmbeddings are used, all words are padded to the length of the longest word in the corpus. If you have one really long word in your data, for example a URL, all words might be padded to 200 characters. This of course can require some memory.

Maybe you can check the value of maxCharLen. If it is too big, it might be wise to limit the length of the words, e.g. to 30 characters. Words longer than that must be truncated.
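Checking the longest token and truncating long ones can be sketched as follows. The helper names are hypothetical, and the sketch assumes each sentence is a dict with a 'tokens' list; the limit of 30 characters is the example value from the comment above. Truncation would have to happen before the character information is computed.

```python
def max_token_length(sentences):
    """Length of the longest token across all sentences."""
    return max((len(tok) for sent in sentences for tok in sent["tokens"]), default=0)


def truncate_tokens(sentences, limit=30):
    """Return a copy of the sentences with every token cut to at most `limit` characters."""
    return [
        dict(sent, tokens=[tok[:limit] for tok in sent["tokens"]])
        for sent in sentences
    ]


sentences = [
    {"tokens": ["See", "https://example.com/a/very/long/path/that/keeps/going/on", "now"]},
    {"tokens": ["Short", "words", "only"]},
]
print(max_token_length(sentences))                   # dominated by the URL
print(max_token_length(truncate_tokens(sentences)))  # at most 30 after truncation
```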

duesXMachine commented 6 years ago

Hey @nreimers, before I was using 4 GB RAM on my PC without charEmbeddings and it worked. I think you are right, I might have long URLs in my dataset as words. And how are you encoding words to 1D vectors for the 1D convolution input? Which encoding?

My maxCharLen is 54

duesXMachine commented 6 years ago

Hey @nreimers, to re-train a model all I am doing is loading it and then training. Is there anything to change or to keep in mind while re-training it?

nreimers commented 6 years ago

Hi @duesXMachine. Loading the model and continuing the training is fine; there is no need to change anything after loading.

duesXMachine commented 6 years ago

Thanks @nreimers. Is there some function to compute confidence?

nreimers commented 6 years ago

Hi @duesXMachine. What do you mean by confidence?

When you use a softmax as a classifier, the output values can be interpreted as probabilities / confidences for the different available tags. For the CRF classifier, this computation is much more difficult. The scores for the different taggings would need to be computed and compared.

However, I find the confidence values computed by a network often meaningless. For error cases we often see really high confidence values of >99%, and for difficult instances the network also often has a really high confidence. Sadly, the confidence / probability returned by softmax is not a good estimate of how likely the label is correct or how easy / difficult the word was to tag. So in most application scenarios this value is not really useful.

duesXMachine commented 6 years ago

Hey @nreimers, so is there any other workaround to get a probability/confidence? I need to check whether the network's confidence is greater or less than a certain threshold, just to know when it is reliable and when not.

nreimers commented 6 years ago

The easiest way is to use softmax as a classifier. At line 138 of BiLSTM.py you will find these lines:

predictions = self.model.predict([inputData[name] for name in features], verbose=False)
predictions = predictions.argmax(axis=-1) #Predict classes 

The first line, self.model.predict(...), predicts the probabilities for the different labels, and predictions.argmax(...) transforms them into the concrete labels.

You can store the result of self.model.predict(...) in a variable and inspect whether it gives any meaningful numbers. I am not sure how the values in self.model.predict(...) look for a CRF classifier.
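Thresholding the stored probabilities could look roughly like this. It is a sketch that assumes a softmax classifier and one list of per-label probabilities per token; `label_confidences`, `idx2label`, and the threshold of 0.8 are illustrative choices, not part of the repository. Plain lists are used so it runs without NumPy; with an array, argmax(axis=-1) and max(axis=-1) give the same two values.

```python
def label_confidences(probabilities, idx2label, threshold=0.9):
    """For each token, return (label, confidence, is_reliable) from per-label probabilities."""
    results = []
    for token_probs in probabilities:
        best_idx = max(range(len(token_probs)), key=lambda i: token_probs[i])
        confidence = token_probs[best_idx]
        results.append((idx2label[best_idx], confidence, confidence >= threshold))
    return results


# Toy softmax output for a three-token sentence and three labels.
idx2label = {0: "O", 1: "B-PER", 2: "I-PER"}
probs = [
    [0.05, 0.90, 0.05],
    [0.40, 0.35, 0.25],
    [0.98, 0.01, 0.01],
]
for label, conf, reliable in label_confidences(probs, idx2label, threshold=0.8):
    print(label, conf, reliable)
```

Given the caveat above that softmax values are often very high even for wrong predictions, a threshold like this is best validated on held-out data before it is trusted.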

duesXMachine commented 6 years ago

Okay, thanks @nreimers, I will look into it. Another thing: what is the casing embedding layer? How useful is it?

nreimers commented 6 years ago

Most pre-trained word embeddings only store information about lower cased words, i.e., the information of the casing of a word gets lost. The casing layer provides the information about the casing of the word, e.g., all uppercase, initial character is upper case, all lowercase etc.

This is especially useful for NER, where casing provides a lot of information. However, in noisy data, the casing of words can be wrong. This can cause problems for many models, e.g. in a sentence that IS ALL UPPERCASE, many NER models output that all words are named entities (because they were spelled in uppercase letters). In that case, it is better to remove the casing layer. The performance on standard NLP datasets will drop, but the system will work much better on noisy data.
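To make the idea of a casing feature concrete, here is a minimal sketch. `casing_feature` and its category names are illustrative assumptions and not necessarily the categories used in this repository; the point is only that each token gets a coarse casing label that can be embedded alongside the lowercased word.

```python
def casing_feature(word):
    """Map a word to a coarse casing category."""
    if word.isdigit():
        return "numeric"
    if word.islower():
        return "allLower"
    if word.isupper():
        return "allUpper"
    if word[:1].isupper():
        return "initialUpper"
    return "other"


for token in ["Mark", "is", "THE", "founder", "of", "Facebook", "2004"]:
    print(token, casing_feature(token))
```

In a sentence written entirely in uppercase, every alphabetic token falls into the same category, which is why the feature stops being informative on such data.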

duesXMachine commented 6 years ago

Right now I am working on a dataset with uniform casing, i.e. all upper case. Should I just remove the casing embedding layer, or is there an option to disable it?

nreimers commented 6 years ago

If everything has the same casing, the layer will do no harm. The only issue arises if the training data has correct casing while your real data has wrong casing (e.g. training data correctly cased while test/real data is all lowercase / all uppercase).

Currently there is no easy option to disable it. You would need to update BiLSTM.py and remove the layer by hand from the network (or add an option for removing it to the network).

duesXMachine commented 6 years ago

Hey @nreimers, I was wondering whether there is a way to add start and end sentence tokens like <START> and <END> for the LSTMs. My input sequences actually contain multiple sentences. I can't separate them because they depend on each other. Is there a way to achieve this, or do I have to create a new network?

nreimers commented 6 years ago

LSTMs (and RNNs in general) often have trouble encoding long-range dependencies. So I'm not sure if the network (or any network) is able to figure out the dependencies between sentences. But you can try.

You can enrich your train/dev/test data by adding special tokens (like <START> and <END>) to your sentences and then train the network.
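Enriching the data with boundary tokens can be done in a preprocessing step before the files are written. The following sketch assumes each sub-sentence is a pair of token and label lists; `join_with_boundaries` is a hypothetical helper, and giving the boundary tokens the label "O" is an assumption, not something prescribed in this thread.

```python
START, END = "<START>", "<END>"


def join_with_boundaries(sub_sentences, outside_label="O"):
    """Merge dependent sub-sentences into one sequence, marking each with boundary tokens."""
    tokens, labels = [], []
    for sent_tokens, sent_labels in sub_sentences:
        tokens += [START] + list(sent_tokens) + [END]
        labels += [outside_label] + list(sent_labels) + [outside_label]
    return tokens, labels


parts = [
    (["Mark", "founded", "Facebook"], ["B-PER", "O", "B-ORG"]),
    (["He", "lives", "in", "California"], ["O", "O", "O", "B-LOC"]),
]
tokens, labels = join_with_boundaries(parts)
for token, label in zip(tokens, labels):
    print(token, label)
```

Unless the special tokens are also added to the embeddings file, they will probably have no pre-trained vector of their own.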

duesXMachine commented 6 years ago

Hey @nreimers, I tried reloading the model to re-train it but got an error:

--------- Epoch 1 -----------
Traceback (most recent call last):
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _do_call
    return fn(*args)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1003, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,186] = 37451 is not in [0, 37444)
         [[Node: Gather = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](token_emd_W/read, _recv_embedding_input_1_0)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Train_NER_German.py", line 90, in <module>
    model.evaluate(50)
  File "/home/prashantsharma2476/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py", line 391, in evaluate
    self.trainModel()
  File "/home/prashantsharma2476/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py", line 107, in trainModel
    self.model.train_on_batch(nnInput, labels)   
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/models.py", line 766, in train_on_batch
    class_weight=class_weight)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/engine/training.py", line 1320, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 1943, in __call__
    feed_dict=feed_dict)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 964, in _run
    feed_dict_string, options, run_metadata)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1014, in _do_run
    target_list, options, run_metadata)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1034, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,186] = 37451 is not in [0, 37444)
         [[Node: Gather = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](token_emd_W/read, _recv_embedding_input_1_0)]]

Caused by op 'Gather', defined at:
  File "Train_NER_German.py", line 84, in <module>
    model.loadModel(sys.argv[1])
  File "/home/prashantsharma2476/emnlp2017-bilstm-cnn-crf/neuralnets/BiLSTM.py", line 574, in loadModel
    model = keras.models.load_model(modelPath, custom_objects=create_custom_objects())
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/models.py", line 142, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/models.py", line 193, in model_from_config
    return layer_from_config(config, custom_objects=custom_objects)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/utils/layer_utils.py", line 42, in layer_from_config
    return layer_class.from_config(config['config'])
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/models.py", line 1079, in from_config
    merge_input = layer_from_config(merge_input_config)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/utils/layer_utils.py", line 42, in layer_from_config
    return layer_class.from_config(config['config'])
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/models.py", line 1086, in from_config
    model.add(layer)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/models.py", line 299, in add
    layer.create_input_layer(batch_input_shape, input_dtype)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/engine/topology.py", line 401, in create_input_layer
    self(x)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/engine/topology.py", line 572, in __call__
    self.add_inbound_node(inbound_layers, node_indices, tensor_indices)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/engine/topology.py", line 635, in add_inbound_node
    Node.create_node(self, inbound_layers, node_indices, tensor_indices)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/engine/topology.py", line 166, in create_node
    output_tensors = to_list(outbound_layer.call(input_tensors[0], mask=input_masks[0]))
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/layers/embeddings.py", line 128, in call
    out = K.gather(W, x)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 960, in gather
    return tf.gather(reference, indices)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1293, in gather
    validate_indices=validate_indices, name=name)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 759, in apply_op
    op_def=op_def)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2240, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1128, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): indices[0,186] = 37451 is not in [0, 37444)
         [[Node: Gather = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](token_emd_W/read, _recv_embedding_input_1_0)]]

Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7f2c054c9e10>>
Traceback (most recent call last):
  File "/home/prashantsharma2476/bilstm/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 581, in __del__
UnboundLocalError: local variable 'status' referenced before assignment

nreimers commented 6 years ago

Hi @duesXMachine, sadly I haven't seen this issue before. It looks like some internal problem with TensorFlow.

The code works best if Theano is used as the backend.

I'm currently working on converting the code to Keras 2 & TensorFlow; hopefully I can finish that soon.

duesXMachine commented 6 years ago

Hey @nreimers I re-trained the model using the following code:

if len(sys.argv) == 2:
    print('Loading Pre-Trained model::'+sys.argv[1])
    model = BiLSTM(params)
    model.loadModel(sys.argv[1])
    model.setMappings(embeddings, data['mappings'])
    model.setTrainDataset(data, labelKey)
    model.verboseBuild = True
    model.modelSavePath = "models/%s/%s/[DevScore]_[TestScore]_[Epoch].h5" % (datasetName, labelKey)  # Enable this line to save the model to the disk
    model.evaluate(50)
else:

    model = BiLSTM(params)
    model.setMappings(embeddings, data['mappings'])
    model.setTrainDataset(data, labelKey)
    model.verboseBuild = True
    model.modelSavePath = "models/%s/%s/[DevScore]_[TestScore]_[Epoch].h5" % (datasetName, labelKey) #Enable this line to save the model to the disk
    model.evaluate(50)

Now when I load the model for testing I got the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/deusxmachine/emailanalyzer/src/main/python/analytics/analyzer.py", line 100, in run
    self.predict()
  File "/home/deusxmachine/emailanalyzer/src/main/python/analytics/analyzer.py", line 25, in predict
    self.bilstm = Model(BILSTM_MODEL)
  File "/home/deusxmachine/emailanalyzer/src/main/python/analytics/bilstm/model.py", line 18, in __init__
    self.lstmModel.loadModel(modelPath)
  File "/home/deusxmachine/emailanalyzer/src/main/python/analytics/bilstm/neuralnets/BiLSTM.py", line 576, in loadModel
    model = keras.models.load_model(modelPath, custom_objects=create_custom_objects())
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/models.py", line 142, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/models.py", line 193, in model_from_config
    return layer_from_config(config, custom_objects=custom_objects)
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/utils/layer_utils.py", line 42, in layer_from_config
    return layer_class.from_config(config['config'])
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/models.py", line 1090, in from_config
    layer = get_or_create_layer(conf)
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/models.py", line 1069, in get_or_create_layer
    layer = layer_from_config(layer_data)
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/utils/layer_utils.py", line 35, in layer_from_config
    instantiate=False)
  File "/home/deusxmachine/email/lib/python3.6/site-packages/keras/utils/generic_utils.py", line 125, in get_from_module
    str(identifier))
ValueError: Invalid layer: ClassWrapper

When I load a new model (which has not been re-trained) I don't get any such error. Does it have something to do with the casing embedding, or did I re-train it wrong?

nreimers commented 6 years ago

Sadly I don't know why this error happens.

It looks like Keras is not able to read the information about the configuration of the network. Maybe the serialization is broken after loading and saving the model again?

I would recommend trying it with a softmax classifier. The CRF module is a custom layer; maybe the issue is that the cycle of storing, loading, and storing again does not work for custom layers.

nreimers commented 6 years ago

Hi @duesXMachine, I found the issue. The CRF layer is no longer named ChainCRF after you load and store the model. To fix it, you must update /neuralnets/keraslayers/ChainCRF.py

Change the return value of create_custom_objects() to:

return {'ChainCRF': ClassWrapper, 'ClassWrapper': ClassWrapper, 'loss': loss, 'sparse_loss': sparse_loss}

I just released a new (improved) version of this code that works with Keras 2.1.5 and Tensorflow 1.7.0. In that version, this bug is fixed.

duesXMachine commented 6 years ago

Hey @nreimers, so now I can retrain the model while using the CRF layer?

nreimers commented 6 years ago

@duesXMachine Yes, it should work

duesXMachine commented 6 years ago

@nreimers The training process is taking too much memory, around 14 GB, with charEmbeddings: None and maxCharLength: 10.

nreimers commented 6 years ago

How large is your uncompressed word embeddings file?

duesXMachine commented 6 years ago

It's 62.1 MB.

nreimers commented 6 years ago

Then I sadly don't know what the issue is; the model should be far smaller and should not require 14 GB of memory.

duesXMachine commented 6 years ago

Hey @nreimers, two quick questions:

  1. While training NER, if I give a word with a label that is not present in the word embeddings, what happens? Is it converted to the <UNKNOWN> token?

  2. While testing with a model file, if a new word is given to the model that is not present in the model's mapping dict, what happens, and how does the model predict the label for that word?

Thanks in advance.:)

nreimers commented 6 years ago

Hi @duesXMachine

Unknown words are mapped to the UNKNOWN token. For these unknown words, labels are still inferred; however, the system does not see the word itself.

If you have a sentence like: Mark is the founder of Facebook

And assuming it would not know Facebook, then the sentence is transformed to: Mark is the founder of UNKNOWN

It would still try to guess if UNKNOWN is a named entity or not.

When you enable character-based word representations, then a word representation for Facebook would be derived from the characters.
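The mapping of unknown words can be illustrated with a small lookup function. This is a sketch under assumptions: `token_to_index`, the key "UNKNOWN_TOKEN", and the lowercase fallback are illustrative choices and are not taken from the repository's preprocessing code.

```python
def token_to_index(token, word2idx, unknown="UNKNOWN_TOKEN"):
    """Look up a token, falling back to its lowercase form and then to the unknown index."""
    if token in word2idx:
        return word2idx[token]
    if token.lower() in word2idx:
        return word2idx[token.lower()]
    return word2idx[unknown]


word2idx = {"UNKNOWN_TOKEN": 0, "mark": 1, "is": 2, "the": 3, "founder": 4, "of": 5}
sentence = "Mark is the founder of Facebook".split()
print([token_to_index(tok, word2idx) for tok in sentence])
```

Here "Facebook" is not in the vocabulary, so it receives the same index as every other unknown word, and the tagger has to rely on the context (and on character-based representations, if enabled) to label it.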

duesXMachine commented 6 years ago

Thanks @nreimers. Can you tell me what CharEmbeddingsSize does?

nreimers commented 6 years ago

Hi @duesXMachine, for the char-based word representations, every character is mapped to an embedding (of size CharEmbeddingsSize), then an LSTM or CNN is used to derive an embedding for the token.
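The step before the character embedding lookup can be sketched in plain Python: every token becomes a fixed-length row of character indices. `chars_to_indices`, the toy alphabet, and the reserved indices 0 (padding) and 1 (unknown character) are assumptions for illustration. Each index would then be looked up in an embedding table with CharEmbeddingsSize columns, and the CNN or LSTM reduces the resulting vectors to one vector per token.

```python
def chars_to_indices(tokens, char2idx, max_char_len, pad=0, unknown=1):
    """Turn each token into a fixed-length list of character indices."""
    rows = []
    for token in tokens:
        ids = [char2idx.get(ch, unknown) for ch in token[:max_char_len]]
        rows.append(ids + [pad] * (max_char_len - len(ids)))
    return rows


alphabet = "abcdefghijklmnopqrstuvwxyz"
char2idx = {ch: i + 2 for i, ch in enumerate(alphabet)}  # 0 = padding, 1 = unknown

matrix = chars_to_indices(["mark", "is", "founder"], char2idx, max_char_len=5)
for row in matrix:
    print(row)
```

Stacking such matrices for a batch of sentences gives a three-dimensional input (sentences x tokens x characters), while the token input has only two dimensions.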

duesXMachine commented 6 years ago

Hey @nreimers I tried re-training a model with CharEmbedding : 'CNN' and got the following error with CNN input layer dimension.

ValueError: Error when checking input: expected char_input to have 3 dimensions, but got array with shape (2, 136)

nreimers commented 6 years ago

Hi @duesXMachine, there was a bug: the characters of the words were not transformed to vectors when the model was loaded instead of built from scratch.

I pushed a bugfix to the BiLSTM.py file.

I also updated the dependencies so that it works with Keras 2.2.0 and Tensorflow 1.8.0.

There was a change in Keras/TensorFlow, so that CNN-based character-word representations no longer support masking. Hence, old models that were trained with CharEmbedding: 'CNN' cannot be loaded with the latest version.

ZhaofengWu commented 6 years ago

I cloned a new version of this repo, but @duesXMachine's problem (of the error when checking char_input) is still occurring. To be more specific, I cloned this new version, trained on CoNLL-2003 with CNN char embedding using TensorFlow backend (by default), and ran a modified RunModel_CoNLL_Format.py:

inputColumns = {0: "tokens", 1: 'NER_BIO'}

# :: Prepare the input ::
sentences = readCoNLL(inputPath, inputColumns)
addCharInformation(sentences)
addCasingInformation(sentences)

# :: Load the model ::
lstmModel = BiLSTM.loadModel(modelPath)

dataMatrix = createMatrices(sentences, lstmModel.mappings, True)

print(lstmModel.computeF1(list(lstmModel.models.keys())[0], dataMatrix))

The last line will produce that exact same error.

However, if I download the pre-trained CoNLL-2003 English model, there is no such problem.

nreimers commented 6 years ago

Okay, that is strange. I tested it with the German models, which use charEmbeddings: CNN, and they work. I tested it with RunModel.py; it might be that there is a slight difference to the CoNLL.py file.

I will check it next week when I'm back in the office.

ZhaofengWu commented 6 years ago

Thanks! Note that there is no problem if I do the regular training + RunModel.py with the trained model. The crash occurs only when I use the modified code snippet shown above. The only changed lines are the first and the last line. So I suspect that the problem is probably not related to the language, but rather lies in the computeF1() function.

duesXMachine commented 6 years ago

Hey @nreimers, sometimes while testing the model I get this exception:

Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7f24f4cf4c88>>
Traceback (most recent call last):
  File "/home/deusxmachine/venvs/emailanalyzer/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 712, in __del__
  File "/home/deusxmachine/venvs/emailanalyzer/lib/python3.5/site-packages/tensorflow/python/framework/c_api_util.py", line 31, in __init__
TypeError: 'NoneType' object is not callable

What does that mean?

nreimers commented 6 years ago

It looks like tensorflow session is not initialized. But I am not sure why this is the case. Maybe some issue with keras or tensorflow.

@zhaofengwu I will check that next week and come back to you

ZhaofengWu commented 6 years ago

By the way, if I add lstmModel.tagSentences(dataMatrix) above my final line pasted above (which is print(lstmModel.computeF1(list(lstmModel.models.keys())[0], dataMatrix))), it will work fine. Hope it could help you find the problem.