senarvi / theanolm

TheanoLM is a recurrent neural network language modeling tool implemented using Theano
Apache License 2.0

Scoring error with --exclude-unk option #40

Closed: sameerkhurana10 closed this issue 5 years ago

sameerkhurana10 commented 6 years ago

Hi,

When I run the command theanolm score model.h5 test-data.txt --output perplexity --exclude-unk

I get the following error:

Mapped name None to device cuda: GeForce GTX TITAN Black (0000:03:00.0)
2018-04-08 15:40:02,166 exception_handler: An unexpected TypeError exception occurred: ufunc 'isinf' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Traceback will be written to debug log (enable with --log-level debug).
srun: error: sls-titan-0: task 0: Exited with exit code 2

Without the --exclude-unk option, it gives me the perplexity:

srun -p gpu --gres=gpu:1 theanolm score exp/hsoftmax_transcript/nnlm.h5 data/rnnlm_data_all/test.dat --output perplexity              
2018-04-08 15:41:51,500 get_default_device: Context None device="GeForce GTX TITAN Black" ID="0000:03:00.0"
2018-04-08 15:41:51,503 from_file: Reading vocabulary from network state.
2018-04-08 15:41:53,979 from_file: Number of words in vocabulary: 205719
2018-04-08 15:41:53,980 from_file: Number of words in shortlist: 205718
2018-04-08 15:41:53,980 from_file: Number of word classes: 205718
2018-04-08 15:41:53,980 from_file: Building neural network.
2018-04-08 15:42:00,094 from_file: Restoring neural network state.
2018-04-08 15:42:00,326 score: Building text scorer.
2018-04-08 15:42:22,036 score: Scoring text.
Number of sentences: 5002
Number of words: 70173
Number of tokens: 70173
Number of predicted probabilities: 65171
Number of excluded (OOV) words: 0
Number of zero probabilities: 0
Cross entropy (base e): 8.218601900051507
Perplexity: 3709.3127655685576
/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX TITAN Black (0000:03:00.0)

Any comments?
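As a sanity check on the output above, the two summary figures are related in the usual way: perplexity is the exponential of the base-e cross entropy. A minimal sketch, assuming that is how the reported value is derived (the variable names are illustrative and are not taken from TheanoLM):

```python
import math

# Values copied from the scoring output above.
cross_entropy = 8.218601900051507      # base e
reported_perplexity = 3709.3127655685576

# Perplexity is exp(cross entropy) when the cross entropy is in nats.
perplexity = math.exp(cross_entropy)
print(perplexity)
print(math.isclose(perplexity, reported_perplexity, rel_tol=1e-6))
```

If the two values agree, the run without --exclude-unk is internally consistent, which suggests the problem is confined to the code path that handles excluded words.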

senarvi commented 6 years ago

Hi, I haven't seen this before. It's strange that it happens when you exclude UNKs from the perplexity computation. Could it mean that there are only UNKs in some sequence, resulting in sequence length 0 and an infinite value? To debug this further, you would need to enable debug logging (--log-level debug) to see where the exception occurs.

sameerkhurana10 commented 6 years ago

It is possible that a test sentence consists of only UNKs.

Okay. Does the traceback below tell us anything?

Using cuDNN version 6021 on context None
Mapped name None to device cuda: GeForce GTX TITAN Black (0000:03:00.0)
2018-04-08 17:46:49,062 exception_handler: An unexpected TypeError exception occurred: ufunc 'isinf' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Traceback will be written to debug log (enable with --log-level debug).
2018-04-08 17:46:49,062 exception_handler: Traceback:
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/bin/theanolm", line 147, in <module>
    main()
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/bin/theanolm", line 88, in main
    args.command_function(args)
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/theanolm/commands/score.py", line 122, in score
    args.output_file, args.log_base, args.subwords, False)
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/theanolm/commands/score.py", line 209, in _score_text
    num_zeroprobs += sum(numpy.isneginf(lp) for lp in merged_logprobs)
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/theanolm/commands/score.py", line 209, in <genexpr>
    num_zeroprobs += sum(numpy.isneginf(lp) for lp in merged_logprobs)
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/numpy/lib/ufunclike.py", line 34, in func
    return f(x, out=out, **kwargs)
2018-04-08 17:46:49,089 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/numpy/lib/ufunclike.py", line 202, in isneginf
    return nx.logical_and(nx.isinf(x), nx.signbit(x), out)
srun: error: sls-titan-0: task 0: Exited with exit code 2

senarvi commented 6 years ago

The exception is thrown by numpy.isneginf(lp) in commands/score.py, line 209. Apparently lp, which should be a floating-point log probability, is something else. Can you debug this by printing the value? For example, before the line in question, add this:

# Print each log probability before testing it, so the offending value is visible.
for lp in merged_logprobs:
    print(lp)
    print(numpy.isneginf(lp))

and see what the last printed value is. Alternatively, if you give me the data and instructions on how to reproduce the error, I can debug it myself.
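A variant of this debugging loop that does not stop at the first failure can be sketched as follows. It is only an illustration: report_bad_logprobs and the sample list are hypothetical, and math.isinf from the standard library stands in for numpy.isneginf so the snippet has no dependencies (it also raises TypeError when given a value that is not a number).

```python
import math

def report_bad_logprobs(logprobs):
    """Return (index, value) pairs for entries that math.isinf() cannot handle."""
    bad = []
    for index, lp in enumerate(logprobs):
        try:
            math.isinf(lp)
        except TypeError:
            # The entry is not a number, so record where it occurred.
            bad.append((index, lp))
    return bad

# Hypothetical data standing in for merged_logprobs.
sample = [-1.5, -0.25, None, -3.0]
print(report_bad_logprobs(sample))
```

Collecting the positions instead of printing every value keeps the output short when the test set is large.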

sameerkhurana10 commented 6 years ago

-8.192791
False
-2.4449062
False
-6.7236123
False
-6.914638
False
-7.526195
False
None
2018-04-08 20:15:46,030 exception_handler: An unexpected TypeError exception occurred: ufunc 'isinf' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Traceback will be written to debug log (enable with --log-level debug).
2018-04-08 20:15:46,030 exception_handler: Traceback:
2018-04-08 20:15:46,039 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/bin/theanolm", line 147, in <module>
    main()
2018-04-08 20:15:46,039 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/bin/theanolm", line 88, in main
    args.command_function(args)
2018-04-08 20:15:46,040 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/theanolm/commands/score.py", line 122, in score
    args.output_file, args.log_base, args.subwords, False)
2018-04-08 20:15:46,040 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/theanolm/commands/score.py", line 210, in _score_text
    print(numpy.isneginf(lp))
2018-04-08 20:15:46,040 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/numpy/lib/ufunclike.py", line 34, in func
    return f(x, out=out, **kwargs)
2018-04-08 20:15:46,040 exception_handler: File "/data/sls/u/sameerk/anaconda3/envs/theano-lm/lib/python3.5/site-packages/numpy/lib/ufunclike.py", line 202, in isneginf
    return nx.logical_and(nx.isinf(x), nx.signbit(x), out)
srun: error: sls-titan-0: task 0: Exited with exit code 2

sameerkhurana10 commented 6 years ago

You can find the necessary files here:

http://people.csail.mit.edu/sameerk/for_theanolm/

Check out the README.txt at that location for details. Let me know if you need more info.

Thanks for the help.

senarvi commented 6 years ago

Thanks! I think you have to add a check here:

https://github.com/senarvi/theanolm/blob/master/theanolm/commands/score.py#L209

that the logprob is not None (i.e. the word is an UNK) when computing the number of zero probabilities. Try this:

num_zeroprobs += sum((lp is not None) and numpy.isneginf(lp) for lp in merged_logprobs)

sameerkhurana10 commented 6 years ago

Working now, thanks.
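The behavior of the suggested one-line fix can be checked in isolation with a small sketch. This is not the TheanoLM code: count_zero_probs is a hypothetical helper, the sample list is made up for illustration, and math.isinf combined with a sign test stands in for numpy.isneginf so that the snippet runs with the standard library alone.

```python
import math

def count_zero_probs(logprobs):
    """Count entries equal to -inf, skipping None placeholders for excluded words."""
    return sum(
        # The None check short-circuits, so math.isinf() never sees None.
        (lp is not None) and math.isinf(lp) and lp < 0
        for lp in logprobs
    )

# Hypothetical data: two ordinary log probabilities, one excluded word (None),
# and one zero probability (-inf).
sample = [-8.192791, None, float("-inf"), -2.4449062]
print(count_zero_probs(sample))  # prints 1
```

The guard relies on short-circuit evaluation of `and`: when lp is None the expression is already False, so the numeric test is skipped and the entry simply contributes nothing to the count.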

senarvi commented 6 years ago

Do you want to create a pull request so that I can merge the bug fix?

senarvi commented 6 years ago

I committed the change to the develop branch. Can you pull that branch and check that it works for you?