jayurbain / mitlm

Automatically exported from code.google.com/p/mitlm
BSD 3-Clause "New" or "Revised" License

Warning during interpolation: "Search direction is not a descent direction" #22

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hi Paul,

I've experienced some problems with the linear interpolation process and I am 
not sure how to solve the problem. When I try to interpolate a 3-component LM I 
get the warning:

--------
THE SEARCH DIRECTION IS NOT A DESCENT DIRECTION

IFLAG= -1
LINE SEARCH FAILED. SEE DOCUMENTATION OF ROUTINE MCSRCH
ERROR RETURN OF LINE SEARCH: INFO=  0
POSSIBLE CAUSES: FUNCTION OR GRADIENT ARE INCORRECT
OR INCORRECT TOLERANCES
--------

I first created three separate language models using

estimate-ngram -order 3 -v wlist -unk true -t train1.txt -opt-perp opt1.txt -wl arpa_a.gz
estimate-ngram -order 3 -v wlist -unk true -t train2.txt -opt-perp opt2.txt -wl arpa_b.gz
estimate-ngram -order 3 -v wlist -unk true -t train3.txt -opt-perp opt3.txt -wl arpa_c.gz

then unpacked the components

gzip -d -f arpa_a.gz
gzip -d -f arpa_b.gz
gzip -d -f arpa_c.gz

then interpolated using

interpolate-ngram -l "arpa_a,arpa_b,arpa_c" -opt-perp int-opt.txt -wl arpa_full

The process does yield a language model that I can use afterwards, but I am 
not sure what the warning/error means or what I have to do to fix it.

My system is Linux 2.6.31.12-0.2-desktop x86_64 with 8 GB RAM and a quad-core 
AMD 2360SE.

Thanks in advance!

Original issue reported on code.google.com by sebastia...@googlemail.com on 25 Oct 2010 at 12:29

GoogleCodeExporter commented 8 years ago
Addendum:

I get the same message:

--------
 IFLAG= -1 
 LINE SEARCH FAILED. SEE DOCUMENTATION OF ROUTINE MCSRCH
 ERROR RETURN OF LINE SEARCH: INFO=  3                  
 POSSIBLE CAUSES: FUNCTION OR GRADIENT ARE INCORRECT    
 OR INCORRECT TOLERANCES 
--------

when I use different data sets and count merging for interpolation. I used the 
following commands:

estimate-ngram -order 3 -v wlist -unk true -t train.txt -opt-perp opt.txt -wl arpa_a.gz -wec arpa_a.effcounts
...

gzip -d -f arpa_a.gz
...

interpolate-ngram -l "arpa_a, arpa_b, arpa_c" -interpolation CM -opt-perp ip.txt -wl arpa_full.gz

Original comment by sebastia...@googlemail.com on 25 Oct 2010 at 2:55

GoogleCodeExporter commented 8 years ago
These messages come from the optimizer's line search (the MCSRCH routine named 
in the output). They suggest that the search surface may be so flat that 
changes in the interpolation weights produce only random fluctuations in the 
objective, caused by limited numerical precision.  As you pointed out, the 
resulting weights are still usable and yield a valid LM, so it should be okay 
to ignore these warnings.
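
As an illustration of the condition behind the first message: a line search 
expects the search direction d to point downhill, i.e. the dot product of the 
gradient with d must be negative. The following is a minimal Python sketch, not 
MITLM code; the function names are hypothetical and the rounding-scale noise is 
injected by hand to mimic what might happen on a very flat surface.

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def is_descent_direction(gradient, direction):
    """Return True if moving along `direction` decreases the objective
    to first order, i.e. gradient . direction < 0."""
    return dot(gradient, direction) < 0.0


# Steep surface: the negative gradient is clearly a descent direction.
g_steep = [2.0, -1.0]
d_steep = [-x for x in g_steep]
steep_ok = is_descent_direction(g_steep, d_steep)

# Flat surface (assumed values): the true gradient is tiny, and the
# direction is built from a gradient estimate polluted by noise of
# comparable size, so the direction can end up pointing uphill.
g_true = [1e-17, -1e-17]
noise = [-3e-17, 3e-17]
g_noisy = [g + n for g, n in zip(g_true, noise)]
d_flat = [-x for x in g_noisy]
flat_ok = is_descent_direction(g_true, d_flat)

print("steep surface, descent direction:", steep_ok)
print("flat surface, descent direction:", flat_ok)
```

In the second case the check fails even though nothing is wrong with the model 
itself, which is consistent with the weights still being usable.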

In this particular case, are the component LMs trained on small text corpora, 
or do they have little vocabulary overlap?  Is the tuning set particularly 
small?  If so, this may explain why the interpolated model is not sensitive to 
the interpolation parameters.  To test this hypothesis, you can try specifying 
the -params argument in interpolate-ngram and compare the perplexity of the 
tuning data under different interpolation weights.

Original comment by bojune...@gmail.com on 27 Oct 2010 at 3:25
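
The flat-surface hypothesis above can be sketched outside MITLM. The Python 
below is only an illustration under stated assumptions: the per-token 
probabilities are made up, mixture_perplexity is a hypothetical helper, and no 
MITLM tool or file format is involved. It computes the perplexity of a linearly 
interpolated model for several weight vectors; when the components assign the 
same probabilities to the tuning tokens, the perplexity does not depend on the 
weights at all.

```python
import math


def mixture_perplexity(component_probs, weights):
    """Perplexity of a linear mixture on a tuning set.

    component_probs[i][k] is the probability that component k assigns
    to tuning token i; weights[k] is the interpolation weight of
    component k (weights are assumed to sum to 1).
    """
    total_log_prob = 0.0
    for probs in component_probs:
        mixed = sum(w * p for w, p in zip(weights, probs))
        total_log_prob += math.log(mixed)
    return math.exp(-total_log_prob / len(component_probs))


# Made-up tuning data: three tokens, three components that agree exactly.
probs = [
    [0.10, 0.10, 0.10],
    [0.02, 0.02, 0.02],
    [0.30, 0.30, 0.30],
]

# A few candidate weight vectors to compare.
grid = [
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
    (1 / 3, 1 / 3, 1 / 3),
    (0.6, 0.3, 0.1),
]

perplexities = [mixture_perplexity(probs, w) for w in grid]
spread = max(perplexities) - min(perplexities)

for w, ppl in zip(grid, perplexities):
    print(w, ppl)
print("spread:", spread)
```

If the perplexities measured on the real tuning set are similarly close across 
very different weights, the optimizer has little signal to work with and the 
line search warning is unsurprising.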