When I run this test (on an XP machine), it works fine. Which lexicon are you
using? Have you modified it?
Original comment by ehud.rei...@gmail.com
on 9 May 2011 at 4:35
We did remove some incomplete categories from the lexicon. But when I run this
query on the lexicon:
select * from lex_record where base like 'man'
I get 3 rows:
MAN noun
man noun
man verb
So I believe it is picking up the first entry, which is an acronym. Is there a
relationship in the lexicon that needs to be preserved and that dictates the
order of the output?
Original comment by manish.s...@gmail.com
on 22 May 2011 at 6:57
Adding to the above, we are using MS SQL server.
Original comment by manish.s...@gmail.com
on 22 May 2011 at 6:58
I think the problem is due to the fact that HSQL (default DB for NIH Specialist
Lexicon) by default does case-sensitive matching, while most other DBs by
default do case-insensitive matching.
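To make the difference concrete, here is a minimal Java sketch (not the committed change, which is not shown in this thread). It assumes a hypothetical LexRow class standing in for one row of lex_record and uses the three rows reported above. A case-insensitive DB returns all three rows for 'man'; filtering on exact case afterwards keeps only the two lowercase entries.

```java
import java.util.ArrayList;
import java.util.List;

class CaseMatchSketch {
    // Hypothetical stand-in for one row of lex_record: base form plus category.
    static final class LexRow {
        final String base;
        final String category;

        LexRow(String base, String category) {
            this.base = base;
            this.category = category;
        }
    }

    // The three rows reported in this thread for base 'man'.
    static List<LexRow> rowsForMan() {
        List<LexRow> rows = new ArrayList<>();
        rows.add(new LexRow("MAN", "noun"));
        rows.add(new LexRow("man", "noun"));
        rows.add(new LexRow("man", "verb"));
        return rows;
    }

    // Keep only the rows whose base matches the requested form exactly,
    // including case. This mimics what a case-sensitive DB would return.
    static List<LexRow> exactCaseOnly(List<LexRow> rows, String base) {
        List<LexRow> kept = new ArrayList<>();
        for (LexRow row : rows) {
            if (row.base.equals(base)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        for (LexRow row : exactCaseOnly(rowsForMan(), "man")) {
            System.out.println(row.base + " " + row.category);
        }
    }
}
```

With this filter in place, the acronym row "MAN" is no longer a candidate when the caller asks for "man".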
I have committed a change to the lexicon class (under the source tab) which
should hopefully fix this, but I can't test it since I use HSQL. Could you
test this (you'll need to download and compile the source) and let me know if
it solves the problem?
Another alternative would be to change the design of the lexicon table in MS
SQL. I don't know MS SQL well, but most DBs allow case sensitivity or
insensitivity to be specified for a column as part of the design of the table.
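As a rough sketch of a related option: instead of altering the column, SQL Server also lets a query apply a collation to an expression with a COLLATE clause. The Java below builds such a lookup against the lex_record table and base column mentioned earlier in this thread. The class and method names are hypothetical, the choice of the Latin1_General_CS_AS collation is an assumption, and none of this has been run against the actual lexicon DB.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class CollateQuerySketch {
    // Assumed collation: a case-sensitive, accent-sensitive SQL Server collation.
    static final String CASE_SENSITIVE_COLLATION = "Latin1_General_CS_AS";

    // Builds a lookup that compares the base column case-sensitively,
    // whatever the column's own default collation is.
    static String caseSensitiveLookupSql() {
        return "SELECT base FROM lex_record WHERE base COLLATE "
                + CASE_SENSITIVE_COLLATION + " = ?";
    }

    // Runs the lookup on an already opened connection (hypothetical helper).
    static List<String> lookupBases(Connection conn, String base) throws SQLException {
        List<String> found = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(caseSensitiveLookupSql())) {
            stmt.setString(1, base);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    found.add(rs.getString("base"));
                }
            }
        }
        return found;
    }
}
```

Changing the column or DB collation instead would avoid touching every query, at the cost of a schema change.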
Original comment by ehud.rei...@gmail.com
on 25 May 2011 at 8:21
Thanks for the change. It solved the issue with NounPhraseTest.java. However,
in testStringRecognition in ClauseTest.java I am getting "my cat is SAD"
instead of "my cat is sad". Stepping through the code in a debugger, it seems
"sad" is being recognized as a noun here. In the NIH DB, "sad" is an adjective
and "SAD" is a noun, so the lookup picks "SAD" the noun. I am referring to
the function getWordsFromLexResult(), which fetches the lexical records and
compares them against the category.
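The behaviour can be reproduced outside simplenlg with a small Java sketch. This is not the real getWordsFromLexResult(); LexRow and both pick methods are hypothetical, and the two rows are the ones described above. Matching the base without regard to case and then filtering on category selects "SAD", while requiring an exact-case match finds no noun at all for "sad".

```java
import java.util.ArrayList;
import java.util.List;

class CategoryPickSketch {
    // Hypothetical stand-in for a lexical record: base form plus category.
    static final class LexRow {
        final String base;
        final String category;

        LexRow(String base, String category) {
            this.base = base;
            this.category = category;
        }
    }

    // The two records described in this comment.
    static List<LexRow> rowsForSad() {
        List<LexRow> rows = new ArrayList<>();
        rows.add(new LexRow("SAD", "noun"));
        rows.add(new LexRow("sad", "adj"));
        return rows;
    }

    // Mimics a case-insensitive DB: the base matches regardless of case,
    // so only the category decides which record is returned.
    static String pickIgnoringCase(List<LexRow> rows, String base, String category) {
        for (LexRow row : rows) {
            if (row.base.equalsIgnoreCase(base) && row.category.equals(category)) {
                return row.base;
            }
        }
        return null;
    }

    // Requires the base to match exactly before the category is checked.
    static String pickExactCase(List<LexRow> rows, String base, String category) {
        for (LexRow row : rows) {
            if (row.base.equals(base) && row.category.equals(category)) {
                return row.base;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<LexRow> rows = rowsForSad();
        System.out.println(pickIgnoringCase(rows, "sad", "noun"));
        System.out.println(pickExactCase(rows, "sad", "noun"));
        System.out.println(pickExactCase(rows, "sad", "adj"));
    }
}
```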
Original comment by manish.s...@gmail.com
on 6 Jun 2011 at 3:42
Since we are not testing simplenlg with the lexicon held in an MS SQL DB, I
suspect this kind of issue will keep arising. I've discussed this with the
Specialist Lexicon people; their advice (which I agree with) is to set up MS
SQL (at the DB or table level) to do case-sensitive matching for the lexicon.
Original comment by ehud.rei...@gmail.com
on 8 Jun 2011 at 2:32
Original issue reported on code.google.com by
manish.s...@gmail.com
on 6 May 2011 at 12:40