Open ebarbot opened 3 years ago
You're right, it is completely wrong, but there is not enough information to debug. Could you please try the command-line version (analyzeText) and paste (or attach) here both the text and all the console output?
As you can see, the results are much better under Linux:
# sent_id = 1
# text = February 23 - A revolt against the government of King Joseph I of Portugal takes place in the city of Oporto.
1 February February PROPN _ NUMBER=SING _ _ _ NE=DateTime.DATE|Pos=1|Len=8
2 23 23 NUM _ _ _ _ _ NE=DateTime.DATE|Pos=10|Len=2
3 - - COLON _ _ 3 Dummy _ Pos=13|Len=1
4 A a DET _ _ 4 det _ Pos=15|Len=1
5 revolt revolt NOUN _ NUMBER=SING 13 SUJ_V _ Pos=17|Len=6
6 against against ADP _ _ 7 PREPSUB _ Pos=24|Len=7
7 the the DET _ _ 7 det _ Pos=32|Len=3
8 government government NOUN _ NUMBER=SING 4 COMPDUNOM _ Pos=36|Len=10
9 of of ADP _ _ 10 PREPSUB _ Pos=47|Len=2
10 King king NOUN _ NUMBER=SING 10 ADJPRENSUB _ Pos=50|Len=4
11 Joseph Joseph PROPN _ NUMBER=SING _ _ _ NE=Person.PERSON|Pos=55|Len=6
12 I I PRON _ _ _ _ _ NE=Person.PERSON|Pos=62|Len=1
13-14 joseph _ _ _ _ _ _ _ _
13 of of ADP _ _ 12 PREPSUB _ Pos=64|Len=2
14 Portugal Portugal PROPN _ NUMBER=SING _ _ _ NE=Location.LOCATION|Pos=67|Len=8
15 takes take VERB _ _ 0 _ _ Pos=76|Len=5
16 place place NOUN _ NUMBER=SING 13 COD_V _ Pos=82|Len=5
17 in in ADP _ _ 17 PREPSUB _ Pos=88|Len=2
18 the the DET _ _ 17 det _ Pos=91|Len=3
19 city city NOUN _ NUMBER=SING 14 COMPDUNOM _ Pos=95|Len=4
20 of of ADP _ _ 19 PREPSUB _ Pos=100|Len=2
21 Oporto Oporto PROPN _ NUMBER=SING _ _ _ NE=Location.LOCATION|Pos=103|Len=6
22 . . SENT _ _ 0 _ _ Pos=109|Len=1
We need more information to understand what happens under Windows.
This is what I get. I don't know whether I am supposed to set something to print more logs.
H:\test_lima_windows>analyzeText -l eng joseph_I.txt
Analyzing 1/1 (100.00%) 'joseph_I.txt'
# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# sentid = 1
# text = February 23 - A revolt against the government of King Joseph I of Portugal takes place in the city of Oporto.
1 February February PROPN NUMBER=SING NE=DateTime.DATE|Pos=1|Len=8
2 23 23 NUM NE=DateTime.DATE|Pos=10|Len=2
3 - - COMMA 3 Dummy Pos=13|Len=1
4 A A PROPN NUMBER=SING 4 ADJPRENSUB Pos=15|Len=1
5 revolt revolt NOUN NUMBER=SING 5 ADJPRENSUB Pos=17|Len=6
6 against against NOUN NUMBER=SING 6 ADJPRENSUB Pos=24|Len=7
7 the the NOUN NUMBER=SING 7 ADJPRENSUB Pos=32|Len=3
8 government government NOUN NUMBER=SING 8 ADJPRENSUB Pos=36|Len=10
9 of of NOUN NUMBER=SING 10 ADJPRENSUB Pos=47|Len=2
10 King King PROPN NUMBER=SING 10 SUBSUBJUX Pos=50|Len=4
11 Joseph Joseph PROPN NUMBER=SING NE=Person.PERSON|Pos=55|Len=6
12 I i NUM NUMBER=SING NE=Person.PERSON|Pos=62|Len=1
13 of of NOUN NUMBER=SING 12 ADJPRENSUB Pos=64|Len=2
14 Portugal Portugal PROPN NUMBER=SING NE=Location.LOCATION|Pos=67|Len=8
15 takes takes NOUN NUMBER=SING 14 ADJPRENSUB Pos=76|Len=5
16 place place NOUN NUMBER=SING 15 ADJPRENSUB Pos=82|Len=5
17 in in NOUN NUMBER=SING 16 ADJPRENSUB Pos=88|Len=2
18 the the NOUN NUMBER=SING 17 ADJPRENSUB Pos=91|Len=3
19 city city NOUN NUMBER=SING 18 ADJPRENSUB Pos=95|Len=4
20 of of NOUN NUMBER=SING 19 ADJPRENSUB Pos=100|Len=2
21 Oporto Oporto PROPN NUMBER=SING NE=Location.LOCATION|Pos=103|Len=6
22 . . SENT 0 _ Pos=109|Len=1
@victorbocharov, you are the last developer to have ensured a successful Windows build. Have you noticed problems like this?
No, I haven't. Moreover, I don't have Windows computers, so I won't be able to reproduce this. I can only suggest a few guesses:
It looks like the English dictionary isn't used, or it is empty.
@kleag: How can we check this?
@ebarbot: Is the pipeline "main" unchanged?
@ebarbot: How old is the version of LIMA?
I downloaded version 3.0.0.20210912222206-0c3404de, and if I explicitly write analyzeText -l eng -p main joseph_I.txt, I get the same result (everything is a noun).
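In case it helps when comparing runs, here is a minimal sketch that tallies the UPOS column of analyzeText output, so the "almost everything is NOUN" symptom shows up as a number. It is not part of LIMA: the function name `upos_counts` is hypothetical, and it assumes the output is whitespace-separated with UPOS as the fourth field, as in the outputs pasted in this thread.

```python
from collections import Counter


def upos_counts(conllu_text):
    """Count the UPOS column (4th field) over token lines of CoNLL-U-like text."""
    counts = Counter()
    for line in conllu_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and lines too short to carry a UPOS field.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        # Keep only plain integer IDs; ranges such as "13-14" and
        # progress lines such as "Analyzing 1/1 ..." are skipped.
        if not fields[0].isdigit():
            continue
        counts[fields[3]] += 1
    return counts


# A few lines copied from the Windows output above.
sample = """\
# sent_id = 1
4 A A PROPN NUMBER=SING 4 ADJPRENSUB Pos=15|Len=1
5 revolt revolt NOUN NUMBER=SING 5 ADJPRENSUB Pos=17|Len=6
6 against against NOUN NUMBER=SING 6 ADJPRENSUB Pos=24|Len=7
7 the the NOUN NUMBER=SING 7 ADJPRENSUB Pos=32|Len=3
"""
print(upos_counts(sample).most_common())
```

Running it on the full Linux and Windows outputs and comparing the two tallies should make the difference obvious without reading every line.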