Open ftesser opened 11 years ago
Check some strange behavior of the Italian POS tagger.
The following two sentences:
S#1: quanti sono?
S#2: Quanti sono?
differ only in the capitalization of the first character, yet they give completely different results:
S#1:
<?xml version="1.0" encoding="UTF-8"?>
<maryxml xmlns="http://mary.dfki.de/2002/MaryXML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="0.5" xml:lang="it">
  <p>
    <s>
      <phrase>
        <t accent="H+L*" g2p_method="lexicon" ph="' k w a1 n - t i" pos="B" pos_full="B"> quanti </t>
        <t accent="H+L*" g2p_method="lexicon" ph="' s O1 - n o" pos="V" pos_full="Vip3p"> sono </t>
        <t pos="FS" pos_full="FS"> ? </t>
        <boundary breakindex="5" tone="L-H%"/>
      </phrase>
    </s>
  </p>
</maryxml>
S#2:
<?xml version="1.0" encoding="UTF-8"?>
<maryxml xmlns="http://mary.dfki.de/2002/MaryXML" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="0.5" xml:lang="it">
  <p>
    <s>
      <phrase>
        <t g2p_method="lexicon" ph="' k w a1 n - t i" pos="DQ" pos_full="DQmp"> Quanti </t>
        <t g2p_method="lexicon" ph="' s O1 - n o" pos="VA" pos_full="VAip3p"> sono </t>
        <t pos="FS" pos_full="FS"> ? </t>
        <boundary breakindex="5" tone="L-H%"/>
      </phrase>
    </s>
  </p>
</maryxml>
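To make the differences easier to spot, here is a minimal sketch that parses both MaryXML outputs and lists, token by token, which of the `pos`, `pos_full` and `accent` attributes differ. It uses only the Python standard library; the names `tokens` and `diff_tokens` are hypothetical helpers written for this report, not part of MaryTTS, and the two XML strings are the outputs above with the XML declaration and the unused `xmlns:xsi` attribute left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by the MaryXML outputs in this report.
NS = {"m": "http://mary.dfki.de/2002/MaryXML"}
# Attributes compared per token (an assumption: these are the ones that differ here).
ATTRS = ("pos", "pos_full", "accent")

S1 = """<maryxml xmlns="http://mary.dfki.de/2002/MaryXML" version="0.5" xml:lang="it">
<p><s><phrase>
<t accent="H+L*" g2p_method="lexicon" ph="' k w a1 n - t i" pos="B" pos_full="B"> quanti </t>
<t accent="H+L*" g2p_method="lexicon" ph="' s O1 - n o" pos="V" pos_full="Vip3p"> sono </t>
<t pos="FS" pos_full="FS"> ? </t>
<boundary breakindex="5" tone="L-H%"/>
</phrase></s></p></maryxml>"""

S2 = """<maryxml xmlns="http://mary.dfki.de/2002/MaryXML" version="0.5" xml:lang="it">
<p><s><phrase>
<t g2p_method="lexicon" ph="' k w a1 n - t i" pos="DQ" pos_full="DQmp"> Quanti </t>
<t g2p_method="lexicon" ph="' s O1 - n o" pos="VA" pos_full="VAip3p"> sono </t>
<t pos="FS" pos_full="FS"> ? </t>
<boundary breakindex="5" tone="L-H%"/>
</phrase></s></p></maryxml>"""


def tokens(maryxml):
    """Return a list of (word, {attribute: value}) for every <t> element."""
    root = ET.fromstring(maryxml)
    return [
        ((t.text or "").strip(), {a: t.get(a) for a in ATTRS})
        for t in root.iterfind(".//m:t", NS)
    ]


def diff_tokens(xml_a, xml_b):
    """Pair tokens by position and report the attributes whose values differ."""
    diffs = []
    for (word_a, attrs_a), (word_b, attrs_b) in zip(tokens(xml_a), tokens(xml_b)):
        changed = {
            k: (attrs_a[k], attrs_b[k]) for k in ATTRS if attrs_a[k] != attrs_b[k]
        }
        if changed:
            diffs.append((word_a, word_b, changed))
    return diffs


if __name__ == "__main__":
    for word_a, word_b, changed in diff_tokens(S1, S2):
        print(word_a, "vs", word_b, changed)
```

Besides the POS tags, the comparison also shows that the `accent` attribute is present on both words in S#1 and missing in S#2.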