facebook / duckling

Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

[FR] wrong numeral parsing for «mille cent» (1100) #619

Open nicolaspanel opened 3 years ago

nicolaspanel commented 3 years ago

I noticed what I suspect is incorrect behavior in the FR Numeral parser, and I thought you might be interested:

*Duckling.Debug> debug (makeLocale FR Nothing) "mille cent" [This Numeral]
powers of tens (mille)
-- regex (mille)
powers of tens (cent)
-- regex (cent)
[Entity {dim = "number", body = "mille", value = RVal Numeral (NumeralValue {vValue = 1000.0}), start = 0, end = 5, latent = False, enode = Node {nodeRange = Range 0 5, token = Token Numeral (NumeralData {value = 1000.0, grain = Just 3, multipliable = True, okForAnyTime = True}), children = [Node {nodeRange = Range 0 5, token = Token RegexMatch (GroupMatch ["mille"]), children = [], rule = Nothing}], rule = Just "powers of tens"}},Entity {dim = "number", body = "cent", value = RVal Numeral (NumeralValue {vValue = 100.0}), start = 6, end = 10, latent = False, enode = Node {nodeRange = Range 6 10, token = Token Numeral (NumeralData {value = 100.0, grain = Just 2, multipliable = True, okForAnyTime = True}), children = [Node {nodeRange = Range 6 10, token = Token RegexMatch (GroupMatch ["cent"]), children = [], rule = Nothing}], rule = Just "powers of tens"}}]

=> parsed as two separate numbers, «1000» then «100», whereas a single «1100» (through ruleSum) is expected

NOTE: it works fine for 1200 («mille deux cent»):

*Duckling.Debug> debug (makeLocale FR Nothing) "mille deux cent" [This Numeral]
intersect 2 numbers (mille deux cent)
-- powers of tens (mille)
-- -- regex (mille)
-- compose by multiplication (deux cent)
-- -- number (0..16) (deux)
-- -- -- regex (deux)
-- -- powers of tens (cent)
-- -- -- regex (cent)
[Entity {dim = "number", body = "mille deux cent", value = RVal Numeral (NumeralValue {vValue = 1200.0}), start = 0, end = 15, latent = False, enode = Node {nodeRange = Range 0 15, token = Token Numeral (NumeralData {value = 1200.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 0 5, token = Token Numeral (NumeralData {value = 1000.0, grain = Just 3, multipliable = True, okForAnyTime = True}), children = [Node {nodeRange = Range 0 5, token = Token RegexMatch (GroupMatch ["mille"]), children = [], rule = Nothing}], rule = Just "powers of tens"},Node {nodeRange = Range 6 15, token = Token Numeral (NumeralData {value = 200.0, grain = Just 2, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 6 10, token = Token Numeral (NumeralData {value = 2.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 6 10, token = Token RegexMatch (GroupMatch ["deux",""]), children = [], rule = Nothing}], rule = Just "number (0..16)"},Node {nodeRange = Range 11 15, token = Token Numeral (NumeralData {value = 100.0, grain = Just 2, multipliable = True, okForAnyTime = True}), children = [Node {nodeRange =
Range 11 15, token = Token RegexMatch (GroupMatch ["cent"]), children = [], rule = Nothing}], rule = Just "powers of tens"}], rule = Just "compose by multiplication"}], rule = Just "intersect 2 numbers"}}]
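
One visible difference between the two outputs is the `multipliable` flag on the right-hand operand: «cent» carries `multipliable = True`, while «deux cent» carries `multipliable = False`. The sketch below is a toy model of that observation, not Duckling's actual code. The `NumeralData` record only mirrors the field names printed above (omitting `okForAnyTime`), and `strictSum` and `relaxedSum` are hypothetical helpers. It assumes, without having confirmed it against the FR rules, that the sum rule rejects a multipliable right-hand operand.

```haskell
-- Toy model only: field names mirror the debug output, but none of this
-- is taken from Duckling's source.
data NumeralData = NumeralData
  { value        :: Double
  , grain        :: Maybe Int
  , multipliable :: Bool
  } deriving (Show, Eq)

-- Tokens as they appear in the two debug outputs.
mille, cent, deuxCent :: NumeralData
mille    = NumeralData 1000 (Just 3) True
cent     = NumeralData 100  (Just 2) True
deuxCent = NumeralData 200  (Just 2) False

-- Assumed (hypothetical) strict rule: the left operand has a grain g,
-- the right operand is NOT multipliable and is smaller than 10^g.
strictSum :: NumeralData -> NumeralData -> Maybe NumeralData
strictSum (NumeralData v1 (Just g) _) (NumeralData v2 _ False)
  | v2 < 10 ^^ g = Just (NumeralData (v1 + v2) Nothing False)
strictSum _ _ = Nothing

-- One conceivable relaxation: keep the magnitude check but ignore the
-- multipliable flag of the right operand.
relaxedSum :: NumeralData -> NumeralData -> Maybe NumeralData
relaxedSum (NumeralData v1 (Just g) _) (NumeralData v2 _ _)
  | v2 < 10 ^^ g = Just (NumeralData (v1 + v2) Nothing False)
relaxedSum _ _ = Nothing
```

In this model, `strictSum mille deuxCent` yields a numeral with value 1200 while `strictSum mille cent` yields `Nothing`, which matches the behavior reported above. `relaxedSum mille cent` yields 1100, and `relaxedSum cent mille` still yields `Nothing` because 1000 is not below 10^2. Whether such a relaxation would be safe for the real FR rules is an open question.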