jacksonllee / pylangacq

Language Acquisition Research Tools
https://pylangacq.org
MIT License

IPSyn level calculation is not accurate #15

Closed dcanones closed 3 years ago

dcanones commented 3 years ago

Hi!

It seems like the IPSyn level calculation is not accurate. It treats punctuation marks (e.g. [.]) as full valid words, which triggers some of the IPSyn rules and artificially inflates the score.

jacksonllee commented 3 years ago

Hello, do you have more concrete examples and details (e.g., which CHILDES dataset you used, which IPSyn rules)?

jacksonllee commented 3 years ago

I've just looked at the IPSyn code. All 56 items are based on the dependency graph of a given utterance and rely on morphological and syntactic/semantic information (e.g., part-of-speech tags). Nothing in the code appears to depend on punctuation marks.

I'm closing this ticket for now -- we can reopen it if further info becomes available for debugging, such as which IPSyn items specifically are problematic and/or the relevant CHILDES data.
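
For anyone who wants to narrow this down, here is a minimal sketch of the kind of check that would produce the details requested above. It is not pylangacq code: the `Tok` record, the `PUNCT` set, and both helper functions are hypothetical stand-ins, and it assumes each utterance is available as a list of (word, part-of-speech) pairs. The idea is to list any punctuation token that carries a part-of-speech tag, since only such a token could plausibly reach a rule that works from part-of-speech information.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tok:
    """Hypothetical token record; NOT pylangacq's own token class."""
    word: str
    pos: Optional[str]  # None when the token has no part-of-speech tag


# Assumed set of punctuation forms to look for; extend it to match your data.
PUNCT = {".", "?", "!", ",", "[.]", "[?]", "[!]"}


def is_punct(tok: Tok) -> bool:
    """True if the token's surface form is one of the assumed punctuation marks."""
    return tok.word in PUNCT


def punct_with_pos(utterance: List[Tok]) -> List[Tok]:
    """Punctuation tokens that carry a POS tag and could therefore match a POS-based rule."""
    return [t for t in utterance if is_punct(t) and t.pos]


def content_tokens(utterance: List[Tok]) -> List[Tok]:
    """The utterance with punctuation tokens removed, for a before/after comparison."""
    return [t for t in utterance if not is_punct(t)]


# Made-up utterances for illustration only; they are not taken from CHILDES.
clean = [Tok("I", "pro"), Tok("want", "v"), Tok("cookie", "n"), Tok(".", None)]
suspect = [Tok("more", "qn"), Tok("juice", "n"), Tok("[.]", "n")]

for name, utt in (("clean", clean), ("suspect", suspect)):
    flagged = [t.word for t in punct_with_pos(utt)]
    print(name, "tokens:", len(utt), "non-punct:", len(content_tokens(utt)), "flagged:", flagged)
```

If the flagged list is empty for every utterance in a transcript, punctuation is unlikely to be what raises the score. If it is not, the flagged utterances, together with the IPSyn items that fire on them, are the concrete examples needed to reopen the ticket.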