OlegBaskov / language-learning

OpenCog unsupervised Language Learning project
https://wiki.opencog.org/w/Language_learning
MIT License

Enhance corpus statistics -- GL corpus_stats.txt #11

Closed OlegBaskov closed 6 years ago

OlegBaskov commented 6 years ago

https://github.com/singnet/language-learning/issues/67

OlegBaskov commented 6 years ago

Anton Kolonin 2018-07-19: I have edited spec and I think we can assume "link" == "germ link" in context of parses:

Number of sentences: <number-of-sentences-across-all-parses>
Average sentence length: <average-number-of-sentence-words-rounded>

Number of unique words: <unique-words-across-all-parses>
Total words count: <total-words-across-all-parses>
Average per-word counts: <total-word-divided-by-unique-rounded>

Number of unique links: <unique-links-across-all-parses>
Total links count: <total-links-across-all-parses>
Average per-link count: <total-links-divided-by-unique-rounded>

Number of unique seeds: <unique-seeds-across-all-parses>
Total seeds count: <total-seeds-across-all-parses>
Average per-seed count: <total-seeds-divided-by-unique-rounded>

Number of unique connectors: <unique-connectors-across-all-parses>
Total connectors count: <total-connectors-across-all-parses>
Average per-connector count: <total-connectors-divided-by-unique-rounded>

Number of unique disjuncts: <unique-disjuncts-across-all-parses>
Total disjuncts count: <total-disjuncts-across-all-parses>
Average per-disjunct count: <total-disjuncts-divided-by-unique-rounded>
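
For illustration only, here is a minimal Python sketch of how the sentence, word, and link lines of that spec could be computed. It is not the project's implementation: the function name `corpus_stats`, the helper `rounded_ratio`, and the in-memory parse format (a list of `(words, links)` pairs, with each link a `(left_word, right_word)` tuple) are all assumptions made for this example. Seeds, connectors, and disjuncts are left out because the spec above does not define how they are extracted, but they would follow the same unique / total / average pattern.

```python
from collections import Counter


def rounded_ratio(total, unique):
    """Return total / unique rounded to an integer, or 0 for an empty set."""
    return round(total / unique) if unique else 0


def corpus_stats(parses):
    """Compute sentence, word, and link statistics across all parses.

    `parses` is a hypothetical in-memory format: a list of (words, links)
    pairs, where `words` is a list of tokens and `links` is a list of
    (left_word, right_word) tuples. The real parse files may differ.
    """
    word_counts = Counter()
    link_counts = Counter()
    for words, links in parses:
        word_counts.update(words)
        link_counts.update(links)

    sentences = len(parses)
    total_words = sum(word_counts.values())
    total_links = sum(link_counts.values())

    return [
        ("Number of sentences", sentences),
        ("Average sentence length", rounded_ratio(total_words, sentences)),
        ("Number of unique words", len(word_counts)),
        ("Total words count", total_words),
        ("Average per-word counts", rounded_ratio(total_words, len(word_counts))),
        ("Number of unique links", len(link_counts)),
        ("Total links count", total_links),
        ("Average per-link count", rounded_ratio(total_links, len(link_counts))),
    ]


if __name__ == "__main__":
    example = [
        (["a", "cat", "sat"], [("a", "cat"), ("cat", "sat")]),
        (["a", "dog", "sat"], [("a", "dog"), ("dog", "sat")]),
        (["a", "cat", "ran"], [("a", "cat"), ("cat", "ran")]),
    ]
    # Print in the "Label: value" layout used by the spec.
    for label, value in corpus_stats(example):
        print(f"{label}: {value}")
```

Counting with `Counter` gives the unique count (`len`) and the total count (`sum` of values) from the same structure, so each three-line group in the spec comes from one pass over the parses. Note that Python's built-in `round` rounds exact halves to the nearest even integer; if the spec intends a different rounding rule, `rounded_ratio` is the one place to change.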

OlegBaskov commented 6 years ago

Pull request https://github.com/singnet/language-learning/pull/95