Open OlegBaskov opened 5 years ago
The problems start at cell 28, row 2 of the notebook (277 clusters), data -- .../GCB_LG-E-clean_dILEd_no-gen_0c_mwc=21/iteration_2:
the tagged category tree 277_cat_tree.txt.tagged should contain only tagged categories like ###aab###; however, starting from line 204, the categories contain non-tagged words.
The same issue is observed in all of the following cells, 33-36.
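To locate the offending lines quickly, a small check along these lines may help. It is only a sketch: it assumes that tokens in the tagged file are whitespace-separated and that a valid tag has the form `###...###` as in the `###aab###` example above. The function name `find_untagged` and the `skip_fields` parameter are hypothetical and not part of the project; `skip_fields` is there in case each line starts with non-word columns (cluster id, counts, etc.), since the exact cat_tree layout is not assumed here.

```python
import re

# Assumed tag shape, based on the "###aab###" example in this issue.
TAG_RE = re.compile(r"###[^#\s]+###")


def find_untagged(lines, skip_fields=0):
    """Return (line_number, untagged_tokens) for each line that contains
    tokens not matching the assumed ###tag### pattern.

    skip_fields: number of leading whitespace-separated columns to ignore
    (hypothetical parameter; set it to match the real file layout).
    """
    report = []
    for line_no, line in enumerate(lines, start=1):
        tokens = line.split()[skip_fields:]
        bad = [t for t in tokens if not TAG_RE.fullmatch(t)]
        if bad:
            report.append((line_no, bad))
    return report


# Made-up sample lines, not taken from the real 277_cat_tree.txt.tagged file.
sample = [
    "###aab### ###aac###",
    "###aad### kids",
]
print(find_untagged(sample))  # [(2, ['kids'])]
```

Running it over the real file would amount to `find_untagged(open("277_cat_tree.txt.tagged"), skip_fields=N)`, with `N` chosen to match the actual column layout; the first reported line number can then be compared with line 204 mentioned above.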
Cluster tags and words in tagged grammar .dict and cat_tree files.
Is this a tagging issue, an input parse filtering issue, or an issue in the corpus that prevents correct link extraction?
Jupyter notebook -- Iterative-clustering-ILE-POCE-CDS-2019-02-27.ipynb ⇒ static html copy.
This issue is a copy of https://github.com/OlegBaskov/language-learning/issues/79.