Closed by akolonin 5 years ago
It seems 400 clusters is an invalid setting for GL with the CDS corpus. The test is ready to be uploaded to the repo, but we still need to decide on the number of clusters to use. The test works fine with the 50-cluster setting. Ready to make a PR if that is OK.
We need to:
A) Change the test to cover these configurations:
   MSL = no limit, MWC = 1, clustering = ALE 50
   MSL = no limit, MWC = 1, clustering = ILE 50
   MSL = 3, MWC = 1, clustering = ALE 50
   MSL = 3, MWC = 1, clustering = ILE 50
   MSL = no limit, MWC = 3, clustering = ALE 50
   MSL = no limit, MWC = 3, clustering = ILE 50
B) Add a comparison of dict files in addition to the comparison of parses
C) Make sure that both dicts and parses are identical when generated in the same environment on different machines
D) Make sure that the tests are run by CircleCI to prevent PRs from breaking them
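A rough sketch of how items A and B could be expressed in a test helper. This is only an illustration and not the project's actual test code: the names `test_matrix`, `file_digest` and `same_files` are hypothetical, as is the glob pattern passed by the caller. It also assumes the generated dict and parse files contain no timestamps or absolute paths, since otherwise a byte-for-byte comparison would fail even for equivalent outputs.

```python
import hashlib
from itertools import product
from pathlib import Path

# (MSL, MWC) pairs from item A; None stands for "no limit".
LIMITS = [(None, 1), (3, 1), (None, 3)]
CLUSTERING = ["ALE", "ILE"]
N_CLUSTERS = 50  # the setting reported to work with the CDS corpus


def test_matrix():
    """Return the six test configurations listed in item A, in the same order."""
    return [
        {"MSL": msl, "MWC": mwc, "clustering": method, "clusters": N_CLUSTERS}
        for (msl, mwc), method in product(LIMITS, CLUSTERING)
    ]


def file_digest(path):
    """SHA-256 of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_files(expected_dir, actual_dir, pattern):
    """True if both directories hold the same file names with identical content.

    `pattern` is a glob chosen by the caller (hypothetical example: "*.dict").
    """
    expected = {p.name: file_digest(p) for p in Path(expected_dir).glob(pattern)}
    actual = {p.name: file_digest(p) for p in Path(actual_dir).glob(pattern)}
    return expected == actual
```

Calling `same_files` once with a dict-file pattern and once with a parse-file pattern would cover item B. Comparing digests rather than parsed content is the strictest check, which also suits item C.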
Done. PR #221.
We need a fixed corpus of input parses (using only fully parsed sentences) produced with LG 5.5.1, and GT testing should use the same MWC that GL uses for learning.
Once that is done, we need to regenerate all baselines for GC with MWC = 1, 2, 6, 11, 21, 31.