renepickhardt / generalized-language-modeling-toolkit

Generalized Language Modeling toolkit
http://glm.rene-pickhardt.de

Multiple Estimators with different CountCaches fail because of SUBSTITUTE_ESTIMATOR static member #90

Closed lschmelzeisen closed 9 years ago

lschmelzeisen commented 9 years ago

Small code to reproduce:

    Set<Pattern> patternsExpected = Patterns.getUsedPatterns(5,
            estimatorExpected, ProbMode.MARG);
    Set<Pattern> patternsActual = Patterns.getUsedPatterns(5,
            estimatorActual, ProbMode.MARG);

    LOGGER.debug("patternsExpected = %s", patternsExpected);
    LOGGER.debug("patternsActual   = %s", patternsActual);

    TestCorpus testCorpus = TestCorpus.EN0008T;

    Calculator calculatorExpected = new SequenceCalculator();
    calculatorExpected.setEstimator(estimatorExpected);
    calculatorExpected.setProbMode(ProbMode.MARG);
    estimatorExpected.setCountCache(testCorpus.getCountCache(patternsExpected));

    Calculator calculatorActual = new SequenceCalculator();
    calculatorActual.setEstimator(estimatorActual);
    calculatorActual.setProbMode(ProbMode.MARG);
    estimatorActual.setCountCache(testCorpus.getCountCache(patternsActual));

    List<String> sequence = Arrays.asList("4", ".", "3", "speak", "an");
    System.out.println(calculatorExpected.probability(sequence));
    System.out.println(calculatorActual.probability(sequence));

lschmelzeisen commented 9 years ago

Could not reproduce.

lschmelzeisen commented 9 years ago

Thought about it again: I could reproduce it after all, and have fixed it.
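
For readers unfamiliar with the failure mode named in the issue title, here is a minimal, self-contained sketch of one plausible way a static substitute estimator can break when several estimators use different count caches. It is not the toolkit's actual code: the `CountCache` and `Estimator` classes below, the `SHARED_SUBSTITUTE` field, and the `substituteCount` method are hypothetical stand-ins, and the assumption that setting a count cache is forwarded to the substitute is mine.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedSubstituteSketch {

    /** Hypothetical stand-in for a count cache: maps a token to its count. */
    static class CountCache {
        private final Map<String, Long> counts = new HashMap<>();

        CountCache put(String token, long count) {
            counts.put(token, count);
            return this;
        }

        long get(String token) {
            return counts.getOrDefault(token, 0L);
        }
    }

    /** Hypothetical estimator that falls back to a substitute estimator. */
    static class Estimator {
        // One substitute shared by every estimator that uses it (assumption).
        static final Estimator SHARED_SUBSTITUTE = new Estimator(null);

        private final Estimator substitute; // null for a substitute itself
        private CountCache cache;

        Estimator(Estimator substitute) {
            this.substitute = substitute;
        }

        void setCountCache(CountCache cache) {
            this.cache = cache;
            if (substitute != null) {
                // With a shared substitute, this overwrites the cache that
                // any other estimator configured earlier.
                substitute.setCountCache(cache);
            }
        }

        long substituteCount(String token) {
            return substitute.cache.get(token);
        }
    }

    /** Two estimators share one static substitute: the last cache wins. */
    static long countWithSharedSubstitute() {
        Estimator first = new Estimator(Estimator.SHARED_SUBSTITUTE);
        Estimator second = new Estimator(Estimator.SHARED_SUBSTITUTE);
        first.setCountCache(new CountCache().put("speak", 3));
        second.setCountCache(new CountCache().put("speak", 7));
        return first.substituteCount("speak");
    }

    /** Each estimator owns its substitute: caches stay independent. */
    static long countWithOwnSubstitute() {
        Estimator first = new Estimator(new Estimator(null));
        Estimator second = new Estimator(new Estimator(null));
        first.setCountCache(new CountCache().put("speak", 3));
        second.setCountCache(new CountCache().put("speak", 7));
        return first.substituteCount("speak");
    }

    public static void main(String[] args) {
        System.out.println(countWithSharedSubstitute()); // prints 7, not the expected 3
        System.out.println(countWithOwnSubstitute());    // prints 3
    }
}
```

In this sketch, giving each estimator its own substitute instance removes the cross-talk. Whether the actual fix took that route is not stated in this thread.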