Remove double normalization of probabilistic grammars

Herb-AI / HerbGrammar.jl

Grammars for Herb.jl

https://herb-ai.github.io/

MIT License

0 stars 2 forks source link

Remove double normalization of probabilistic grammars #82

Closed nicolaefilat closed 3 months ago

nicolaefilat commented 3 months ago

In the end the normalization changes were not required. This PR just removes a double normalization from the code

ReubenJ commented 3 months ago

Thanks! Let's keep the default behavior as-is (normalize! normalizes the probabilities per type)—this is the classic definition of a PC[F|S]G. For Probe, we need this strategy of normalizing over all rules, disregarding type. Some options.

Add something like bytype with a default of true:

function normalize!(g, bytype=true)
    ...
end

Maybe more explicit with an Enum:

@enum NormalizationStrategy bytype all

function normalize!(g, strategy=NormalizationStrategy.bytype)

Can we think of other ways to normalize? Both solutions are a bit rigid because users of Herb couldn't cleanly introduce their own normalization strategy without modifying/contributing to the package to either add/change the keyword or add an option to the enum. A clean solution might introduce a trait for normalization strategy and a different type for Probe grammars—this smells of overengineering for the time being. @sebdumancic @THinnerichs @pwochner thoughts?

codecov[bot] commented 3 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 45.15%. Comparing base (93bec01) to head (c853358). Report is 4 commits behind head on dev.

Additional details and impacted files

```diff @@ Coverage Diff @@ ## dev #82 +/- ## ========================================== - Coverage 46.10% 45.15% -0.95% ========================================== Files 8 8 Lines 462 454 -8 ========================================== - Hits 213 205 -8 Misses 249 249 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

THinnerichs commented 3 months ago

Don't think another normalization strategy is necessary, and we should leave it indeed with bytype. This is what we usually need during enumeration/exploration/iterating programs. Just make sure you then also normalize with respect to the current domain, as some rules might be pruned by our constraint system.
Do you actually need it over all rules?

This is the probability update from the Probe paper

nicolaefilat commented 3 months ago

@THinnerichs you are right, for the grammar update we do not need that but to initialize the probe search procedure we have to start from every rule having the same probability.

ReubenJ commented 3 months ago

If this probe-specific way of updating grammars is indeed probe-specific, then maybe we should not introduce this change at all and make the probability update procedure for Probe into a different function. Would this make things difficult for your implementation, @nicolaefilat?

sebdumancic commented 3 months ago

yes, indeed, probabilistic grammars should normalise per type of rule. Otherwise, they lose their meaning. It might be that Probe actually does that es well, it is just a bit convoluted explained. (and again, we can change Probe for the better ;) )

To get rules of the same probability, in probabilistic grammar, that would actually mean the following:

every type has the same probability of being selected ; let's term that as $p_{type}$
every rule of the particular type, let's say there are $N$ of them, has the probability $\frac{p_{type}}{N}$

I don't think Probe would behave conceptually differently, if we do the updates per type.

nicolaefilat commented 3 months ago

I ran the probe tests again without normalizing on all the rules and it seems to work. @sebdumancic you were right!