fictiveworks / CalyxSharp

Generative text processing for C# and Unity applications

Weighted branch productions and definition parser #7

Closed maetl closed 2 years ago

maetl commented 2 years ago

Weighted probability branches map each possible output string to its probability of being selected when the branch expands. The keys are the possible output strings, and the values are the probabilities of each string being selected.

Syntax

Supported intervals in dynamically typed contexts are:

Javascript/JSON

Ruby

Rules

For weightings specified on the interval 0..1, the weights of all nodes in a branch must sum to 1. Each weight is the probability of picking that node.

Here, there should be a 20% chance of picking gold, 50% chance of picking silver and a 30% chance of picking bronze:

gold => 0.2
silver => 0.5
bronze => 0.3

For weightings with integers, the chance of picking a single node is its value out of the sum of all values in the branch. Here the values sum to 10, so gold has a 2 in 10 chance, silver 5 in 10 and bronze 3 in 10.

gold => 2
silver => 5
bronze => 3

Because JS doesn’t distinguish between floats and integers, but Ruby does, this behaviour hasn’t been well documented or specified properly in previous versions of the library.

Ruby also supports range objects as weights but I’m not convinced that adds much value and it is potentially confusing to users.

Tidier weighted selection

Current implementations are either highly specific to Ruby’s enumerable API or functional JS but not particularly well thought out.

Is there a simpler way to handle this by getting a random number between 0 and sum(weights) and then having a minimum result for each option?

Weights := A:1, B:2, C:3

Rand(0-5)
   <1 = A
   <3 = B
   Else C
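
A minimal C# sketch of the cumulative-threshold idea above, assuming the weights are held as doubles in a fixed order. The names WeightedPick, ChooseAt and Choose are hypothetical and used only for illustration; they are not part of CalyxSharp.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightedPick
{
    // Returns the first option whose cumulative weight exceeds the given point.
    // The point is expected to lie in [0, sum of weights).
    public static string ChooseAt(IReadOnlyList<KeyValuePair<string, double>> options, double point)
    {
        if (options.Count == 0)
            throw new ArgumentException("At least one option is required.", nameof(options));

        double cumulative = 0.0;
        foreach (var option in options)
        {
            cumulative += option.Value;
            if (point < cumulative)
                return option.Key;
        }

        // Floating-point rounding can leave the point at or just above the final
        // cumulative sum, so fall back to the last option.
        return options[options.Count - 1].Key;
    }

    // Draws a uniform point in [0, sum of weights) and selects the matching option.
    public static string Choose(IReadOnlyList<KeyValuePair<string, double>> options, Random rng)
    {
        double total = options.Sum(o => o.Value);
        return ChooseAt(options, rng.NextDouble() * total);
    }
}
```

With weights A:1, B:2, C:3, a point below 1 selects A, a point below 3 selects B, and anything else selects C, which matches the thresholds in the pseudocode. Splitting out ChooseAt keeps the selection logic testable without a random source.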

maetl commented 2 years ago

Maybe there’s a smart way to build a factory/builder class or static method DSL to construct these, but afaict the simplest way to port the existing definition format from the dynamic languages is to overload the rule builder method similar to the following:

public Grammar Rule(string name, Dictionary<string, int> production)

maetl commented 2 years ago

We can remove the requirement for combined weights to sum to 1.

Given the following example...

ruby: 0.1
emerald: 0.3
sapphire: 0.4

Expected behaviour of the previous version was to throw an InvalidDefinition exception with the message “weights must sum to 1”.

The new expected behaviour should be to normalise each weight based on summing all the entries in the same way that integers work.

ruby: 0.1 (12.5%, or 1 in 8 chance)
emerald: 0.3 (37.5%, or 3 in 8 chance)
sapphire: 0.4 (50%, or 4 in 8 chance)
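
The normalisation described above could be sketched in C# as follows. WeightNormaliser and Normalise are hypothetical names, and rejecting a non-positive total is an assumption rather than behaviour specified in this issue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WeightNormaliser
{
    // Divides each weight by the branch total so the results sum to 1.
    // Works the same way for 0..1 weights and for whole-number weights.
    public static Dictionary<string, double> Normalise(IDictionary<string, double> weights)
    {
        double total = weights.Values.Sum();

        // Assumption: a branch whose weights do not add up to a positive
        // value cannot be normalised and is treated as an error.
        if (total <= 0)
            throw new ArgumentException("Weights must sum to a positive value.", nameof(weights));

        return weights.ToDictionary(w => w.Key, w => w.Value / total);
    }
}
```

Passing ruby: 0.1, emerald: 0.3, sapphire: 0.4 divides each weight by 0.8, giving roughly 0.125, 0.375 and 0.5. Passing 2, 5 and 3 divides by 10, giving roughly 0.2, 0.5 and 0.3.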

bentorkington commented 2 years ago

I experimented with making the weight parameter generic, but hit a wall that doesn't look like it can be broken until we can use .NET 7, which will support an INumber<T> constraint on generics.

There's no way yet to constrain the type sufficiently that LINQ .Sum() can be called on a generic array of weights. I think I'll write convenience methods for all the anticipated types and convert to double internally, since the usual pitfalls of using floats don't seem to apply here - correct me if I'm wrong.
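
One way the convenience-overload approach could look, as a sketch only. SketchGrammar is a hypothetical stand-in for the library's Grammar class, and WeightsFor is an illustrative accessor. Neither reflects the real CalyxSharp API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the library's Grammar class. Only the overloads matter here.
public class SketchGrammar
{
    // All weights are stored as doubles, whatever numeric type the caller supplied.
    private readonly Dictionary<string, Dictionary<string, double>> weightedRules =
        new Dictionary<string, Dictionary<string, double>>();

    // Overload for whole-number weights: widen to double and delegate.
    public SketchGrammar Rule(string name, Dictionary<string, int> production)
    {
        return Rule(name, production.ToDictionary(p => p.Key, p => (double)p.Value));
    }

    // Overload for floating-point weights: store a defensive copy.
    public SketchGrammar Rule(string name, Dictionary<string, double> production)
    {
        weightedRules[name] = new Dictionary<string, double>(production);
        return this;
    }

    // Illustrative accessor so the stored weights can be inspected.
    public IReadOnlyDictionary<string, double> WeightsFor(string name)
    {
        return weightedRules[name];
    }
}
```

Each additional numeric type would need its own thin overload, which is repetitive but keeps the internal representation to a single type and avoids the generic constraint problem.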

maetl commented 2 years ago

That’s good to know. It might be acceptable to start with just a single numeric type? Otherwise, normalising to a double to do the calculation makes sense as part of the grammar building and registration phase. Similar conversions are probably done in a few other places.

This is an area that could do with some usability testing/input from authors, as I don’t know which types are more intuitive to people to use as weightings. I tend to think of things in the 0..1 interval but I suspect fractions or whole numbers might be preferred by others.

With .NET 7, we will also need to keep an eye on what Unity is doing around this. If it doesn’t have an impact on the bytecode assembly/DLL, then that’s all good, but if it changes anything at runtime then we will need to ensure that the Unity platform tracks the .NET standard (usually it is a few versions behind; for example, cool new C# features like value objects/tuples don’t work in Unity yet).

bentorkington commented 2 years ago

Tidier weighted selection

Current implementations are either highly specific to Ruby’s enumerable API or functional JS but not particularly well thought out.

Is there a simpler way to handle this by getting a random number between 0 and sum(weights) and then having a minimum result for each option?

To do this in C# I've used this (hopefully not too obscure) LINQ reduce:

      // Pick a point in [0, sumOfWeights), then walk down from the top of the range
      // and take the first production whose band contains that point.
      double max = sumOfWeights;
      double waterMark = options.Rng.NextDouble() * sumOfWeights;
      WeightedProduction production = productions.FirstOrDefault(wp => waterMark >= (max -= wp.Weight));
      return new Expansion(Exp.WeightedBranch, production.Production.Evaluate(options));

maetl commented 2 years ago

LINQ is good, I’m in favour of using it where possible as it has useful abstractions for dealing with common enumerable computations.

See comments here: https://github.com/fictiveworks/CalyxSharp/pull/12#discussion_r980742424