Add the (optional) ability to specify how many characters should be allocated to each trait
Add the (optional) ability to specify which other traits should be excluded, if one is chosen
The second feature (exclusions) has a few caveats that may affect how traits are generated and how they are distributed.
Let's say we have two trait_types, "Foreground" and "Background", with three possible values each, Red (R), Blue (B), Green (G). There are therefore 3 × 3 possible combinations: {RR, RB, RG, BR, BB, BG, GR, GB, GG}. Let's say for both, R has a weight of 0.5, B 0.25, and G 0.25. We can imagine a binary encoding of 2 bits for each, such that:
Foreground: 00: Red, 01: Red, 10: Blue, 11: Green
Background: 00: Red, 01: Red, 10: Blue, 11: Green
So the string 0110, for example, would correspond to a Red (01) foreground and a Blue (10) background.
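As a minimal sketch of the fixed-width encoding described above, the following decodes a 4-bit string by splitting it into 2-bit chunks. The names ENCODING, TRAIT_TYPES, BITS_PER_TRAIT and decode are hypothetical, chosen for illustration only, and are not taken from the actual implementation.

```python
# Hypothetical lookup table mirroring the 2-bit encoding in the text.
# Red occupies two of the four slots because its weight is 0.5.
ENCODING = {"00": "Red", "01": "Red", "10": "Blue", "11": "Green"}
TRAIT_TYPES = ["Foreground", "Background"]
BITS_PER_TRAIT = 2


def decode(bits):
    """Split the bit string into fixed-width chunks and look each one up."""
    expected = BITS_PER_TRAIT * len(TRAIT_TYPES)
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")
    result = {}
    for i, trait_type in enumerate(TRAIT_TYPES):
        chunk = bits[i * BITS_PER_TRAIT:(i + 1) * BITS_PER_TRAIT]
        result[trait_type] = ENCODING[chunk]
    return result


print(decode("0110"))  # {'Foreground': 'Red', 'Background': 'Blue'}
```

Without exclusions, every chunk can be decoded independently of the others.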
However, if we create an exclusion for "Foreground: Red, Background: Red", then the strings 0000, 0001, 0100, and 0101 become invalid, since Red cannot match with Red, and all those strings specify that combination.
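The set of invalid strings can be checked by brute force. This sketch assumes the same 2-bit table as above and a hypothetical EXCLUDED pair of (foreground, background); it illustrates the reasoning only and is not the project's code.

```python
from itertools import product

# Same hypothetical 2-bit table for both trait types.
ENCODING = {"00": "Red", "01": "Red", "10": "Blue", "11": "Green"}
# Assumed representation of the exclusion: (foreground value, background value).
EXCLUDED = ("Red", "Red")


def invalid_strings():
    """Return every 4-bit string whose two chunks decode to the excluded pair."""
    found = []
    for fg_bits, bg_bits in product(ENCODING, repeat=2):
        if (ENCODING[fg_bits], ENCODING[bg_bits]) == EXCLUDED:
            found.append(fg_bits + bg_bits)
    return found


print(invalid_strings())  # ['0000', '0001', '0100', '0101']
```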
The way we deal with this naively is to keep the first trait, and redistribute the probabilities for the rest across the non-colliding traits. So if we hit Red on the foreground, then the probabilities for the next trait would become Blue: 0.5, Green: 0.5, and could be distributed like so:
Blue: 00, 01
Green: 10, 11
And so the string 0000 (previously invalid) now becomes Foreground: Red, Background: Blue, instead of Foreground: Red, Background: Red.
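The naive redistribution can be sketched as follows: decode the foreground first, drop any background values that collide with it, and rebuild the background table from the remaining weights. WEIGHTS, EXCLUSIONS, build_table and decode are hypothetical names, and the sketch assumes the weights divide evenly into the four available slots.

```python
# Hypothetical weights and exclusion set, taken from the example in the text.
WEIGHTS = {"Red": 0.5, "Blue": 0.25, "Green": 0.25}
EXCLUSIONS = {("Red", "Red")}  # (foreground, background) pairs that may not co-occur
BITS = 2


def build_table(weights):
    """Assign the 2**BITS slots to values in proportion to their weights."""
    slots = 2 ** BITS
    total = sum(weights.values())
    table = []
    for value, weight in weights.items():
        table.extend([value] * round(slots * weight / total))
    # Assumption: the weights divide evenly into the available slots.
    assert len(table) == slots
    return table


def decode(bits):
    """Decode the foreground, then decode the background against a rebuilt table."""
    foreground = build_table(WEIGHTS)[int(bits[:BITS], 2)]
    allowed = {v: w for v, w in WEIGHTS.items()
               if (foreground, v) not in EXCLUSIONS}
    background = build_table(allowed)[int(bits[BITS:], 2)]
    return {"Foreground": foreground, "Background": background}


print(decode("0000"))  # {'Foreground': 'Red', 'Background': 'Blue'}
print(decode("1000"))  # {'Foreground': 'Blue', 'Background': 'Red'}
```

Note that the same background bits "00" decode to Blue after a Red foreground but to Red after a Blue foreground.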
Unfortunately, this has a few negative consequences:
You can no longer tell from a substring in isolation what its value is, since it depends on the traits before it; "00" could mean something different, even if in the same position, in different strings.
Probabilities will get moved around a little; in this case, with this implementation, the Background Red will not appear in 50% of the cases, but rather we expect it to appear in 25% of the cases, because it can only be drawn when the foreground is not Red (50% of the time), and then with its original weight of 0.5. The foreground remains the same at 50%, since this implementation is asymmetric.
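The shift in probabilities can be verified with exact arithmetic. This sketch assumes the naive scheme described above (keep the foreground, renormalise the background weights over the non-colliding values); the names WEIGHTS, EXCLUSIONS and background_marginals are hypothetical.

```python
from fractions import Fraction

# Hypothetical weights and exclusion from the example in the text.
WEIGHTS = {
    "Red": Fraction(1, 2),
    "Blue": Fraction(1, 4),
    "Green": Fraction(1, 4),
}
EXCLUSIONS = {("Red", "Red")}  # (foreground, background)


def background_marginals():
    """Overall probability of each background value under naive redistribution."""
    marginals = {value: Fraction(0) for value in WEIGHTS}
    for fg, p_fg in WEIGHTS.items():
        allowed = {bg: w for bg, w in WEIGHTS.items()
                   if (fg, bg) not in EXCLUSIONS}
        total = sum(allowed.values())
        for bg, w in allowed.items():
            # Renormalise the background weight over the allowed values.
            marginals[bg] += p_fg * w / total
    return marginals


for value, prob in background_marginals().items():
    print(value, prob)
# Red 1/4
# Blue 3/8
# Green 3/8
```

Under these assumptions the probability lost by Background Red is split evenly between Blue and Green, which each rise from 25% to 37.5%.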
There are probably better algorithms for this, but we can see how this one performs in practice and re-evaluate if needed.