galaxykate / tracery

Tracery: a story-grammar generation library for javascript
Apache License 2.0
2.11k stars 246 forks source link

exploding grammars #5

Open Johnicholas opened 9 years ago

Johnicholas commented 9 years ago

I don't really understand the expansion process, but the probabilities in nested/recurrent grammars seem off. I was trying to build a "Sims patch notes" grammar, and I get this kind of output too frequently:

Fixed an issue in the Gallery by preventing the telescope in the woman with the woman in the telescope in the game with the man in the woman in the woman in the game with the man in the game with the telescope with the telescope with the man with the game with the dog with the telescope with the dog with the man with the dog in the dog in the dog in the woman with the woman in the dog in the woman in the dog in the telescope in the woman with the man with the woman in the telescope in the woman with the woman with the dog with the game with the dog with the man with the game with the telescope in the game in the man in the man in the dog in the game in the game with the dog with the dog with the game in the woman with the woman in the game in the telescope in the dog with the game in the dog with the woman in the woman with the game with the telescope in the woman with the game with the telescope in the woman in the game in the game in the game in the man in the woman with the telescope with the telescope with the dog with the dog in the game with the game with the dog with the telescope with the dog with the game in the man in the dog in the man with the woman in the dog in the dog with the woman with the woman in the telescope with the telescope in the woman in the man with the woman with the woman with the woman in the game with the dog with the dog with the man in the dog with the telescope with the game in the woman with the woman with the game with the woman with the telescope in the dog from thinking the man in the man with the game in the woman were the dog .

The problem comes from the np nonterminal in this grammar:

grammar = { in: "with in".split(" "), nn: "game man woman telescope dog".split(" "), vt: "saw".split(" "), vi: "sleeps".split(" "), np: [ "the #nn#", "#np# #pp#" ], vp: ["#vi#", "#vt# #np#", "#vp# #pp#"], s: ["#np# #vp#"], pp: ["#in# #np#"], }

Note that the pp is guaranteed to introduce another np. A different grammar shows the exploding possibility even more sharply:

grammar = { nn: "game man woman telescope dog".split(" "), np: [ "#nn#", "#np# #np#", "#np# #np# #np#" ], };

Could we address this somehow? Maybe just a "when to start stopping" parameter that adjusts the probability of choosing productions based on recursion depth or the size of the commitments that we already have, so that eventually the process would be guaranteed to finish up?

I think Phillipe Flajolet has done something with random generation of combinatorial structures that might be more subtle and sophisticated, such as: http://lara.epfl.ch/w/_media/calculus-random.pdf

Johnicholas commented 9 years ago

A workaround might be something like this:

function weighted(dict) {
  var out = []
  for (var key in dict) {
    for (var i = 0; i < dict[key]; i++) {
      out.push(key)
    }
  }
  return out
}

Use like this inside the grammar:

  basic_bug: weighted({issue: 10, bug: 1, crash: 0, defect: 1, freeze: 1}),
galaxykate commented 9 years ago

Hmm, I'll have to consider what the right way to implement this is. There's a pending feature to add different weighting functions for each rule list, the only gating feature is that I haven't decided on the syntax yet (finding a syntax that works well with some of the other advanced features that are planned)

I have planned to have both some default weighting (s-curves, exponential falloff, etc) and to provide a way for users to implement their own weighting functions. I hadn't considered using depth as an input parameter for the custom weighting functions, but I like it as an idea. When that gets implemented, I'll pass in the current tree node so that whoever implements a custom function has access to all possible information about the current expansion node, including depth.

Thanks for the ideas, and feedback.

On Thu, Apr 2, 2015 at 9:03 AM, Johnicholas Hines notifications@github.com wrote:

A workaround might be something like this:

function weighted(dict) { var out = [] for (var key in dict) { for (var i = 0; i < dict[key]; i++) { out.push(key) } } return out }

Use like this inside the grammar:

basic_bug: weighted({issue: 10, bug: 1, crash: 0, defect: 1, freeze: 1}),

— Reply to this email directly or view it on GitHub https://github.com/galaxykate/tracery/issues/5#issuecomment-88959204.