cpressey opened this issue 2 years ago
An inkling, of course, is a mythical, magical creature that stands about two feet tall and is made of ink. They've been known to conceal themselves in manuscripts, Chinese landscape paintings, and similar objects. Scholars generally assume they're related to fornits.
And so, when I said that I already have some inkling of a certain kind, I presumably meant that some number of these creatures have recently taken up residence in my domicile.
Why would this happen? Traditionally it is thought that Hallowe'en is the time of year when the veil between this world and the spirit world is at its thinnest; and, if the spirit world is where the inklings originate, then around now would be the most favourable time to journey to this world. And this, in turn, might help explain why NaNoWriMo/NaNoGenMo takes place in November: high local inkling population!
So, as a sort of warm-up exercise (because it's not quite clear to me how this is related to the, uh, main local inkling infestation), I started working on another release of Samovar.
In particular, doing something I said I'd like to do back in 2018, which is to implement a "real" solver for it and generate an "efficient" version of The League of Extraordinarily Dull Gentlemen.
Well, I've implemented two new solvers in samovar now; one uses breadth-first search (BFS) and the other uses depth-first search (DFS).
The BFS solver is not speedy, but it does generate chapters that are optimal, in the sense that they have the shortest sequence of events that leads to the goal.
The DFS solver is quite fast, but it often follows a path that has a lot of apparent repetition before it reaches the goal.
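Roughly, the two strategies look like this -- a hedged sketch, not Samovar's actual code; event/3 and goal/1 are made-up stand-ins:

```prolog
% event(E, State0, State) and goal(State) are hypothetical: "plot event E
% transforms the world state" and "the chapter's goal holds in this state".

% Depth-first: natural to write in Prolog, fast, but happy to wander
% (and to revisit the same states over and over) before reaching the goal.
dfs(State, []) :- goal(State).
dfs(State0, [E|Es]) :- event(E, State0, State), dfs(State, Es).

% Breadth-first: keep a queue of State-Path pairs; the first answer found
% is a shortest sequence of events reaching the goal.  Note there is no
% duplicate-state detection, so the queue can explode.
bfs([State-Path|_], Events) :- goal(State), reverse(Path, Events).
bfs([State0-Path|Rest], Events) :-
    findall(State-[E|Path], event(E, State0, State), Next),
    append(Rest, Next, Queue),
    bfs(Queue, Events).

% ?- bfs([Initial-[]], Events).    % with Initial bound to the starting state
```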
Some sample output can be seen in this gist: "The League of Efficiently Dull Gentlemen"
There's a catch though, and it's a big one: these solvers don't work on every chapter.
Chapters 2 and 12 are fine, because each has only one character and follows a virtually linear series of plot events.
But in others -- chapter 1, for instance, where the goal is for every one of the 4 characters to have greeted each of the others -- there is a definite combinatorial explosion going on, and these naive exhaustive-searchers can't handle it. They take eons and chew up memory.
I wouldn't have thought it, but the random-walk, throw-lightning-bolts-til-you-hit-it approach that samovar has used up till now -- terrible as it is -- may actually be more efficient on these chapters than the searchers are.
I could go further with this, and try to come up with a heuristic that allows these solvers to do directed search. But I don't think I'm quite interested enough to keep going with that. Samovar is basically a blocks-world, and efficient search for paths to the goal in a blocks-world has been well studied, and I'm not hugely interested in it.
But who knows. What I'd really like to know is how, if at all, this relates to the, as I say, main infestation. And that's what I'll probably be working on trying to understand, instead.
How about now? Have I got all my inklings in a row now? Not nearly. An inkling in the hand is worth two in the pen. Now that's an inkling of a different colour. But I digress.
I can say this much: I'm interested in formal language theory and, increasingly, logic.
There's a nice correspondence between the two: A grammar is a logical calculus where the terminals are axioms, and the non-terminals are rules of inference. A sentence is a theorem, and its parse tree is the proof of that theorem.
I'm interested in the fact that context-free grammars can do a lot, and they do it really prettily. But it's not quite enough. How much more do you need?
A context-free plot generator can produce only context-free plots -- hierarchically nested, last-in-first-out, neat and tidy plots. But good plots have twists. Twists are not neat and tidy. Twists are not context-free.
Generating plot twists would presumably require a generator that is context-sensitive.
You can go beyond context-sensitive grammars, to unrestricted grammars. They're equivalent to Turing machines. At that point though, why bother with thinking of them as grammars at all?
I don't know as much about context-sensitive grammars as I'd like to. I might try studying them and writing a generator based on them. If I could get it to generate recognizable plot twists -- not even good ones, just ones recognizable as plot twists -- I'd be quite happy with that.
Generating plot twists would presumably require a generator that is context-sensitive.
This turns out to be not quite what I meant when I wrote it.
The defining characteristic of a plot twist is that the reader didn't expect it. Like a stage magician, the novel got you thinking the ace of spades was in the hat, but actually the ace of spades was... behind your ear and in the hat is a rabbit!!!
Now, I am definitely not planning on writing a novel generator that forms a model of the expectations of the reader and computes plausible ways to subvert them. That's... a good example of a bar that's been set probably far too high.
And I do rather suspect now that you can get the effect of a plot twist with a simple algorithm and a bit of luck, but that's another story.
So instead of "plot twist" imagine I said "moderately complex plot" or something, even though that comes close to making the statement into a tautology.
Let's put it this way. Lenny is looking for X. Patricia is looking for Y. They meet-cute. Hilarity ensues. But will Lenny find X and Patricia find Y? And if they can find them, can they do so in a manner which is context free? And if not, how much context sensitivity will we require of our generator in order to see them fulfill their dreams???
This is still probably way too high a bar. But it's interesting to think about.
Oh yes. I see what's going on now with these inklings. They're trying to trick me, they are. Trick me into feeding them after midnight. I overheard them plotting. Gonna inject me with sodium pentothal so that I spend every day of November just writing down my genius-ass ideas about formal languages and narrative structure on my submission issue instead of working on a generator. Before long this issue will be muted by all as I natter away to myself, and when 23.59.59 approaches on 30.11.2022 I will hastily submit an untested, 4-line version of 50,000 Meows to pretend the month wasn't an unmitigated failure -- then collapse in an exhausted heap, spilling the Ritz crackers I was snacking on all over the keyboard, at which point (00.00.01 on 01.12.2022) they will scarf them and transform into the little devils they truly are and wreak havoc on the local townscape. Oh yes it's all very clear now. Now if you'll excuse me I need to go find a baseball bat so that I can sleep fitfully with it never out of my grasp.
A grammar is a logical calculus where the terminals are axioms, and the non-terminals are rules of inference.
That's a bit sloppy. The terminals are axioms and the productions are rules of inference. I've become too used to context-free grammars in EBNF, where there's a one-to-one correspondence between non-terminals and productions.
Logical calculi are such a beautiful way to frame grammars that I'm disappointed that it's not the standard way. You do see it sometimes -- I've seen it in discussion of Lambek grammars (which I don't hardly understand) -- but it's the exception rather than the rule.
But, I guess explaining grammars this way presupposes some knowledge of logic. Which is a less well-understood subject than you might think. A lot of people think that logic is about truth, but that's an exaggeration. Logic is about preserving properties of things. We have some things with some property, and we have some transformations on those things that preserve that property. A proof is a sequence of applications of those transformations that shows that something does have the property in question. That property can be truth, but it can also be knowledge, or possibility, or provability, or...
And this is the kind of thing I'm talking about if I ever refer to the logic of storytelling. We have some things that have storyness, and some transformations that preserve storyness. Of course, this is an abstraction, and it can only approximate the real situation; but think of all the plot holes in all the movies you've ever watched. (Have you ever seen a movie that didn't have a plot hole in it?) And yet, people watch the movie, they enjoy it, the movie sells. There must be something more powerful than simple consistency-checking of the presented facts that is in play when a movie is watched. Indeed, the viewer is willing to suspend disbelief -- refrain from applying "realistic logic" -- in the interests of the story carrying on as a story -- following the "storytelling logic".
So of course if a novel generator uses only "realistic logic" to ensure the events in the story world are consistent, it is, in some sense, using the wrong logic.
The storyness-preserving transformations involved in storytelling logic are presumably much, much more involved than those in the kinds of logics that people have been able to formulate. People have been able to formulate some of them, informally -- we call them "tropes", right? -- but coming up with formal versions of them, rigorous rules of inference that could be mechanically applied to a state space... that's a tall order. (And it would still only get you partway to a passable novel, as there are all these other things like diction and pacing and relatability of the characters that "storytelling logic" doesn't directly address.)
A grammar describes a language.
Given a grammar, we can write a program that recognizes the utterances of the language it describes. Or we can equally well write a program that generates the utterances of that language.
I feel a need to stress this point, because an historically important problem in computer science has been efficient parsing. For programming languages especially, when it comes to grammars, there has been a lot more research into parsing than into generation. We even have grammar formalisms that have "Parsing" right in their name (namely PEGs). So, I want to be clear that I'm trying to redress the balance here: a grammar describes a language, but does not dictate what you do with it.
Actually, if we are so inclined, we can write a single program that runs in two modes, one where it recognizes utterances of the language and one where it generates them. I know this because I wrote a program that does just that a while back (actually, quite a while back: the gist is from 2015 but the code is older, and I believe it dates from before the first NaNoGenMo. But I'm far from certain on that; the memories are too fuzzy).
But that was something of a lark -- a procedural, object-oriented lark. We can do better (in the sense of: more aesthetically pleasing).
A grammar is a relation between utterances of a language and their meanings, and we can treat it directly as such.
Relations are many-to-many; one utterance may have more than one meaning, one meaning may have more than one utterance.
And again, the grammar itself doesn't care whether we go from utterance to meaning, or meaning to utterance.
And there are programming languages where we can code up this relation, not as a procedure, nor even as a function, but directly as a relation. These are, unsurprisingly, relational programming languages. And the relational programming language itself provides the means of running it in either of these two directions.
In terms of a NaNoGenMo goal, this could be stated: write a novel generator that can recognize its own novels.
And it's increasingly looking like that will be part of my goal this year.
In formal language theory, a language is a set of strings.
At various points I've almost found that definition offensive. Surely a mere set of strings doesn't deserve to be called a language? A language is for communication, isn't it? What does a string communicate, if all you can say about it is that it is or isn't present in some set?
Well, "language" is just a technical term here of course, and there are reasons for this choice -- arguably good reasons when seen in context. Historically, it appears to have come down from formal logic, but I shall say no more about that, lest I start babbling about non-standard models -- I mean, don't we already have enough problems?
In the view that a language is a relation between utterances and meanings, the approach taken by formal language theory is to choose the most basic possible set of meanings for utterances. It's a binary choice, 0 or 1, either the utterance is valid or the utterance is invalid, and that's it, that's all, finis. And that's formal language theory's prerogative, right? We can make other choices for what constitutes a meaning, if we want. And we do. Linguists do and compiler writers do. And generative novelists do -- don't we?
And, although they are in the minority, there are grammar formalisms that support this idea. What I'm thinking of in particular are attribute grammars.
But again, efficient parsing has kind of sucked all the oxygen out of the room here (to borrow a lovely phrase that I've heard journalists use in reference to a certain ex-Prime Minister). Research on attribute grammars has been all like in what ways can we restrict this darn thing so we can generate an efficient parser from it. Seriously, that's what all that S-attributed, LR-attributed jazz is all about. The pared-down "attribute-grammar-inspired" DSLs found in "compiler compilers" have made many concessions in accommodating the (almost always imperative) target languages that they work with. If you didn't look beyond these tools and their self-imposed limitations you might almost think you can use attribute grammars only for parsing and not for generation.
But happily, that's not the case. The rules in a production in an attribute grammar define (guess what) a relation between the attributes (and the terminals and non-terminals) in that production. In this view, even the distinction between "synthesized" and "inherited" attributes disappears (well -- ok -- not exactly -- but you'll no doubt hear more from me about that soon enough, given the really impressively large stocks of sodium pentothal that still remain in the inklings' arsenal).
Stated as a NaNoGenMo goal, this could be: write a novel generator that generates a novel using an attribute grammar.
I plan to do that, or at least to give it the ol' college try.
The most well-known relational programming language is certainly Prolog.
Prolog is usually called a "logic programming" language, but again, this is an exaggeration. Prolog's execution mechanism is based on an algorithm called "SLD-resolution" that was discovered when searching for algorithms to perform theorem proving. As theorem proving algorithms go, SLD-resolution has some nice properties, and also some not-terribly-nice properties.
But when most people write code in Prolog, do you really think they are thinking, "Here we go, let's prove this here theorem"? You can (and I've seen at least one Prolog textbook that does) explain Prolog with no more recourse to logic than the recourse to logic you'd need to properly explain any programming language. And you can, if you like, describe any programming language as a logic, and any interpreter as a theorem prover for that logic; it's just that these logics are by and large deterministic; which is an entirely acceptable thing for a logic to be, it's just that we don't usually call such a thing a logic.
So, no. I will happily drop the "logic programming" terminology on the ground, here. Prolog is a relational programming language. Pick up any textbook on Prolog (and there are scads of them), turn to the introductory chapter. What's the first example, inevitably? Family relations. Literally it shows working with relations. You tell Prolog who is related to who, you ask Prolog who is related to who. Also, you're inevitably told that Prolog maintains a "database of facts" that the user can "query" -- the same terminology used by relational databases.
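For instance (a minimal sketch of that inevitable first chapter, with made-up names):

```prolog
% A relation defined by facts and a rule, then queried in either direction.
parent(tom, bob).
parent(bob, ann).
grandparent(G, C) :- parent(G, P), parent(P, C).

% ?- grandparent(tom, Who).    Who = ann.    (whose grandparent is tom?)
% ?- grandparent(Who, ann).    Who = tom.    (who is ann's grandparent?)
```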
On the other hand, a programming language that does usually get called a relational programming language, is miniKanren. But this is a bit of a misnomer too, because miniKanren isn't a full programming language; it's an embedded DSL, usually embedded in Scheme.
miniKanren is also a lot younger than Prolog and is not supported by scads of textbooks. What materials there are on it are written in an incredibly terse academic style. Even the one significant textbook, "The Reasoned Schemer", is written in "programed style", descending as it does from "The Little LISPer", which was written when programed style was in vogue. I think it's gone out of fashion these days. Well, who'm I kidding. Books have gone out of fashion these days, haven't they? How can a textbook, with its dry and abstract written words, compare to the existential comfort of the virtual presence of a fellow primate, the soothing sound of their voice and the affect-inducing sight of their face in a training video? But I digress. If I continue in this vein, you'll peg me as a malcontent, and we all know what happens to malcontents. Need to get back on track!
So, miniKanren is, more or less, a relational programming language (though I've heard that some people call it "pure logic programming" but I'm not even going to go there -- see above.) The core thing is that you define relations, compose relations, and query relations. To query a relation, you need to search for all the satisfying solutions, because relations are, in the general case, many-to-many; there may be more than one solution, on either side.
miniKanren has one significant difference from Prolog: when searching for solutions, it uses breadth-first search (BFS). There are Prolog implementations that use BFS too, but doing so is far from standard (and frankly I've found them hard to find); the standard is depth-first search (DFS), because that's what SLD-resolution uses.
BFS has some nice properties here, like: miniKanren can find solutions even when the search space is infinite. It doesn't choke on "left-recursion".
miniKanren also has relational numbers. I suppose you could build relational numbers in Prolog too, but I've never seen that; it just uses regular machine numbers (for efficiency) and takes the hit that they do not participate well with the search mechanism. Prolog programs are often not designed to run in both directions. Using numbers is one place where Prolog's support for this is poor. Relational numbers support this much better. miniKanren programs often are designed to run in both directions.
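To illustrate what I mean by relational numbers, here is a sketch using Peano-style successor terms -- roughly the spirit of the thing, though miniKanren's actual representation is a cleverer binary one:

```prolog
% Addition as a relation over successor terms: it runs in any direction,
% unlike is/2, which needs its right-hand side fully bound.
add(z, N, N).
add(s(M), N, s(K)) :- add(M, N, K).

% ?- add(s(z), s(s(z)), X).    X = s(s(s(z))).                       % 1 + 2 = 3
% ?- add(X, Y, s(s(z))).       enumerates every way to split 2 into X + Y.
% ?- X is Y + 1.               by contrast, throws an error when Y is unbound.
```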
All this typing, and for what? For saying that I will probably write my novel generator in Prolog, and then, if it has problems with being run in both directions (which it very conceivably will), I shall try my best to translate it to miniKanren in the hopes of overcoming that.
Does minikanren support DCGs? They are, if I recall, part of ANSI prolog, and a very good fit for combination parser-generators conceptually (though I've never seen one that wasn't dog-slow even by prolog standards).
Does minikanren support DCGs?
It's a hard question to answer properly. miniKanren is still very much a research language. Here is its GitHub presence and here is its website, which lists major miniKanren-related works.
I think the best answer might be, "If you are interested in miniKanren and you want it to do X, you are expected to implement X yourself and write a short paper on it and submit it to a conference and give a talk on it."
So, unify X with DCG and there's your answer. Except of course no conference is going to accept a paper merely on something as well-established as definite clause grammars, so what you get when you unify X with DCG in the above tends to look something like this. (Which I will present without further comment, difficult as that may be to resist.)
What you don't get is a tidy 60-or-so-line Scheme source file that someone like myself would be able to easily build upon in this situation.
The good news is that it's easy to mechanically translate DCG syntax to Prolog, which I've done by hand here, and then mechanically translate the Prolog to miniKanren, which I've done by hand here. It will probably be of great benefit to not have to do it by hand this month, so I probably should write a translator for it at some point.
So where was I? Oh yes. A grammar defines a language and a language is a relation between utterances and meanings (that is, between syntax and semantics) and relational programming languages code relations directly, so it is unsurprising to find that when a grammar is coded in a relational language there are natural patterns this coding falls into. One such natural pattern was noticed long ago by Prolog researchers, and a custom-fitting syntax was proposed for it, and it was christened definite clause grammars (DCGs).
In a DCG, each production is a rule. The rule has arguments representing the part of the input string that has been parsed, and the part that has yet to be parsed. (In the custom-fitting DCG syntax, these arguments are hidden for convenience). Extra arguments beyond those required ones may optionally be defined on a rule.
When there are no extra arguments, a grammar in a DCG can be, at most, context-free. But if extra arguments are defined and supplied, they can be passed up and down between the rules, making the grammar context-sensitive. If further constraints are also applied to these variables, e.g. arbitrary Prolog rules, the grammar may become unrestricted.
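The textbook example (a sketch): the non-context-free language a^n b^n c^n as a DCG, where the hidden arguments thread the input and the extra argument N is the context being passed between the rules.

```prolog
abc      --> as(N), bs(N), cs(N).
as(z)    --> [].
as(s(N)) --> [a], as(N).
bs(z)    --> [].
bs(s(N)) --> [b], bs(N).
cs(z)    --> [].
cs(s(N)) --> [c], cs(N).

% ?- phrase(abc, [a,a,b,b,c,c]).       succeeds
% ?- phrase(abc, [a,a,b,c,c]).         fails
% ?- length(S, 6), phrase(abc, S).     S = [a,a,b,b,c,c]   (run as a generator)
```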
In an eerily similar way, an attribute grammar without any attributes is, at most, a context free grammar. Once attributes are defined and supplied, they can be passed up and down between the productions, making the grammar context-sensitive. If further constraints are applied to the attributes (usually the attribute grammar formalism leaves this option open, because, like DCGs, it was designed for practical programming and not for formal language theory) the grammar may become unrestricted.
In an attribute grammar we say some attributes are "inherited" and some are "synthesized"; in Prolog documentation we find some arguments described as "input arguments" and some as "output arguments". These correspond exactly.
I'll go even further: try as I might, I cannot find any meaningful characteristic which distinguishes an attribute grammar from a DCG. Any difference I might be tempted to mention, when I think about it for a moment, always turns out to be an implementation detail stemming from the legacy of either Prolog where DCGs have been implemented, or from compiler-compiler-type tools where attribute grammars have been implemented.
They are the same thing.
So. Fascinating and eye-wateringly technical revelation is fascinating and eye-wateringly technical. Not really a new discovery -- all along there were hints and allegations. But does it help?
Well, it helps me reach the two goals I've identified so far -- because now I can choose an attribute grammar, cast it as (I was going to say "convert it to" but we've seen now how that's far too strong a choice of words) a definite-clause grammar in Prolog, write a little bit of driver code alongside it, and end up with a generator that (a) generates a novel using an attribute grammar and (b) can recognize its own novels.
And, it might help in better understanding what it means to extend a context-free grammar to a context-sensitive one.
But why was I thinking about context-sensitive vs. context-free in the first place? Because I was thinking about plot. I was thinking about how many generators are basically context-free, often with a small amount of context-sensitivity thrown in. MARYSUE generates a "plot" by successively rewriting the existing storyline, but when you strip away the implementation details, it is not really different from a generator based on a context-free grammar (with a small amount of context-sensitivity thrown in).
Goal-wise, I would slightly prefer if my generated novel had at least a passing resemblance to a story. And there are still 3 weeks left. So it might be good to think about this part more.
I can describe the problem I see from another angle. In a story, characters get into scrapes. Then they get out of them again. Sometimes, they get into a scrape while they're in another scrape. (Our heroes' spaceship is tractor-beamed into a space station. While in the space station, they need to break a prisoner out of jail. While breaking a prisoner out of jail, they fall into a trash compactor.) This is a stack of scrapes that must be escaped in reverse order.
Or yet another angle. People use the phrase "plot threads" sometimes, and even though it's quite informal, surely it means something. If one were to try to investigate their structure? A plot thread starts at some point, and becomes apparent in the story at some point -- often the same point, but not always. It terminates at some point. (Perhaps some plot threads were red herrings and were never resolved -- either those were not really plot threads, or, if we insist they were, then they terminate, along with all their fellow plot threads, at the end of the story, whether they were resolved or not.)
Now, some plot threads must nest inside other plot threads. (Our heroes must escape the trash compactor, then the jail, then the space station, in that order.) But clearly they need not always nest inside other plot threads.
Instead of thinking of this as nesting, it would probably be better to think of it as a matter of prerequisites. Then we can think of it in purely abstract terms, not as a last-in-first-out stack, but as a dependency graph.
The purely formal question arises: Is a dependency graph capturable in a context-sensitive grammar? This has probably been studied under some guise somewhere; I'd need to find that guise and unmask it, though. (Insert Scooby Doo gag here.) And then the less-formal follow-on questions: is it capturable in a "nice way"? Is there an aesthetically pleasing way to embed it? Is there a way that is practical for the purposes of running an attribute grammar in the "generate" direction?
I hesitate to make this an actual goal, but I'll probably spend some time looking into it. I suspect the answer is not very complicated. This is of course merely a suspicion, at present.
I might also try looking up some synopses of stories, identifying the plot threads, and drawing their dependency graphs. Just to try to gauge what kind of complexity is typical for these.
Is a dependency graph capturable in a context-sensitive grammar?
The answer is yes -- it's almost trivial. (I did suspect it was straightforward but I wanted to think about it.) If A depends on B and C, then the production that sets A to true, has to require that B and C are both true first -- that's all.
Such an attribute grammar can be seen as the intersection of a CFG with a relational program that performs a particular fixed topological sort. A is a synthesized attribute and B and C are inherited attributes for this production.
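Sketched as a DCG rule (hypothetical names, of course): Done0 is the inherited set of already-resolved plot threads, and the rule that resolves A simply insists that B and C are in it before adding A.

```prolog
% "A depends on B and C": this production is only applicable once b and c
% are in the set of resolved threads, and it synthesizes the enlarged set.
resolve_a(Done0, [a|Done0]) -->
    [thread, a, is, resolved, '.'],
    { member(b, Done0), member(c, Done0) }.
```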
More generally, an attribute grammar (or equivalently, DCG) can be seen as the intersection of a CFG and a relation. And since a CFG also defines a relation, it can be seen even more generally as a way to take the intersection of relations. So for example, you could have the intersection of two (or more) CFGs.
Arranging things this way might make the presentation of the problem conceptually simpler, but what's harder is making sure this is something that can be run in both directions. In particular, recognizing a set of strings generally means "For every possible string, if the string is in the language, I'll say YES, and if the string is not in the language, I'll say NO". But if the language is not recursive, but only recursively enumerable, the best you can actually do is "If the string is in the language, I'll say YES, and if the string is not in the language, I won't say YES -- I might in fact just loop forever". The word that's generally used for this is to say the thing "accepts" the strings of the language, rather than "recognizes" them.
Even though I haven't been dealing with languages of this complexity, playing with attribute grammars like this inches closer to it. I just spoke of taking the intersection of two CFGs, and the question of whether the intersection of two CFGs is empty or not is undecidable, i.e. only recursively enumerable.
Even if we're careful to avoid such languages, there is still the possibility in a recursive language, that while we can tell the string is in the language in polynomial time (i.e. reasonably efficiently), we can only tell that the string is not in the language in nondeterministic-polynomial time (i.e. unreasonably inefficiently).
For short strings this might not make a difference, but for 50,000 words I think it definitely will.
And this possibility is quite real. I've run up against it when implementing sketches. miniKanren does not magically solve this. The fact that miniKanren numbers are relational is not unilaterally beneficial -- it now means we can combinatorially explode when searching for the right numbers, oh joy!
And miniKanren doesn't have Prolog's "cut" operator. As much as I abhor "cut" on an aesthetic basis, I can see why you might want it to overcome situations just like this. Even if I do the final thing in Prolog, I really don't want to use "cut".
So I am going to beg off from the stronger meaning of "recognize" and have the goal be a generator that can identify its own novels, where "identify" means just the same as "accept" does in formal language theory, hopefully just sounding a bit more intuitive to a general audience.
At least the inkling horde seem to be running low on sodium pentothal.
I read a "writing advice" blog post about plot threads; the author's idea of their structure generally agreed with mine. Their definition of a plot thread was roughly "anything that the audience will feel is unresolved", and, well, fair enough.
Since my self-imposed bar for narrative is quite low ("I'd prefer if my novel had at least a passing resemblance to a story") I've decided that a missing pet makes a good general-purpose plot thread. If you're reading a story and a pet goes missing, I expect that's something you will feel is unresolved.
So the attribute grammar from which I am planning to generate this 50,000-word travesty currently looks like this:
Story<K> ::= TheEnd<K>
| RunAwayEvent<K, K'> Story<K'>
| ComeBackEvent<K, K'> Story<K'>.
TheEnd<K> ::= "the" "end" { K = {cat, dog, gerbil} }.
RunAwayEvent<K, K'> ::= "the" Pet<P> "ran" "away" "." { P ∈ K, K' = K - {P} }.
ComeBackEvent<K, K'> ::= "the" Pet<P> "came" "back" "." { P ∉ K, K' = K ∪ {P} }.
Pet<P> ::= "dog" | "cat" | "gerbil".
It's not context-free. If it were context-free, then if the dog ran away and then the cat ran away, the dog would not be able to come back until the cat came back. But it can be the case that the dog ran away, then the cat ran away, then the dog came back, then the cat came back. So it's not context-free. You wouldn't be able to write this generator in [insert your favourite CFG-based framework here].
I've prototyped it in miniKanren and it does generate stories with consistently happy endings. The K attribute is the set of animals at home (K is for "kennel"). The P attribute is an animal. The constraints on the RunAwayEvent and ComeBackEvent productions prevent the Pet production from picking an animal that is not in the kennel, or that is already in the kennel, respectively. The constraint on TheEnd is what ensures it is a happy ending. The Story does have to be started with a full kennel (not shown) in order for the story to eventually reach TheEnd.
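For concreteness, a rough Prolog-DCG transcription of the same grammar might look something like this (a sketch, not the miniKanren prototype itself; the kennel K is kept as a sorted list so that sets compare equal):

```prolog
story(K)  --> the_end(K).
story(K0) --> run_away(K0, K1),  story(K1).
story(K0) --> come_back(K0, K1), story(K1).

the_end([cat, dog, gerbil]) --> [the, end].

run_away(K0, K1)  --> [the], pet(P), [ran, away, '.'],
                      { select(P, K0, K1) }.                    % P in K0, K1 = K0 minus P

come_back(K0, K1) --> [the], pet(P), [came, back, '.'],
                      { \+ member(P, K0), sort([P|K0], K1) }.   % P not in K0, K1 = K0 plus P

pet(cat)    --> [cat].
pet(dog)    --> [dog].
pet(gerbil) --> [gerbil].

% ?- phrase(story([cat, dog, gerbil]), Words).     % generate a kennel story
% ?- phrase(story([cat, dog, gerbil]),
%           [the, dog, ran, away, '.', the, dog, came, back, '.', the, end]).
```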
Certain hurdles still need to be overcome, such as scaling this up to 50,000 words, but this is progress.
I need me to stay late tonight. I really need to catch that red dot.
The problem with attribute grammars Clark Kent definite clause grammars Superman is that they're too powerful. Yes, you can write a context-sensitive grammar using them, but there's nothing but your good sense Green Kryptonite stopping you from going far beyond that.
It's a relational program with a token stream tacitly attached. Take away the token stream and it would just be a relational program, and in making one, you'd just be programming. It's an interesting development but I do wish it felt more like a grammar. (Not that generating a novel using just a grammar is a good idea, of course, but it's appealing, at least to me, for other reasons -- probably bad ones.)
So one option that should be obvious is to implement the formalism of context-sensitive grammars as it's stated. There's absolutely nothing stopping me from doing this, and it directly addresses the problem. So why not?
The first meaningless objection is that the CSG definition doesn't seem to map well to "the usual" implementation methods. Whereas in a CFG your productions look like this:
A ::= "B" C "D".
with exactly one thing (a nonterminal) on the LHS, in a CSG we have productions that look like this:
"foo" "bar" A Z ::= "foo" "bar" "B" C "D" Z.
i.e. several things on the LHS. When there is only one thing on the LHS, it doesn't take any imagination to map it to a named thing in a program; a function or a procedure or a relation. When there are multiple things, we need to devise some kind of method.
Two methods suggest themselves. One is to note that, by the definition of CSGs, the extra junk on the LHS is the same extra junk as on the RHS, so you could frame the rule slightly differently, maybe like
"foo" "bar" [ A ::= "B" C "D" ] Z.
and maybe that would help. In an attribute grammar, for example, any time you descend into this production (which we can now call A, after the one nonterminal on the LHS), you need to pass down two attributes containing context, from the LHS and RHS. And this production constrains the LHS context to be "foo" "bar" and the RHS context to be Z.
This would involve either a discipline for writing attribute grammars in this way, or some kind of tool that generates the attribute grammar, given a CSG description.
The other method would be to store both terminals and nonterminals in a big string of terminals and nonterminals and basically treat this as a rewriting rule. This would work, and the idea extends naturally to unrestricted grammars too, if you wanted to go that far Phantom Zone.
And, with a good design, this would be reversible; the same rules that rewrite a string containing a single nonterminal (e.g. Sentence) into a string of terminals (e.g. the dog ran away) could just as easily be applied in the other direction, to rewrite the string of terminals into a goal nonterminal (i.e. Sentence) and report that the string was parsed successfully.
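A rough sketch of a single such rewriting step, with a couple of toy rules (sentence and pet standing in as nonterminals):

```prolog
rule([sentence], [the, pet, ran, away]).     % Sentence ::= "the" Pet "ran" "away"
rule([pet],      [dog]).                     % Pet ::= "dog"

% step(S0, S): S is S0 with one occurrence of some rule's LHS replaced by its RHS.
step(S0, S) :-
    append(Before, Tail0, S0),       % split off an arbitrary prefix
    rule(LHS, RHS),
    append(LHS, After, Tail0),       % the LHS must occur right here
    append(RHS, After, Tail),        % splice in the RHS
    append(Before, Tail, S).

% ?- step([sentence], S).               S = [the, pet, ran, away].
% ?- step([the, pet, ran, away], S).    S = [the, dog, ran, away].
% Swap the rule/2 arguments to run the same search in the contracting (parsing) direction.
```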
These are just ideas. I don't know what I'll have the time and energy and inclination to actually build in what remains of November.
And the red dot keeps moving still! Even CSGs are ridiculously powerful -- I am given to understand that determining membership in the language of a CSG is a PSPACE-complete problem. Wikipedia has some reasonably good information about PSPACE-completeness; a number of board games and puzzles fall into this class. Is generating (or comprehending) a (satisfying plot for a) novel -- assuming we could reduce such a problem to some essential core with a formal structure, which of course we can't, but do play along Jimmy Olsen -- enough like a board game or puzzle that it also falls into PSPACE?
The second meaningless objection (which I totally forgot to mention) is: who wants to program in a grammar anyway, though? The contortions involved in taking what you want to say and formulating it as a context-sensitive grammar are not the sort of thing you think of as "programming". They're the sort of thing that's behind those gawd-awful "constructions" found in proofs in papers on computational complexity and the like. Yes, there are some people who enjoy writing these sorts of things, but no, I'm not one of them (I might design esolangs but I don't code in them much) and while I do enjoy reading them sometimes (the construction in that paper by Ladner about NP-intermediate classes is a real doozy) and I suppose it would be an impressive NaNoGenMo entry in some sense, it's not really what I'm going for.
(But to consider what it might look like, in more detail -- for the missing-pets novel at least -- I'm sure some of the same techniques as you see in languages like Thue apply. You want to carry along some state, but you don't have the luxury of being able to give it a name, or even to look in another partition of the data for it. So, like a handyman's toolbag, you keep it next to you at all times. When you move, you move it along with you. That is, at least, how I would attack it.)
My point, in reaching for a CSG, was to have a generator based on a mathematical object that is "nicer" than a typical computer program. "Nicer" meaning simpler in structure, more tractable, easier to analyze, more composable. Well, a relational program is already not bad on that score. And, a CSG is maybe not as great on that score as one would like -- certainly CSGs aren't as tractable etc as CFGs. So let's not rush in here.
On another leg of this journey on the Logorrhea Express (choo choo) I mentioned that two methods suggested themselves: implementing a CSG processor directly and examining how to translate CSGs to attribute grammars. They're both fascinating rabbit holes and I almost went down the former one. But I came into this month with a full set of pre-loved rabbit holes in my backyard already, and I don't think I could do it justice in what's left of the month. The latter is a little closer to addressing what I've written above. It's more like, how can we apply a little discipline to attribute grammars, to try to keep them "nice", whether or not they map closely to a CSG, because in the end what we care about is the "nice" and not the fact that the grammar happens to be a CSG.
Actually, before going further, I should amend some of my own exaggerations about attribute grammars. An attribute grammar can be used to make the underlying grammar more powerful, like adding context sensitivity to a context-free grammar; but it doesn't have to. It can just derive semantical content from the parse tree, without affecting whether the input is considered syntactically well-formed or not. It's just that, in the usual applications, it's handy to track some of that semantical content as "context" and notice, early on, when context that has been deemed necessary (e.g. agreement in natural language, variable declarations in a programming language) is not in its proper order.
It's interesting to read a chapter of a compiler textbook that covers attribute grammars, now that I've decided (mostly correctly, I think) that they're essentially the same as DCGs. In some cases an attribute grammar may be "many-visit" -- it may visit the nodes of the parse tree more than once -- well, that's backtracking, isn't it. And any L-attributed grammar may be transformed into an S-attributed one by deferring the computations; have the synthesized attributes return data structures whose missing values will be provided by some higher-level production, and when these values are provided, the computation can proceed -- gosh, that sounds quite a bit like unification. The thing is, it's all rather ad hoc and oriented towards doing things with as little backtracking as possible. Which looks quite different from a milieu where backtracking is just an accepted fact of life.
Now for the bad news: almost nothing I've said above scales up to 50,000 words.
My own fault, for choosing a set of goals that are not really mutually compatible.
I remember now why I haven't programmed in Prolog in years. It fails to appeal to me because the "prolog" always seems to end up more "pro" than "log". That is, it provides facilities that violate the purely relational (or "logical", if you prefer) paradigm, by working on the operational level of the execution. Prolog provides these facilities so that people can write Prolog programs with performance that doesn't suck. miniKanren does not provide any such facilities. miniKanren's performance sucks.
I implemented the attribute grammar shown above, the one for the (context-sensitive, as you'll recall) "kennel story", in miniKanren. It was a straightforward translation. It can generate kennel stories, and it can recognize when it's given a correctly-formed kennel story.
When miniKanren returns answers, it returns all the answers it can find, and it returns them in order, i.e. the answers it found first, it returns first. It is basically doing a breadth-first search, so the answers it finds first are the shortest answers. If you want long answers, you can add a constraint to that effect; but you have to wait for it to consider and reject all the shorter answers first.
So the kennel story relation in miniKanren takes about 8 seconds on my machine to generate a 10-event (roughly 40-word) story. For a 12-event story, it takes 45 seconds. For a 14-event story -- five and a half minutes.
This is not the way to reach 50,000 words.
So I made a reformulation of the attribute grammar to try to make it more efficient -- divide and conquer. Instead of describing what is basically a linear sequence of events, the new grammar describes a tree of events (well, groups of events). This is more efficient. In 8 seconds, it can generate 64 events. To generate 128 events, it takes 28 seconds. To generate 256 events, it takes two minutes.
So that's better... but if you extrapolate from those numbers, this is still a very painful way to reach 50,000 words. I'd need to start generating on, say, November 24th in order for it to get finished on time. And hope that there's no power outage or other physical problem in that time.
So I am still looking at alternatives.
Would Prolog do any better? It uses DFS, not BFS, so it could find a long answer quickly, yes? Yes, but... those facilities that it provides for performance, the ones that violate the relational character -- they also interfere with running the programs "backwards". One of those facilities is Prolog numbers. And the obvious way to measure if you have at least 50,000 words or not, is to use the number 50,000. Maybe there are clever ways around this, but I have not yet been able to write the kennel story attribute grammar relation in Prolog in a way where I can run it both forwards and backwards.
I can make another clever reformulation of the relation, which might help make it generatable in miniKanren in reasonable time (still testing this), but it would also make the definition of the relation a bit ugly and expedient-looking, which raises a serious question: if I'm willing to contort my definition in order to accommodate the realities of executing it on a computer, why do I even want to write it as a relation in the first place? If I'm OK with "a bit ugly" I could be satisfied just writing it as a pair of imperative programs, one to generate and one to recognize, no? Because that's a bit ugly, for sure, but it gets the job done. (I actually have some further thoughts regarding this but I'll save them for later.)
There are other optimizations I could probably try to make at the code level -- basically, implement it more efficiently. I will probably try one such idea as soon as I understand it better. I'm not sure how much hope I should put in this though. This is a problem of scaling rather than of raw execution speed.
There are also experimental variations of miniKanren that I could look into. Resorting to using an experimental language for this seems a bit extreme though.
Or - I could just settle for generating a few thousand words, and tack on tens of thousands of "junk meows" to reach the 50,000 words goal, right? Ah, but that gets to the heart of why I care about NaNoGenMo in the first place. As was pointed out by @MichaelPaulukonis many years ago, NaNoGenMo isn't just a generative art jam, it's a long-form generative art jam. So a NaNoGenMo generator should generate long-form generative art, no? It can do that by generating lots of independent instances of short form generative art and pasting them together, sure, but how much does that really get to the true spirit of long-form generative art? To me, the best NaNoGenMo works have a single coherent context that runs through the whole thing, making it a single long form.
And for 50,000 words, that is actually a very challenging thing to pull off. And here, I'm just running up against a particular technical instance of that challenge.
So, yeah, the prospects at the moment look bleak, and this project of mine might not achieve its goals, but I wouldn't have it any other way.
A pivot has occurred and I should probably apprise y'all of that, of the new situation on the ground and all that.
But first I should correct some more sloppiness found in my previous missives.
The "kennel story", as written in that attribute grammar I gave, does not actually need a context-sensitive grammar. There are only 3 animals, so there are only 2^3 = 8 possible states the kennel can be in. So you could actually write a context-free grammar (or a regular grammar in fact) with 8 productions, where each of these productions represents one of these states, and has a number of transitions to the possible next-states of the kennel.
But the kennel story as written is not the kennel story that I was trying to get at. If there is no bound on the number of animals in the kennel -- for instance, say there's, potentially, a dog and a brown dog and a light brown dog and a light light brown dog with dark dark dark black spots, etc etc in the kennel at the beginning of the story -- then this does require a CSG, because there's no finite number of productions you can choose to represent the transitions between kennel-states.
There's two things though.
The first is, this is not quite the same context-sensitive language as I was thinking of when I wrote up that attribute grammar, because I was misremembering what I had seen in the literature. So I might switch to a simpler story, with a more well-known basis, where there are only two animals, but when they run away, they do it in degrees. They run away, then they run further away, and then they head back, and finally they arrive back home.
(Incidentally, I think "The Credible Journey" might be a good title for this story.)
(And even this might prove too much and I might have to settle for an even simpler novel. But we'll see.)
The second is, even though the kennel story as it's currently written only needs a regular grammar, I'm still not sure which grammar-based story generation tools out there would be able to handle it, because it is a non-deterministic grammar: at the end, all the animals do need to be back in the kennel. I suspect most tools don't handle this by default, and you would need to rewrite it as a deterministic grammar to use them. But I haven't really looked into this.
So yeah, that pivot.
My goals this year have included generating a story from an attribute grammar. I've done that now, though not to 50,000 words.
My goals have also included defining a novel scheme using a relation -- that is, a grammar -- that is, a logic -- and then coding it in a relational language and having it be run both "forward" to generate a novel from that scheme and "backward" to recognize those novels that could have been generated by that scheme.
(This grammar wouldn't necessarily have to be an attribute grammar, but it would be nice for it to be non-context free. It would be less satisfying somehow if it were merely context-free.)
I tried coding such a relation in two relational languages, Prolog and miniKanren. I saw how much I would need to contort it to make it executable in both directions in miniKanren with a reasonable execution time. And I saw how much Prolog was inherently contorted away from the potential of running programs in both directions.
This left me looking elsewhere.
So I wrote my own relational engine. You give it a relation in the form of a grammar (which can be an arbitrary grammar -- context-free, context-sensitive, or even unrestricted), a string to start from, and a direction to process the string in. It then processes the string, creating rewritings of it following the grammar productions. When it has finished translating all the non-terminals to terminals or vice versa, it outputs the result.
Once I got this working in its most basic way, I extended it to be able to generate 50,000 words without taking an eternity nor exhausting all the memory of my machine. This was done by adding a search heuristic (search for a long string of non-terminals, then once that's found, search for strings that are mostly terminals) and implementing beam search (BFS, but throw away all but a handful of the candidates, to keep the memory footprint reasonable.)
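In outline, the search loop looks something like this -- a loose sketch of the idea, not the engine itself; rewrite/2, score/2 and all_terminals/1 are hypothetical stand-ins for one grammar-licensed rewriting step, the heuristic, and the done-test:

```prolog
% Beam search: expand every candidate string by one rewrite, rank by the
% heuristic score (lower is better), keep only the best Width candidates.
beam(_Width, Candidates, Result) :-
    member(Result, Candidates),
    all_terminals(Result), !.
beam(Width, Candidates0, Result) :-
    findall(Score-Next,
            ( member(C, Candidates0), rewrite(C, Next), score(Next, Score) ),
            Scored),
    keysort(Scored, Ranked),
    strip_keys(Ranked, Nexts),
    take(Width, Nexts, Candidates),        % throw away all but a handful
    beam(Width, Candidates, Result).

strip_keys([], []).
strip_keys([_-V|Vs], [V|Ws]) :- strip_keys(Vs, Ws).

take(N, Xs, Prefix) :- length(Prefix, N), append(Prefix, _, Xs), !.
take(_, Xs, Xs).

% ?- beam(50, [[story]], Novel).    % start from the single nonterminal "story"
```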
Because it is a very general, non-specialized algorithm, it would still take hours to generate a 50,000 word novel from a context-sensitive grammar. And not much less time than that to parse it.
But concocting context-sensitive grammars is arduous, certainly not something I want to spend my time on; so the examples in the literature are almost always simple ones. A novel generated from such a grammar would be a trivial thing, and would take hours to generate. I honestly don't want to spend all those cycles on such a task; I'm satisfied the relational engine basically works, and the output would prove very little. I don't even want to link to it here. If you read this far you know it does the needful by using beam search, and that is literally the only edifying thing about it, so there's nothing more you need to know.
So where does this all leave me? I dunno. There are other ideas, other paths that could be followed (good grief, are there! there are enough rabbit holes here to fill three lifetimes' worth of TODO lists.) But not with only a week or so left. There'd be too much pressure, and I don't want to be under more pressure actually thanks.
I don't really need another green "Completed" label, do I? Well no, but, you know, force of habit, point of pride, finish what you started and all that.
Fifty thousand meows, anyone?
I don't really need another green "Completed" label, do I? Well no, but, you know, force of habit, point of pride, finish what you started and all that.
No, taking it on the chin here would be much more honest. This year, NaNoGenMo has defeated me, and I will leave defeated.
Closing ticket as WONTFIX, reason: EBARTOOHIGH
A cockamamie epilogue to this cockamamie episode, for anyone who might still be listening:
I did eventually manage to clear the bar I set for myself. It just took a lot longer than November -- it took until June. I now have, though:
I also learned a lot about context-sensitive languages in the process.
The grammar formalism and the tool are still in a primitive state; there are lots of avenues along which they could be developed. Randomness during generation, for example, is not yet well supported, and the tool cannot ensure the language described by the grammar is actually a context-sensitive one rather than something even more complex.
I might turn my attention to these things and try to enhance and use this formalism and tool for NaNoGenMo 2023. The kennel story is trivial; how far could this be pushed? Could one write "The Swallows" in this? Or "A Time for Destiny"? Or "The League of Extraordinarily Dull Gentlemen"? If not, how close could one come?
Or, I might not. I do have a couple of other ideas that I might like to try out instead.
I have therefore titled my issue accordingly.
I may continually update this issue as I work over the course of the month of November. I will feel free to post dev diaries, sample output, etc.