Zistack / program-modeling-language

A project with the goal of designing and implementing a program modelling language that allows for the efficient evaluation of arbitrary static assertions about the program's behavior.

What are you trying to model? #1

Open JacquesCarette opened 3 years ago

JacquesCarette commented 3 years ago

I've read all the pieces on the wiki, and I'm still missing the bigger picture that would allow me to "get" it.

The pieces that I think I understand:

Zistack commented 3 years ago

The pieces on the wiki are a small fragment of the overall picture. I've yet to get all of my ideas down on paper, as many of them build on each other, and I am already struggling to find suitable descriptions of what I've got so far.

Let me see if I can give you a big picture view of how I am approaching the design of this language.

The fundamental goal here is to come up with a language in which programs are efficiently canonizable (for some definition of 'efficient'). I want to make queries like "Is this program a more specialized version of that program?" and "Is this program the same as that program?" cheap.

A program, in its purest form, is basically just a map from some set of inputs to some set of outputs. What I am trying to do is come up with a language that allows us to describe those mappings in a canonizable way, so as to support the cheap evaluation of the aforementioned queries, while still being expressive/powerful enough to actually represent real programs. I am targeting the strongly normalizing class of languages in power, though I don't know that a non-functional programming language can actually meet the definition of strongly normalizing as it is commonly given.
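As a toy illustration of what "cheap" means here (just a sketch of the general idea in throwaway Python, nothing resembling the actual language): if every program has a canonical form, then the equality query reduces to canonizing once and comparing syntactically.

```python
# Toy sketch: "programs" are expressions built from variables, integer
# constants, and ('add', ...) nodes. If terms can be canonized
# (flatten, fold constants, sort operands), then "is P the same
# program as Q?" becomes a cheap syntactic comparison.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

def canonize(term):
    """Return a canonical form: flattened, constant-folded, sorted."""
    if isinstance(term, (Var, int)):
        return term
    op, *args = term
    assert op == 'add'
    parts, const = [], 0
    stack = [canonize(a) for a in args]
    while stack:
        a = stack.pop()
        if isinstance(a, int):
            const += a
        elif isinstance(a, tuple) and a[0] == 'add':
            stack.extend(a[1:])
        else:
            parts.append(a)
    parts.sort(key=repr)
    if const != 0 or not parts:
        parts.append(const)
    return parts[0] if len(parts) == 1 else ('add', *parts)

def equivalent(p, q):
    return canonize(p) == canonize(q)   # cheap once canonized

# (x + 1) + y and y + (x + 1) canonize to the same term.
p = ('add', ('add', Var('x'), 1), Var('y'))
q = ('add', Var('y'), ('add', Var('x'), 1))
assert equivalent(p, q)
```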

The parts that I have described so far are ultimately about describing relationships between different parts of a program's state. I am currently building a language that just describes the relationships between inputs and outputs, without worrying about how exactly the computing machinery will actually evaluate these things as programs. As such, I have not given a computational interpretation to the primitives documented in the wiki - only relational ones. Moreover, I haven't yet talked about how you would label different parts of state as 'inputs' or 'outputs', as that would rely on features that also haven't been specified yet.

Aside: I am thinking about the evaluation machinery, of course, as I would like to build real tools for this language someday, but the interesting analysis properties of the language do not depend on knowing about how computation is performed.

Here's an overview of the current state of the core language design itself.

Variables and enums form the backbone of the language's ability to talk about structures and values, respectively. I've focused more on structures than values up to this point, so there is more written about how to apply relationships to variables than there is about applying relationships to enums. I need to define how relationships over enums are given, and how they may be used. I also need to define how primitive structures, values, and relationships can be composed to define new sorts of structures, values, and relationships. A proper definition of what a 'program' is is waiting on this particular bit.

The most recent additions (the encoding-related stuff) allow for relationships between data structures expressed in the language's structural primitives and data structures encoded as data. This covers the sorts of relationships that you are ultimately describing when writing parsers and such.
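For a flavor of the sort of relationship I mean (a hypothetical example in throwaway Python, not the actual encoding primitives): relating a structured value to its encoding as flat data, the way a parser/printer pair does.

```python
# Hypothetical example of a structure<->data relationship: a list of
# small naturals related to its length-prefixed byte encoding. The
# relationship is the set of (structure, data) pairs where
# decode(data) == structure; encode and decode witness the two
# directions a parser/printer pair usually provides.

def encode(xs: list) -> bytes:
    return bytes([len(xs), *xs])

def decode(data: bytes) -> list:
    n, body = data[0], data[1:]
    assert len(body) == n
    return list(body)

xs = [3, 1, 4, 1, 5]
assert decode(encode(xs)) == xs
```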

I still need more sophisticated structural relationships that support grids and multiplication and such. I also need relationships for supporting things akin to graph traversal operations and reduction operations over multi-dimensional structures.

I have pretty good ideas about what sorts of machinery I want to include for all of these things, but I have yet to figure out how to specify it all.

I hope this helps.

JacquesCarette commented 3 years ago

That helps a lot!

You do know that

The fundamental goal here is to come up with a language in which programs are efficiently canonizable

is wildly undecidable, right? That doesn't mean you shouldn't try, but rather that you should either work on programs where this is feasible, or "build in" that this is going to be a partial operation. Ah, you do seem to know this, since you later say

I am targeting the strongly normalizing class of languages in power

So I would take a specific such language as a starting point. That will help make things concrete. More on that later.

A program, in its purest form, is basically just a map from some set of inputs to some set of outputs

True - but not necessarily useful. Pursuing this view too naively leads to considering the graph of a function as a set of pairs. In classical math, with excluded middle (and a bit of choice), this all works out fine, because you're not worried about effective computation. Whereas here, the exact opposite is what you want.
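A contrived sketch of the tension (throwaway Python, my own example): two programs whose graphs are equal as sets of pairs, where classical math is satisfied immediately and effective computation is not.

```python
# Two programs with the same input/output graph but very different text.
def f(n: int) -> int:
    return n * (n + 1) // 2

def g(n: int) -> int:
    total = 0
    for i in range(n + 1):
        total += i
    return total

# Classically, f = g because their graphs (as sets of pairs) are equal.
# Effectively, we can only ever sample the graph; deciding equality
# from the program text is where all the difficulty lives.
assert all(f(n) == g(n) for n in range(100))
```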

The parts that I have described so far are ultimately about describing relationships between different parts of a program's state.

Now I better understand the clash. Thinking in terms of state makes a lot of sense for imperative programs. It still makes sense for looking at the unfolding of a functional program, but isn't necessarily as natural there. More fundamentally, though, it is perhaps not where the interesting stuff is going on.

Which you seem to know: this is most likely why you've been talking about various graphs all along, which would capture the relations between (parts of) the state. And relations between relations, since you were talking about higher-order things too. Is that correct?


My impression is that you've jumped ahead too far, too fast, with your language design. [I'm not saying any of it is wrong, BTW.]

What I would do, if I were trying to do this (and I have done some of this in the past, but nothing published) is:

  1. fix an idealized programming language whose equivalences I want to analyse. Start with very few features.
  2. write out some (small) programs in that language, in pairs. Make some of the pairs equivalent, and others subtly inequivalent (see the sketch after this list).
  3. hand write the reasons why the conclusion is true for each pair. The more detailed, the better.
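
For step 2, here is the flavor I have in mind (throwaway Python, purely to fix ideas):

```python
# An equivalent pair: the justification is one algebraic fact,
# applied pointwise in the same order.
def p1(xs):
    return [x * 2 for x in xs]

def p2(xs):
    return [x + x for x in xs]

# A subtly inequivalent pair: swapping two writes is only sound
# when the targets never alias.
def q1(a, i, j):
    a[i] = 1
    a[j] = 2
    return a

def q2(a, i, j):
    a[j] = 2
    a[i] = 1
    return a

# q1 and q2 agree whenever i != j, and disagree when i == j:
assert q1([0, 0], 0, 1) == q2([0, 0], 0, 1)   # [1, 2] both
assert q1([0, 0], 0, 0) != q2([0, 0], 0, 0)   # [2, 0] vs [1, 0]
```

Step 3 would then record the reasons in detail: an algebraic identity applied pointwise for the first pair, an aliasing side condition for the second.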

Just yesterday, I came upon the very recent paper Practical Normalization by Evaluation for EDSLs, which is strongly related.

Zistack commented 3 years ago

So I would take a specific such language as a starting point. That will help make things concrete. More on that later.

Yeah, I can't do this. I would have to start by removing the lambdas, which sort of defeats the approach.

Which you seem to know: this is most likely why you've been talking about various graphs all along, which would capture the relations between (parts of) the state. And relations between relations, since you were talking about higher-order things too. Is that correct?

Yes, exactly. I haven't really gotten into the higher order relations yet, but they'll be there. The structure of the relationships between the data is where all of the interesting information in a program lies. You'll notice that my variables say essentially nothing about a program's possible states, but instead merely provide a handle by which to refer to the intersection of different relationships.

My impression is that you've jumped ahead too far, too fast, with your language design.

Some additional context: this particular language design is attempt #3. I discarded the two prior attempts because they ran into fundamental problems that simply could not be overcome within their approach. I'm also something like 7 years into the project now. I have not been moving all that fast.

The good news is that I believe that this attempt will actually succeed. I recently (about the same time I added the encoding primitives) figured out how I would go about solving the last major problem - recognizing and eliminating redundant computation - that stands between me and representational transparency. In other words, I've all but accomplished my objective (the 1% inspiration part is complete; now it's time for the 99% perspiration). The only real issue that I am having right now is that I don't know how to communicate what is in my head in a way that other people will understand.

This language design is coming from a fairly advanced understanding of the constraints that the language needs to satisfy in order to actually have the desired properties. I would just write about the constraints first, but because I lack a suitable language for talking about the structures of languages and their consequences, I don't really know how to communicate these constraints, or why they're the right constraints, without referring to some kind of example. I figured that formalizing and documenting the language that I had been developing in my head to play around with these ideas would give me some way of talking about the reasons behind the design choices that I have made, and the constraints that I am aiming to satisfy. In a way, I've found myself in a sort of chicken-and-egg problem, due to how thoroughly detached this project is from convention.

:thinking: Though, since I actually do understand that I've found a more-or-less complete set of constraints as of very recently, I might have a few more options for talking about them now...

What I would do, if I were trying to do this (and I have done some of this in the past, but nothing published) is:

  1. fix an idealized programming language whose equivalences I want to analyse. Start with very few features.
  2. write out some (small) programs in that language, in pairs. Make some of the pairs equivalent, and others subtly inequivalent.
  3. hand write the reasons why the conclusion is true for each pair. The more detailed, the better.

I am not sure how this helps me communicate my ideas to others. I could sort of explain why two Java (probably not what you have in mind as an 'idealized' programming language) programs are equivalent, for example, but that does nothing for explaining how the language I actually want to design works, or the constraints that a representationally transparent language would need to satisfy.

Just yesterday, I came upon the very recent paper Practical Normalization by Evaluation for EDSLs, which is strongly related.

It may be related, in principle, but the contents do nothing to help here. Too many of the fundamental assumptions about the underlying languages involved violate constraints that need to be satisfied in order to achieve the desired properties.


If I were just left to my own devices, most likely I would just continue specifying my language as best I could. Eventually, I would get a clear enough picture written out that I would start building tooling for manipulating these expressions and ultimately implementing the canonization and inference algorithms of interest. I doubt that this would be particularly satisfactory to the PL community, and I don't think I would be all that pleased with not having nice formal definitions/proofs of the desired properties either, even if I am otherwise confident in my arguments. I'm also not convinced that being able to view such an implementation would go very far toward helping people understand why any of it works.

JacquesCarette commented 3 years ago

Ok, good. I think a lot of communication has happened.

I'm willing to continue to be a guinea pig for your explanations. I'll continue prodding too.

Regarding "taking a specific total language for a starting point": Ok, I think I understand why you don't want to do this, and I think it makes sense too. So that was a dead-end. Nevertheless, I think the basic idea is sound: pick some language (languages?) that will be the target of your analysis. Otherwise it's all going to be much too vague, and no one beyond you will be able to understand what you're trying to do.

Re: "moving too fast". Thanks for the extra context. Clearly I chose the wrong words to express myself too. What I mean is that your explanation feels like it's jumping ahead. The issue here is likely one of communication: too much tacit knowledge in your head that is needed to understand what you're trying to do. So take my questions as being more along the lines of "I don't understand, please give me something more concrete". In return, I'll assume that you've got 7 years of work done on this and are on your 3rd try ;) .

The only real issue that I am having right now is that I don't know how to communicate what is in my head in a way that other people will understand.

This is where I'm willing to continue to be your guinea pig. I certainly have the capability to understand what you're doing.

I figured that formalizing and documenting the language that I had been developing in my head to play around with these ideas would give me some way of talking about the reasons behind the design choices that I have made, and the constraints that I am aiming to satisfy.

I recently ranted on twitter about this backwards choice of narrative in an otherwise good paper presenting a new DSL... I then linked some papers that I thought did it well. Instead, start from the other end: give concrete examples of what you're trying to achieve, and why that's a good thing. Then distill requirements. Then formalize and document the language.

I could sort of explain why two Java (probably not what you have in mind as an 'idealized' programming language) programs are equivalent, for example, but that does nothing for explaining how the language I actually want to design works, or the constraints that a representationally transparent language would need to satisfy.

That's correct, and that's because there are steps in between that need to also be filled-in for that to make sense. Sure, Java, why not. The key is to pick two Java programs whose reason for being equivalent has an explanation that reveals some of the requirements of your language. The selection of programs would be far from random, but picked expressly to illustrate the problems that you now know how to solve. A second step would analyze the various ad hoc explanations to show their underlying structure. And that would be where your language design comes in.

Does that make sense? It's kind of a Socratic approach whose end goal is to explain your design and the constraints it must satisfy. But, to get there, you need to start from "the problem space", in a way that is concrete. At least, I've found that that is the surest way to communicate effectively. Some people (I'm thinking of people like Simon Peyton Jones in particular) are grand masters at taking esoteric PL problems with complex solutions, and giving a crystal clear, concrete illustration of why the problem is interesting and needs to be solved.

Re: the NbE paper. Now that I understand you want to deal with imperative, effectful programs, I also understand that it doesn't help at all. It's still a great paper... just not really relevant.

Zistack commented 3 years ago

Nevertheless, I think the basic idea is sound: pick some language (languages?) that will be the target of your analysis.

This is ultimately what I was trying to do by spending more effort on formalizing the one that I've been using in my head. The problem with existing languages is that they all fail to meet important constraints, usually at a pretty fundamental level. It is quite possible that few to none of them will ever be efficiently analyzable, even by way of establishing a mapping to the language that I would design (or another one possessing the same properties of interest).

Re: "moving too fast". Thanks for the extra context. Clearly I chose the wrong words to express myself too. What I mean is that your explanation feels like it's jumping ahead. The issue here is likely one of communication: too much tacit knowledge in your head that is needed to understand what you're trying to do. So take my questions as being more along the lines of "I don't understand, please give me something more concrete". In return, I'll assume that you've got 7 years of work done on this and are on your 3rd try ;) .

Apology accepted. Tacit knowledge is definitely to blame. There's stuff that I've spent a great deal of time thinking about that I haven't even gotten around to mentioning to you yet.


On that note: I don't know how useful it will be, but you might be interested in knowing a little about how this process started, and about the other two attempts. If nothing else, it can give you an idea of what sorts of problems I've considered, and what sort of information I've gathered along the way for some shared context.

It started with wanting to build a thread-pool/task-based framework for writing programs, one that would let some tooling handle some of the burdens of dealing with concurrency for the programmer. At the time, I was operating mostly in C. In particular, I was interested in some burdens that aren't talked about as much as correctness - efficient work chunking and scheduling. Unfortunately, the basic unit of such a program would be completely opaque to pretty much any compiler of a standard language, so I realized that I needed a new compiler, and a new language to go with it - one that would let me see the structure of the flow of data through a program, so that the compiler could actually manipulate and optimize that structure. This was the fundamental premise of attempt #1, which resembled a functional programming language based off of the pi calculus.

Attempt #1 failed miserably when I started thinking about arrays and references. Basically, as soon as the structure of your data becomes encoded as data, you can no longer reason about it - though I didn't realize at the time that this was the exact failure I was dealing with. What I did realize was that the properties I needed in order to accomplish this goal were not specific to generalized automatic parallelization at all; they were basically about being able to ask about program equivalence. I already knew about some of the pitfalls I might run into if I pursued that route, knowing about the halting problem and having read about past attempts at building knowledge databases (knowledge compilation), which were typically aimed at making the analogous operations efficient for whatever logic they were using. I also realized that dealing in terms of numbers, indices, and memory was a dead-end, so attempt #2 was a language based on graph rewriting.
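To illustrate that failure mode concretely (a toy example in ordinary Python, not attempt #1's actual notation): as soon as a structural link is routed through a runtime index, it vanishes from the program text.

```python
# Structural version: the connection between x and the result is
# visible directly in the text of the program.
def structural(x):
    y = x + 1
    return y

# Encoded version: the same connection exists only if the two index
# expressions happen to coincide at runtime, which depends on
# arbitrary arithmetic the analysis cannot see through.
def encoded(x, i, j, n):
    heap = [0] * n
    heap[i % n] = x + 1
    return heap[j % n]   # is this x + 1? only if i % n == j % n
```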

Aside: I did not learn about the strongly normalizing class of languages until I was part-way through attempt #2. I knew I wouldn't be designing a Turing-complete language when I started, but I didn't really have words for what I was doing. I figured I'd just build something that seemed usable for real programs while also being analyzable without worrying about it too much.

Attempt #2 was interesting, since that's when I learned about some of the limitations of shape analysis using pure graph grammars. Turns out that context-free and context-sensitive graph grammars are the same class. Also, regular graph grammars aren't powerful enough to express real programs. Oh, and the problem of determining whether or not two context-sensitive graph grammars are equivalent is undecidable, in general. I played around with the idea of trying to build some sort of hybrid grammar system for specifying programs, but ultimately I think I never managed to escape the fundamental flaws in the foundation that give rise to the undecidable nature of context-sensitive graph grammar equivalence. Fortunately, besides learning that useful bit of context about graph grammar equivalence (which did come in handy later), I also realized something else. I was still capturing way too much information about the way that the program was evaluated.

I distinctly remember sitting in my office one day (I was in grad school at the time) thinking about what information is actually encoded by a program when it is viewed as a black box by an observer. I realized that, no matter how you sliced it, you could always distill it down to the structure of the relationships between abstract nodes whose distinct identity is their only interesting property. The trick was dealing with the intermediate structure. I already knew that I needed to canonize my programs (I had figured that out sometime during attempt #2), but now I had an idea of what information I really needed to be focusing on. The concept that I have been calling the variable in these notes was conceived at this point.

At first, I started focusing on just talking about the sorts of mappings that I could describe between the nodes of these variables and such. I even half-implemented a prototype. I realized pretty quickly that I was on to something. I had discrete parts that were arranged in structures that were actually pretty easy to compare to other structures. Instead of establishing a baseline for comparison, I really just had to worry about normalizing things that should already be pretty close to what I'm looking for. I just needed about one more insight to get to considering the language that you would recognize in these notes.

I actually only figured out that language feature orthogonality was a big deal during attempt #3. It became clearest when I was thinking about how you would introduce features like map-reduce, and taking the product of two variables (note: some analog of both operations is planned in the language that I am designing, but neither analog has been defined yet). Turns out the 'map' part of map-reduce (establish a collection of spaces where each position is associated with a node from the inner variable, and each whole space is associated with a node from the outer variable) conflicts with product (establish a 2-dimensional space where each position is associated with a node from each variable). Indeed, map was arguably in conflict with itself, too: it is basically an asymmetric way of talking about a weaker version of the product, and it can be given in either orientation, obfuscating program equivalence. (I assumed the existence of reduction operators - which I still plan to add - that could talk about relationships between adjacent positions in these spaces; map segments the spaces more than product does, so some adjacencies are inaccessible with only one application.)
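In rough conventional terms (ordinary Python, not my actual primitives), the conflict looks like this:

```python
# Three spellings of "pair up every x with every y".
from itertools import product

xs = ['a', 'b']
ys = [1, 2, 3]

# 'map' orientation 1: outer xs, inner ys -> a collection of spaces,
# one per x.
m1 = [[(x, y) for y in ys] for x in xs]
# 'map' orientation 2: outer ys, inner xs -> differently segmented
# spaces, one per y.
m2 = [[(x, y) for x in xs] for y in ys]
# 'product': one flat 2-dimensional space.
pr = list(product(xs, ys))

# All three denote the same set of pairs, but the two map orientations
# segment the space differently, so a purely structural comparison
# sees three distinct programs.
flat1 = {p for row in m1 for p in row}
flat2 = {p for row in m2 for p in row}
assert flat1 == flat2 == set(pr)
```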

Anyways, after backing off and thinking about it for a bit, I had a better idea of the orthogonality principle that I needed to satisfy, and started designing the language that I've been presenting to you up to now.


Back to the discussion at hand here...

This is where I'm willing to continue to be your guinea pig. I certainly have the capability to understand what you're doing.

I think so too. I just have to figure out where I can connect enough of what's going on in my head with what someone else can understand to actually establish some sort of common language.

Instead, start from the other end: give concrete examples of what you're trying to achieve, and why that's a good thing. Then distill requirements. Then formalize and document the language.

This is the approach I would prefer, and the one that I tried to start with. I was just having trouble actually doing it, so I tried doing something else. This turned out to be a good idea, but for totally different reasons: in my attempt to operate by example, I managed to find the missing piece of the puzzle that I've been trying to solve. That said, I think I will return to the approach that you suggest now.

That's correct, and that's because there are steps in between that need to also be filled-in for that to make sense. Sure, Java, why not. The key is to pick two Java programs whose reason for being equivalent has an explanation that reveals some of the requirements of your language. The selection of programs would be far from random, but picked expressly to illustrate the problems that you now know how to solve. A second step would analyze the various ad hoc explanations to show their underlying structure. And that would be where your language design comes in.

So, we might be able to skip this - or rather, cash in on all the time I've already spent doing it in my head (on an abstract notion of programs that isn't easy to write out, usually focused on integer multiplication, which is an amazingly useful stress test for a lot of this despite being such a simple program). Besides, I'm not convinced that I could do this in a language that you would recognize, not only because I'm pretty sure that no existing language is even remotely analyzable in the way that I want, but also because there are open questions about what the important parts of a program's interface actually are for the purpose of equivalence.

If I have a simple binary-coded-decimal multiplication algorithm implemented using the shift-and-add approach, and an FFT-based multiplication algorithm that operates on binary numbers directly, are they equivalent because they do fundamentally the same thing, or are they different because they technically use different types to represent their inputs and outputs? So far, I've basically been assuming structural equivalence of the relationships. If we were to cast this particular example into my language, the analysis would treat the two algorithms as the same, because the actual number of values per digit is considered a variable for the purpose of the analysis - all n-ary number systems are the same - but that might not be so easy to do in another language.

Moreover, if I were to use something like unary or roman numerals as an interface, then even in my language, my definition of equivalence would consider a multiplication algorithm over those types of numbers to be distinct, which might not be what I want (though I might be asking for something that I literally don't have enough information for, if I want to show that they're the same somehow). The same goes for negabinary and many of the other number systems that have been devised.
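To make the interface question concrete (toy Python stand-ins, far simpler than real BCD shift-and-add or FFT multiplication, but posing the same question):

```python
# Two multipliers over different interfaces.

def mul_digits(a: list, b: list, base: int) -> list:
    """Schoolbook multiply of little-endian digit lists in any base."""
    out = [0] * (len(a) + len(b))
    for i, da in enumerate(a):
        for j, db in enumerate(b):
            out[i + j] += da * db
    for k in range(len(out) - 1):          # carry propagation
        out[k + 1] += out[k] // base
        out[k] %= base
    return out

def mul_unary(a: list, b: list) -> list:
    """Multiply over unary numerals: the number n is a list of n marks."""
    return [1] * (len(a) * len(b))

# Both compute multiplication. 6 * 7 = 42 in base 10 ...
assert mul_digits([6], [7], 10) == [2, 4]   # little-endian 42
# ... and in unary:
assert len(mul_unary([1] * 6, [1] * 7)) == 42
```

Under the analysis I described, mul_digits is the same program at every base, since the number of values per digit is just a variable of the relationships, while mul_unary has a genuinely different relational structure at its interface.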

Anyways, after realizing that I have something I've never had before - namely, knowledge that the set of constraints I've gathered so far is actually a complete set - I've been thinking about what I could do to talk about the constraints more directly. I've already made some interesting observations. The most interesting one is that the core constraints can actually be reduced to two formal constraints on the design of a language. There is a third constraint that I've identified as well, which is likely needed to actually do things in practice. It looks like what I am calling representational transparency is the property of interest for a language as presented to a human, but a stronger property is emerging that is both achievable and likely necessary to perform the actual analysis. The key observation there is that programs in languages with the weaker property can be efficiently transformed into equivalent programs in languages with the stronger property.

I have some pretty clear ideas of what the constraints are and how to define them formally, but to really communicate clearly, I need to take a moment and define the term 'programming language'. The definition that is usually taught in the classroom (a set of strings with an attached formal semantics defined separately) does not work here. It fails to capture the fact that, from an analysis point of view, the following language is essentially indistinguishable from an ordinary imperative language: take the imperative language, allow the programmer to assign names to the program states that we would usually think of as occurring 'before' and 'after' a given statement, and then allow the statements to be listed in any order.
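Concretely, the two presentations I have in mind look something like this (a hypothetical mini-syntax in Python data, just to pin the claim down):

```python
# Presentation A: statements in program order; the states between
# them are left implicit.
prog_a = ["x = 1", "y = x + 1", "z = y * 2"]

# Presentation B: the same statements in arbitrary order, with the
# before/after states named explicitly.
prog_b = {
    ("s1", "y = x + 1", "s2"),
    ("s2", "z = y * 2", "s3"),
    ("s0", "x = 1",     "s1"),
}

# Recovering presentation A from B is just chaining the state names,
# so the set-of-strings definition sees two different languages where
# the analysis sees exactly one.
def chain(transitions, start="s0"):
    by_pre = {pre: (stmt, post) for (pre, stmt, post) in transitions}
    out, state = [], start
    while state in by_pre:
        stmt, state = by_pre[state]
        out.append(stmt)
    return out

assert chain(prog_b) == prog_a
```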

My plan right now is to try to get an initial sketch of both the definition of a programming language that is actually useful here (I have a pretty good idea of what I need, but not so good an idea of how to formalize it properly), and the core constraints that I've identified, in terms of this definition. The initial descriptions of these ideas as they'll appear in the wiki are probably gonna be pretty sketchy, but this should still give us some concrete common ground to start with that isn't likely to depend very much on tacit knowledge. With any luck, I'll get this done within a couple of days.

Zistack commented 3 years ago

I have added new pages:

I have not added the third constraint yet, as I am much less certain about how to define it. Fortunately, it is much less important than the two core constraints. I think we can have a useful discussion even without the third constraint properly defined.

JacquesCarette commented 3 years ago

Those pages are at such a high level, I'm not really even sure what you're talking about. Structural Reasoning in particular is desperately in need of concrete examples.