Zistack / program-modeling-language

A project with the goal of designing and implementing a program modelling language that allows for the efficient evaluation of arbitrary static assertions about the program's behavior.

Bosque language #6

Open · JacquesCarette opened this issue 3 years ago

JacquesCarette commented 3 years ago

Bosque is something I saw just today. Some (and, of course, just some) of its ideas seem to be along similar lines. There might be something to learn from it.

I definitely appreciated how the language was presented. It made sense to me.

Zistack commented 3 years ago

While their goals are definitely related, that's about where the relationship ends. The situation here is essentially the same as with the paper you linked to here.

Lambdas just don't work. Anything based on them is ultimately pursuing a completely different foundation from mine.

If we want to talk about presentation: they rely heavily on the fact that the reader has probably used other programming languages before, since the constructs in Bosque are similar. I can't do this, of course, because the structures in my language don't resemble anything in any existing language I know of.

JacquesCarette commented 3 years ago

The reason to point out related work is not to say that you're not doing something novel (you clearly are), but to enable you to get some insights at a lower cost than having to find them yourself. I'd be surprised if there was nothing at all for you to learn from here, even if you indeed are doing something different.

> ... presentation ... I can't do this, of course, because the structures in my language don't resemble anything in any existing language I know of.

You still have to find a way to illustrate your ideas in ways that a reader will understand. This is still the principal stumbling block.

[I'll also respond to the other items, over time.]

Zistack commented 3 years ago

> The reason to point out related work is not to say that you're not doing something novel (you clearly are), but to enable you to get some insights at a lower cost than having to find them yourself. I'd be surprised if there was nothing at all for you to learn from here, even if you indeed are doing something different.

That's all well and good, but let's not forget my 7 years of experience here. I've been paying attention to how various projects over the years have handled different aspects of this overall problem. Fundamentally, Bosque just combines restrictions on loops and recursion (which we also see in all strongly normalizing languages, as well as in several imperative ones, typically targeted at GPUs or other embedded hardware) with the sort of solver integration we see in something like Liquid Haskell.

In other words, even without seeing the project, I'm already familiar with what the consequences of those particular design decisions will be, at least approximately. There isn't really anything to learn here that I don't already know.

Additionally, for a project as thoroughly detached from convention as this one, there really are very few ideas (except for what not to do) that can be transferred from existing projects. I'm operating on such a different foundation that most existing ideas about how to do things fail to satisfy the underlying constraints on a pretty fundamental level. In practice, after translating what they're doing into my perspective, I end up having to discard pretty much every idea they've tried to compose, because each part violates some constraint or other. Maybe it's the reliance on match expressions or lambdas. Maybe it's the node-focused nature of the analysis, which can't be generalized without running into the graph grammar equivalence problem. It doesn't really matter.

I think the most interesting stuff I have seen with relevant insights is work like vectorization and control flow optimization, typically as applied to GPUs. That work is fundamentally trying to manipulate the structure of the program, which is a little different from what most people set out to do when attempting to build analyses (for some reason).

I appreciate that you are trying to help me here, but when you show me languages that are just remixes of stuff that has already been done, it's not even interesting - it's just noise.

> You still have to find a way to illustrate your ideas in ways that a reader will understand. This is still the principal stumbling block.

My point was that saying

> I definitely appreciated how the language was presented. It made sense to me.

when referring to documentation like this just tells me that you are familiar with existing programming languages, and can fill in gaps with standard assumptions - which is fine, but not helpful. I already know that I have to find a way of getting the reader to understand what I am saying. Referring to a documentation strategy which I cannot employ as an example of good documentation doesn't help me figure out how to do this at all.

JacquesCarette commented 3 years ago

Having meaningful conversations requires first understanding each other's context. I'm gradually learning more of yours (and hopefully you are learning some of mine). I'm persevering, even though the going is clearly rocky. It would likely be much easier in person, as the misunderstandings could be cleared up much faster, and before they blow up [such as here].

Yes, you can take that as yet another apology.

Is there no way you can create some toy example that would illustrate what you really want to do? What the problem is? Your solution ideas are so 'foreign' (at least in their language - I think the underlying ideas are not) that having analogical presentations of the problem could be an effective communication method.

For example, the Principles of Program Analysis book ends up using sets-of-sets-of-nodes (for tracking program points in various analyses), but manages to not make them so scary. Eventually, the various Galois connections are relations between these. But they don't go deeper.

Zistack commented 3 years ago

> Having meaningful conversations requires first understanding each other's context. I'm gradually learning more of yours (and hopefully you are learning some of mine). I'm persevering, even though the going is clearly rocky. It would likely be much easier in person, as the misunderstandings could be cleared up much faster, and before they blow up [such as here].

> Yes, you can take that as yet another apology.

This one is probably on me as well. It is clear that we're having a lot of trouble establishing a shared context. I created a new issue (#7) that aims to address this. Evidently some sort of mapping needs to be established between convention and my foundation, or else this sort of interaction is doomed to repeat itself. Without understanding the perspective that I am adopting, it's not surprising that you won't be able to recognize when a paper with related goals fails to present any useful information. You essentially lack the tools to make that evaluation.

> Is there no way you can create some toy example that would illustrate what you really want to do? What the problem is? Your solution ideas are so 'foreign' (at least in their language - I think the underlying ideas are not) that having analogical presentations of the problem could be an effective communication method.

This is a slightly different task from the one I pose in #7. As far as examples go, not really. The problem is that existing languages don't really support canonization at all, so I don't know what language I would present my toy example in. I could instead focus on the part where I want to be able to evaluate arbitrary static assertions (which always boil down to a single SE or 'specialization' query), but there are caveats, because representational transparency doesn't speak in terms of absolute upper bounds on query cost. I could probably document those caveats, but does knowing that I want to support the evaluation of arbitrary static assertions really help you understand what I am doing here?

As for interpreting 'what I really want to do' as 'design a language which is analyzable', it isn't obvious how that would help either. For example, I could point at regular expressions as a very analyzable model of computation, but there aren't really any lessons we can transfer from regular expressions into this domain. I can't think of any approach to doing this where the example would actually provide any insight.

As far as showing what the problem is - well, I could talk about individual problems, like how match expressions cause the compiler to have to solve SAT in order to answer questions, or how the distributive property is a bad thing. This might help people develop a practical understanding of why existing solutions don't work, but I don't know that it helps them understand the approach that I am taking instead.
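For concreteness, here is a toy sketch of the match-expression point (my own illustration, not from the project; the guard formula is arbitrary and requires Python 3.10+): deciding whether the first arm is ever taken, or whether the wildcard arm is dead code, amounts to deciding satisfiability of the guard's CNF formula.

```python
# Toy illustration: match-arm reachability as SAT.
# The guard encodes a CNF formula over the scrutinee. The first arm is
# reachable iff the formula is satisfiable; the wildcard arm is dead code
# iff the formula is a tautology.
def classify(a: bool, b: bool, c: bool) -> int:
    match (a, b, c):
        case (x, y, z) if (x or y) and (not x or z) and (not y or not z):
            return 1
        case _:
            return 0
```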

> For example, the Principles of Program Analysis book ends up using sets-of-sets-of-nodes (for tracking program points in various analyses), but manages to not make them so scary. Eventually, the various Galois connections are relations between these. But they don't go deeper.

This is the kind of comment that would cause me to read a whole book in a day. It's a shame that I don't have access to a copy right now.

That said, what about my presentation is scary? That's not the sort of word I would typically associate with mathematical definitions.

JacquesCarette commented 3 years ago

> Evidently some sort of mapping needs to be established between convention and my foundation, or else this sort of interaction is doomed to repeat itself.

Agreed!

> The problem is that existing languages don't really support canonization at all, so I don't know what language I would present my toy example in.

If I understand 'canonization' well, then it is quite similar to 'normal forms'. For example, univariate polynomials have a normal form. Free Monoids do too -- but constructive, Commutative Free Monoids don't unless you make decidability assumptions.
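For instance, a minimal sketch of the univariate polynomial case (my own illustration, using a dense coefficient-list representation): normalizing strips trailing zero coefficients, so equal polynomials end up with the identical representation.

```python
# Sketch: univariate polynomials as coefficient lists [c0, c1, c2, ...],
# normalized by stripping trailing zero coefficients.
def normalize(coeffs):
    while coeffs and coeffs[-1] == 0:
        coeffs = coeffs[:-1]
    return coeffs

def add(p, q):
    n = max(len(p), len(q))
    return normalize([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                      for i in range(n)])

# (1 + x) + (2 - x) and the constant polynomial 3 normalize identically.
assert add([1, 1], [2, -1]) == normalize([3]) == [3]
```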

For straight-line programs, I believe SSA is a normal form, once you fix an ordering on variables.
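To make that concrete, here is a minimal sketch of SSA renaming for a straight-line program (my own toy instruction format and naming scheme, nothing standard): every definition gets a fresh versioned name, and every use is rewritten to the latest version.

```python
def to_ssa(program):
    """program: list of (target, expression-as-token-list) assignments.
    Returns the same straight-line program with each definition given a
    unique versioned name and each use rewritten to the latest version."""
    version = {}   # variable -> current version number
    current = {}   # variable -> current SSA name
    out = []
    for target, expr in program:
        # Rewrite uses to the latest SSA names (non-variable tokens pass through).
        new_expr = [current.get(tok, tok) for tok in expr]
        # Give this definition a fresh versioned name.
        version[target] = version.get(target, -1) + 1
        ssa_name = f"{target}{version[target]}"
        current[target] = ssa_name
        out.append((ssa_name, new_expr))
    return out

# Example: x = a + b; x = x * 2; y = x + a
prog = [("x", ["a", "+", "b"]), ("x", ["x", "*", "2"]), ("y", ["x", "+", "a"])]
print(to_ssa(prog))
# [('x0', ['a', '+', 'b']), ('x1', ['x0', '*', '2']), ('y0', ['x1', '+', 'a'])]
```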

> As far as showing what the problem is - well, I could talk about individual problems, like how match expressions cause the compiler to have to solve SAT in order to answer questions, or how the distributive property is a bad thing. This might help people develop a practical understanding of why existing solutions don't work, but I don't know that it helps them understand the approach that I am taking instead.

Here I'll heartily disagree. The number of times I've asked "Please tell me what problem you're solving" and it turned out that it had never been made precise is huge. And, when it was made precise, the explanation (and adequacy!!) of the solution became 1000% clearer.

This is a big disease in CS education: being so solution-driven that there is no pause to even ask if the solution actually fits the problem. [I'm not suggesting that this is the case here.]

Oh, and here you go, an online PDF for that amazing book.

The scary parts are the two points I made at the start.

Zistack commented 3 years ago

> If I understand 'canonization' well, then it is quite similar to 'normal forms'.

My understanding is that these terms ('canonical' and 'normal') are largely synonymous, except that people sometimes use the term 'normal form' even when there are multiple 'normal forms' that are equivalent but distinct. I don't recall having seen the term 'canonical form' used without some implication that equivalence implied identity, which is why I've been using that term.

For clarity, what I'm saying is that when a program is in canonical form, any other equivalent program will have the identical canonical form.

> For straight-line programs, I believe SSA is a normal form, once you fix an ordering on variables.

Right, but SSA would fail my definition of a 'canonical form' because there are many different SSA expressions that I would consider to be equivalent programs.
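To illustrate (a contrived toy example, names arbitrary): both functions below are straight-line and already in SSA form, and they compute the same value, yet they remain syntactically distinct because the temporaries are named differently and the independent definitions are ordered differently. SSA is a normal form here, but not a canonical form in the sense above.

```python
# Two straight-line programs, both already in SSA form, computing the same
# value. They differ only in temporary names and in the order of independent
# definitions, so SSA alone does not collapse them to one representative.
def p1(x):
    t0 = x + 1
    t1 = x * 2
    return t0 + t1

def p2(x):
    s0 = x * 2
    s1 = x + 1
    return s1 + s0
```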

> Here I'll heartily disagree. The number of times I've asked "Please tell me what problem you're solving" and it turned out that it had never been made precise is huge. And, when it was made precise, the explanation (and adequacy!!) of the solution became 1000% clearer.

I am familiar with this problem. I suppose making a page about the general motivations of the project would not be hard, if you really think that it would help.

> Oh, and here you go, an online PDF for that amazing book.

Thanks a lot. I'll be reading that soon.

JacquesCarette commented 3 years ago

I have worked in two different communities that use 'canonical' and 'normal' differently, which is why I asked. In any case, we have now achieved shared vocabulary on 'canonical'. And it is a standard usage in PL, so that's good too.

Agreed on SSA being sub-optimal for your purposes.

FYI (and it's related to explaining problems and illustrating things well), the paper Algebras for Weighted Search is superb. It has nothing to do with the problem you're tackling, really, but it's such a good read that I can't help but pass it on. Even though they eventually get extremely general, they constantly illustrate everything with very concrete code and examples. They even use two programming languages, because they (correctly) feel that the simpler one is not adequate to really show what they are interested in. Nevertheless, that's where they start, to reach as wide an audience as possible.