diprism / perpl

The PERPL Compiler
MIT License

Syntax for source language #2

davidweichiang closed this issue 3 years ago

davidweichiang commented 3 years ago

Most of the syntax is in the paper draft and is fairly standard, but

davidweichiang commented 3 years ago

For distributions, maybe something like

distribution ::= categorical type | uniform type | amb type

where categorical means that the probability of each outcome will be learned, uniform means that every outcome is equally likely, and amb means that every outcome gets weight 1 (so the weights need not sum to 1).

And maybe we need something like

distribution ::= { Heads: 0.5, Tails: 0.5 }
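To make the proposal concrete, here is a minimal Python sketch of the weights each form would assign over a finite type. The function names `uniform`, `amb`, and `explicit` and the `COIN` list are illustrative assumptions, not part of the compiler; `categorical` is left out because its weights would be learned rather than computed.

```python
from fractions import Fraction

def uniform(outcomes):
    """Every outcome gets probability 1/n."""
    n = len(outcomes)
    return {o: Fraction(1, n) for o in outcomes}

def amb(outcomes):
    """Every outcome gets weight 1, so the weights sum to n rather than 1."""
    return {o: Fraction(1) for o in outcomes}

def explicit(table):
    """An explicit table such as { Heads: 0.5, Tails: 0.5 }."""
    # Going through str() keeps decimal literals like 0.1 exact.
    return {o: Fraction(str(w)) for o, w in table.items()}

# Hypothetical two-outcome type used only for illustration.
COIN = ["Heads", "Tails"]

coin_uniform = uniform(COIN)
coin_amb = amb(COIN)
coin_explicit = explicit({"Heads": 0.5, "Tails": 0.5})
```

Under these assumptions the explicit table above coincides with `uniform` on the same type, while `amb` yields an unnormalized measure.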

davidweichiang commented 3 years ago

It seems like we also need the ability to define a distribution globally, like

dist coin = uniform bool

so that later on we can say

sample coin.
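One way to picture this is a global environment that maps distribution names to weight tables. The Python sketch below is only an illustration: `dists`, `declare`, and `sample` are hypothetical names, and `sample` simply returns the table of weighted outcomes instead of drawing a random value.

```python
from fractions import Fraction

def uniform(outcomes):
    """Every outcome gets probability 1/n."""
    return {o: Fraction(1, len(outcomes)) for o in outcomes}

# Global environment, filled in by declarations like `dist coin = uniform bool`.
dists = {}

def declare(name, table):
    """Register a named distribution; redefinition is treated as an error here."""
    if name in dists:
        raise ValueError(f"distribution {name!r} is already defined")
    dists[name] = table

def sample(name):
    """Look up the weighted outcomes that `sample name` ranges over."""
    if name not in dists:
        raise KeyError(f"unknown distribution {name!r}")
    return dists[name]

# dist coin = uniform bool
declare("coin", uniform([False, True]))

# sample coin
coin_outcomes = sample("coin")
```

Whether redefinition should be an error or a shadowing rule is a design choice this sketch does not settle.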

davidweichiang commented 3 years ago

Or, should we just add a real type and write distributions as functions returning real? @ccshan your guidance would be helpful here.

ccshan commented 3 years ago

In general I'd recommend following the paper, but the main concern I see is how to translate to an FGG whose nodes each take value in a finite domain and whose edges are each labeled with one of a finite number of factors. Maybe it's best to relax the latter constraint on the FGG format: should each edge be labeled with an expression that maps endpoint values to density real? Then, if the source language has a real type and arithmetic operations on it (but no uncountable distributions), would all the real-typed nodes simplify away before the FGG is produced?

davidweichiang commented 3 years ago

Since a factor is defined as a function in the FGG paper, I thought a finite number of factors would be equivalent to a finite number of expressions?
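As a rough illustration of that equivalence, an expression over finite-domain endpoints can be tabulated into an explicit factor by evaluating it on every combination of values. The helper `tabulate` and the example expression below are assumptions made for this sketch, not the FGG format itself.

```python
from itertools import product

def tabulate(expr, domains):
    """Evaluate expr on every combination of endpoint values."""
    return {values: expr(*values) for values in product(*domains)}

BOOL = (False, True)

# Hypothetical factor: weight 1.0 when both endpoints agree, 0.0 otherwise.
equal = tabulate(lambda x, y: 1.0 if x == y else 0.0, [BOOL, BOOL])
```

With two boolean endpoints the table has four entries, so the expression and the tabulated factor carry the same information as long as every domain is finite.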

Anyway, we can add reals to the language. Since for the present we are only considering finite domains for random variables, how do we state the constraint in the source language to ensure that the FGG does not end up with a real-valued variable?

In reply to Chung-chieh Shan's comment of Apr 8, 2021, 22:53 (above).

ccshan commented 3 years ago

Actually I just realized that just translating "functions returning real" results in real-valued variables in the FGG. So instead of adding reals to the language, how about specifying each distribution in the source language as an expression (in terms of the inputs to the factor) that is barely parsed by the compiler?

davidweichiang commented 3 years ago

Ok, at least for now, then, we can just implement amb, uniform, fail.

In reply to Chung-chieh Shan's comment of Apr 10, 2021, 06:44 (above).