axch closed this issue 8 years ago
I think I proposed backticks in the past and @riastradh-probcomp expressed distaste for non-nestable delimiters.
If we want to borrow from other languages, apparently Ruby has this strange notation: https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#The_.25_Notation
which might suggest `%q{expr}` -> `quote(expr)` and `%qq{expr}` -> `quasiquote(expr)`. I feel like this basically amounts to defining shorter names for quote and quasiquote though.
Another idea, though I'm not sure how well it would play with the VentureScript grammar: just use single quote and backtick as they are used in Scheme, and have each swallow the smallest complete expression that follows, using parentheses when necessary to force a particular interpretation.
A third idea: `'[...]` / `` `[...] `` or `'{...}` / `` `{...} `` (bleh, getting that to render right in GitHub was perhaps an object lesson against non-nestable delimiters)
I believe Haskell, the other syntax-challenged language I know that has this feature for structures rather than just strings, uses `[|...|]` for quasiquote and doesn't have non-quasi quote. Standard unquote is either `$(expr)` or `$identifier`. (Apparently they also have `[name-of-parser-function|...|]` for introducing what in Scheme would be thought of as reader extensions.)
We could also define `'`, `,`, and `` ` `` as prefix operators with some particular precedence other than "tightest". In particular, it might make sense to have them bind less tightly than function application, and possibly also than arithmetic. If they bind loosely, though, the way to force scope becomes to put parens around the quote on the outside, as `(..... (, ..... ) .... )`, which would look quite odd to a Schemer.
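To make the precedence trade-off concrete, here is a minimal sketch of a precedence-climbing parser for a toy grammar (identifiers, numbers, `+`, `*`, parentheses, and a prefix `'`). Everything in it is an assumption for illustration: the grammar, the binding powers, and the `parse` function are hypothetical, function application is left out, and none of it is VentureScript's actual parser.

```python
import re

# Toy tokenizer: identifiers, integers, and the punctuation + * ( ) '
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|[+*()'])")
INFIX_POWER = {"+": 10, "*": 20}  # hypothetical binding powers


def tokenize(src):
    src = src.strip()
    tokens, pos = [], 0
    while pos < len(src):
        match = TOKEN.match(src, pos)
        if not match:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(src, quote_power):
    """Parse src; quote_power says how tightly the prefix ' binds."""
    tokens = tokenize(src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr(min_power):
        tok = advance()
        if tok == "'":
            # The operand extends as far as quote_power allows.
            left = ["quote", expr(quote_power)]
        elif tok == "(":
            left = expr(0)
            if advance() != ")":
                raise SyntaxError("expected ')'")
        else:
            left = tok
        while peek() in INFIX_POWER and INFIX_POWER[peek()] > min_power:
            op = advance()
            left = [op, left, expr(INFIX_POWER[op])]
        return left

    tree = expr(0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree


loose = parse("'a + b", quote_power=0)     # quote swallows the whole sum
tight = parse("'a + b", quote_power=30)    # quote grabs only `a`
forced = parse("('a) + b", quote_power=0)  # parens outside the quote limit its scope
```

With a loose quote, `'a + b` parses as a quote of the whole sum, and the only way to quote just `a` is to wrap the quote itself in parentheses, which is the oddity described above.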
Also, using comma for unquote would be somewhat non-traditional in a syntax that's supposed to be Javascript-like.
I like the Haskell solution, and I also think it's important long term for us to embrace reader extensions now (and demo them, which we should talk about).
I think it is optional for the PPAML PI meeting deliverable but high priority otherwise.
For the sake of engineering sanity, I hereby declare that user-authored reader extensions can wait until #80 is done.
By the way, E has expression quasiquotation in an infix, Algol-style syntax:
This is very important for publishable VentureScript code. The `unquote` keyword in the code is distracting.
I like the solution that bears resemblance to bash's `$()`. Julia also uses this notation for string interpolation.
This is relevant to the NIPS May 20 deadline.
Actually, upon second thought, it is difficult to expunge the `quasiquote` from user code in my framework. I think a solution for both `quasiquote` and `unquote` is needed.
Personally, I like the following solution too:
- `[|..|]` for quasiquote
- `$(..)` for unquote

There is a related problem of programmatically forming symbols for use in the modeling environment. For example, in my current framework the user specifies a model program using a block of assumes, and a definition for expressions which will be observed:
    // USER CODE (make_model bundles together these into a dict)
    define x_coords = array(-2, -1, 0, 1, 2);
    define model_program = make_model(
      // assumes
      do(
        assume(a, normal(0, 2)),
        assume(b, normal(0, 2)),
        assume(line, proc(x) { a + b * x })),
      // observed expressions
      proc(t) {
        x = lookup(x_coords, t);
        quasiquote(normal(line(unquote(x)), 1))
      },
      // number of observations
      size(x_coords)
    );
Behind the scenes the observed expressions get `assume`d when necessary:
    // NON-USER CODE
    define do_assume_observations = proc(model) {
      obs_expressions = lookup(model, "observed_expressions");
      num_observes = lookup(model, "num_observes");
      obs_symbols = proc(t) { make_symbol("obs", t) };
      mapM(
        proc(t) {
          assume(
            unquote(obs_symbols(t)),
            unquote(obs_expressions(t)))
        },
        arange(num_observes))
    };
Currently I do this using a foreign inference SP `make_symbol` with the following type signature: argument types `[t.SymbolType(), t.NumberType()]`, return type `t.SymbolType()`.
An interesting feature that would simplify programmatically generating such symbols would be to permit:

    assume(obs_$t, ..)

Any token in a model expression that includes `$(..)` would be interpreted as building up a symbol using evaluations in the inference environment.
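As a rough illustration of what such token interpolation could do, here is a minimal Python sketch. The function name `interpolate_symbol`, the dict standing in for the inference environment, and the `obs_0`-style naming are all assumptions, and only the bare `$name` form is handled, not a general `$(..)` expression.

```python
import re


def interpolate_symbol(token, env):
    """Replace each $name in token with the string form of env[name].

    env is a plain dict standing in for the inference environment
    (an assumption of this sketch, not Venture's real data structure).
    """
    def substitute(match):
        return str(env[match.group(1)])

    return re.sub(r"\$([A-Za-z_]\w*)", substitute, token)


# One symbol per observation index, as the mapM over arange(...) would need.
symbols = [interpolate_symbol("obs_$t", {"t": t}) for t in range(3)]
```

Here `symbols` ends up as the three names `obs_0`, `obs_1`, `obs_2`. A real implementation would have to decide how non-integer values are rendered and what happens when a name is unbound.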
With proposed syntactic sugar for quasiquote and unquote - a huge improvement
    // USER CODE
    ...
    // observation model
    proc(t) {
      x = lookup(x_coords, t);
      [| normal(line($x), 1) |]
    },
    ...
    // NON-USER CODE
    assume(unquote(expressions(t))
With the ability to build up modeling environment symbols using the unquote syntax - the user no longer needs to use quasiquote at all, because they've bypassed the need to pass modeling expressions around:
    // USER CODE
    ...
    // observation model (user code)
    proc(t) {
      x = lookup(x_coords, t);
      assume(obs_$t, normal(line($x), 1))
    },
This may be a special case, but seems related to this ticket. A similar feature for building observation labels programmatically using `$` would be useful. However, I'm not sure where the boundaries between `quasiquote`/`unquote` and string interpolation lie.
What syntactic sugar, if any, do we want to add to VentureScript for non-quasi quote? We are rapidly running out of nestable delimiters: round brackets are for function application, curlies are for code blocks, square brackets are conventional array literal syntax, and we are already proposing Oxford brackets (`[| |]`) for quasiquotation.
Why does VentureScript even need non-quasi quote? So that quasiquoted expressions can effectively emit list and array literals:
[| lookup(quote(unquote(map(f, lst))), 1) |] // expands to lookup(quote(1(2, 3)), 1)
The quote is necessary there because an unquoted list would be interpreted as expression structure and evaluated: `lookup(1(2, 3), 1)`, which is no good.
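The difference can be seen in a small Python model of quasiquote expansion. This is a sketch under stated assumptions: expressions are nested Python lists whose first element is the operator, `unquote` takes a variable name looked up in a dict rather than an arbitrary expression such as `map(f, lst)`, and the helper names `quasiquote` and `render` are made up for this example.

```python
def quasiquote(template, env):
    """Copy template, replacing each ["unquote", name] node with env[name]."""
    if isinstance(template, list):
        if len(template) == 2 and template[0] == "unquote":
            return env[template[1]]
        return [quasiquote(part, env) for part in template]
    return template


def render(expr):
    """Show a nested list in call syntax: the head applied to the rest."""
    if isinstance(expr, list):
        head, *args = expr
        return f"{render(head)}({', '.join(render(a) for a in args)})"
    return str(expr)


env = {"data": [1, 2, 3]}  # stands in for the value of map(f, lst)

# Without quote, the spliced list becomes expression structure.
bare = quasiquote(["lookup", ["unquote", "data"], 1], env)

# With quote around the unquote, the list survives as a data literal.
quoted = quasiquote(["lookup", ["quote", ["unquote", "data"]], 1], env)

bare_text = render(bare)      # the list reads as a call of 1
quoted_text = render(quoted)  # the list is protected by quote
```

In this model `bare_text` is `lookup(1(2, 3), 1)` and `quoted_text` is `lookup(quote(1(2, 3)), 1)`, matching the two expansions discussed above.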
In explicitly parenthesized languages, this is not a serious problem: since all expressions are guaranteed to nest anyway, the single tokens backtick, quote, comma solve this problem. Since VentureScript is implicitly parenthesized, though, we need nestable delimiters for all these things.
None of the infix languages I have available as models have syntactic sugar for non-quasi quotation, because their abstract syntax trees are not lists, so they do not expect users to have this problem very much.
I don't want to use nested quasiquotation for this, because the standard semantics for quasiquote nested inside quasiquote is to require one more level of unquotation to actually evaluate, so that quasiquote may be used to programmatically construct quasiquotations.
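For reference, the level-counting behaviour of nested quasiquotation can be sketched as follows. This models the Scheme-style convention on nested Python lists; the function name `expand`, the dict environment, and the restriction that a level-1 `unquote` names a variable are assumptions of the sketch rather than anything Venture defines.

```python
def expand(template, env, depth=1):
    """Expand the body of a quasiquote at the given nesting depth.

    An unquote is evaluated only at depth 1. A nested quasiquote raises
    the depth, and an unquote at depth > 1 lowers it and stays in place.
    """
    if not isinstance(template, list) or not template:
        return template
    head = template[0]
    if head == "unquote" and len(template) == 2:
        if depth == 1:
            return env[template[1]]  # evaluate: look the name up
        return ["unquote", expand(template[1], env, depth - 1)]
    if head == "quasiquote" and len(template) == 2:
        return ["quasiquote", expand(template[1], env, depth + 1)]
    return [expand(part, env, depth) for part in template]


env = {"x": 5}

# One level: the unquote is evaluated.
single = expand(["f", ["unquote", "x"]], env)

# Nested quasiquote: the inner unquote belongs to the inner level and is kept.
nested = expand(["f", ["quasiquote", ["g", ["unquote", "x"]]]], env)

# Two unquotes reach back out to the outer level and are evaluated there.
double = expand(["quasiquote", ["g", ["unquote", ["unquote", "x"]]]], env)
```

The `nested` case is why a nested quasiquote cannot stand in for plain quote: the inner `unquote(x)` is left untouched, so getting a value in requires the doubled unquote shown in `double`.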
Proposal: Do not add syntactic sugar for `quote` as such. In cases that do not involve nested unquotation, one can use quasiquote instead. In cases that do, cover common scenarios by adding syntactic sugar for the `quote(unquote(...))` construct. Concretely:
- `[| |]` for quasiquote.
- `${ }` for `quote(unquote(...))`.
- `$E{ }` for non-quoted `unquote(...)`, and, for symmetry, also accept dollar literal interpolation `$L{ }` for `quote(unquote(...))`.
- The keywords `quasiquote`, `quote`, and `unquote` remain available (they are also usable as they are where syntactic sugar is deemed undesirable).

This would produce, for the motivating case,
[| lookup(${map(f, lst)}, 2) |] // lookup(quote(1(2, 3)), 2)
Alternatives considered:
- Dedicated sugar for `quote` (L for Literal):

      [| lookup([L| ${map(f, lst)} |], 2) |] // lookup(quote(1(2, 3)), 2)

- Plain `${ }` for `unquote`, requiring literal dollar interpolation for `quote(unquote(...))`:

      [| lookup($L{map(f, lst)}, 2) |] // lookup(quote(1(2, 3)), 2)
The current proposal is favored because it optimizes for the case expected to be common for relatively novice Venture users, namely interpolating computed data literals (which should be quoted) into programmatically constructed expressions (which are to be evaluated in the model).
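To check that the proposed sugar maps onto the existing keywords as intended, here is a minimal string-level desugaring sketch in Python. It is hypothetical throughout: the `desugar` function and its rewrite table are made up for this example, it works on raw text rather than on VentureScript's real grammar, and it assumes `|]` and `}` never appear inside string literals.

```python
# (opener, closer, replacement prefix, replacement suffix)
OPENERS = [
    ("[|", "|]", "quasiquote(", ")"),
    ("$E{", "}", "unquote(", ")"),
    ("$L{", "}", "quote(unquote(", "))"),
    ("${", "}", "quote(unquote(", "))"),
    ("{", "}", "{", "}"),  # ordinary code block: copied through unchanged
]


def _scan(src, pos, closer):
    """Rewrite src from pos until closer; return (text, position after closer)."""
    out = []
    while pos < len(src):
        if closer is not None and src.startswith(closer, pos):
            return "".join(out), pos + len(closer)
        for opener, close, prefix, suffix in OPENERS:
            if src.startswith(opener, pos):
                inner, pos = _scan(src, pos + len(opener), close)
                # Trim padding inside sugar, but leave plain blocks alone.
                body = inner if prefix == opener else inner.strip()
                out.append(prefix + body + suffix)
                break
        else:
            out.append(src[pos])
            pos += 1
    if closer is not None:
        raise SyntaxError(f"missing {closer!r}")
    return "".join(out), pos


def desugar(src):
    """Expand [| |], ${ }, $E{ } and $L{ } into keyword form."""
    text, _ = _scan(src, 0, None)
    return text


motivating = desugar("[| lookup(${map(f, lst)}, 2) |]")
evaluated = desugar("[| normal(line($E{x}), 1) |]")
```

Under these rules the motivating case becomes `quasiquote(lookup(quote(unquote(map(f, lst))), 2))`, and the `$E{x}` form reproduces the hand-written `quasiquote(normal(line(unquote(x)), 1))` from the earlier user code.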
Is there a convention for quote and quasiquote?