Allow type descriptor to determine construction of value from RawTemplate

jclark commented 2 years ago

It would be convenient to have a mechanism that would allow a type descriptor T to determine how a value of type T (or T|error) is constructed from some subtype of raw template.

One approach is to associate a function with the type descriptor. For example, in

T v = check `xyzzy`;

the function associated via this mechanism with the type descriptor T would be called to convert from the RawTemplate produced by parsing xyzzy into a value of type T.

If we can do this for user-defined types, we should also be able to do it for built-in types, e.g.

xml f = `<doc>hello</doc>`;

and as part of the solution to #1098, e.g.

data:Timestamp t = `2022-06-26T16:07`;

In these cases, we would expect errors from parsing the raw template to be caught at compile time.

A variation on this approach is to associate the type descriptor with a class that is a subtype of object:RawTemplate: a method on that class would then construct the value.
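To make the first approach concrete, here is a minimal Python model of it (not Ballerina, and not a proposed API). It assumes a registry keyed by type-descriptor name; the names RawTemplate, CONSTRUCTORS, associate, construct and the Celsius type are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class RawTemplate:
    """Stand-in for object:RawTemplate: literal segments plus interpolated values."""
    strings: Tuple[str, ...]
    insertions: Tuple[Any, ...] = ()


# Hypothetical registry: type-descriptor name -> function that builds a value.
CONSTRUCTORS: Dict[str, Callable[[RawTemplate], Any]] = {}


def associate(type_name: str):
    """Associate a construction function with a type descriptor (assumed mechanism)."""
    def register(fn):
        CONSTRUCTORS[type_name] = fn
        return fn
    return register


@associate("Celsius")
def celsius_from_template(t: RawTemplate) -> float:
    # This toy converter ignores insertions and reads only the literal text.
    text = "".join(t.strings)
    if not text.endswith("C"):
        raise ValueError(f"not a Celsius literal: {text!r}")
    return float(text[:-1])


def construct(expected_type: str, template: RawTemplate) -> Any:
    """Model of `T v = check <template>;`: dispatch on the contextually expected type."""
    return CONSTRUCTORS[expected_type](template)
```

In this model a failed conversion raises ValueError, which plays the role of the error branch in T|error.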

jclark commented 2 years ago

Here's an outline of a design.

When a module m defines some type T, it can also define an associated raw template class using the following syntax:

readonly class T`` {
  *object:RawTemplate;
  function parse() returns T|error {
    // ...
  }
}

(XXX Why isn't object:RawTemplate readonly?)

This raw template class cannot have an init method. To avoid ambiguity, the intersection of m:T and object:RawTemplate must be empty.

When an expression

m:T`xyzzy`

is evaluated and m:T has an associated raw template class, then an instance of the raw template class is initialized from the result of doing the level 1 parse of xyzzy. Then the parse method on this instance is called.

A statement:

m:T v = `xyzzy`;

is treated as meaning:

m:T v = m:T`xyzzy`;

In other words, we use the contextually expected type to default the type name in a raw-template-expr.
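The evaluation steps in this outline can be modelled in a few lines of Python. This is a sketch under assumptions, not the design itself: level1_parse treats `${name}` as the only interpolation form and looks names up in a dictionary, and PointTemplate, TEMPLATE_CLASSES, evaluate_tagged and evaluate_untagged are hypothetical names.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

_INTERPOLATION = re.compile(r"\$\{([^}]*)\}")


def level1_parse(source: str, env: Dict[str, Any]) -> Tuple[List[str], List[Any]]:
    """Split a template into literal strings and insertion values.

    Assumption: an interpolation is just `${name}`, resolved from `env`.
    """
    parts = _INTERPOLATION.split(source)
    strings = parts[0::2]                      # literal segments
    insertions = [env[name.strip()] for name in parts[1::2]]
    return strings, insertions


class PointTemplate:
    """Model of the raw template class associated with a hypothetical type m:Point."""

    def __init__(self, strings, insertions):
        # Stands for the initialization done by the language, not a user-written init.
        self.strings = tuple(strings)
        self.insertions = tuple(insertions)

    def parse(self) -> Tuple[int, int]:
        text = self.strings[0]
        for value, tail in zip(self.insertions, self.strings[1:]):
            text += str(value) + tail
        x, y = text.split(",")                 # ValueError models the error case
        return (int(x), int(y))


# Hypothetical table: type name -> associated raw template class.
TEMPLATE_CLASSES = {"m:Point": PointTemplate}


def evaluate_tagged(tag: str, source: str, env: Optional[Dict[str, Any]] = None):
    """Model of evaluating a template that carries an explicit type tag."""
    strings, insertions = level1_parse(source, env or {})
    return TEMPLATE_CLASSES[tag](strings, insertions).parse()


def evaluate_untagged(expected_type: str, source: str,
                      env: Optional[Dict[str, Any]] = None):
    """Model of defaulting the tag from the contextually expected type."""
    return evaluate_tagged(expected_type, source, env)
```

For example, evaluate_untagged("m:Point", "${a},7", {"a": 2}) goes through the same path as the tagged form and yields (2, 7).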

This implies that our model for backtick syntax is that the tag before the backticks identifies a type. This works for the existing string and xml tags, since they are also the names of types, but doesn't work for the existing base16 and base64 tags, so they will need to be fitted into this somehow.

Edited. Not clear to me that this interacts coherently with type definitions and unions. This is not the same as record defaults where the default is associated with an atomic type descriptor.

jclark commented 2 years ago

We can make this work straightforwardly for some cases. If the contextually expected type is T, then:

if T is a subtype of xml, use tag xml
if T is a subtype of string, use tag string
if T is a subtype of string-formatted data with tag t, use tag t

This is using just the type (the set of shapes denoted by the type descriptor).
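The three rules above can be sketched as a small Python function. This is a toy model with loud assumptions: a type is represented as a set of made-up shape labels, subtyping is set inclusion, string-formatted data types are passed in as a hypothetical tag-to-type table, and the empty type is treated as having no default tag.

```python
from typing import Dict, FrozenSet, Optional

Type = FrozenSet[str]  # toy model: a type is a set of shape labels

# Invented shape labels; real xml and string types are far richer.
XML: Type = frozenset({"xml-element", "xml-text", "xml-comment", "xml-pi"})
STRING: Type = frozenset({"string-char", "string-other"})


def default_tag(expected: Type, formatted: Dict[str, Type]) -> Optional[str]:
    """Pick a template tag from the contextually expected type alone.

    `formatted` maps a tag t to the string-formatted data type it denotes.
    Returns None when none of the three rules applies.
    """
    if not expected:
        return None  # assumption: the empty type gets no default tag
    if expected <= XML:
        return "xml"
    if expected <= STRING:
        return "string"
    for tag, tagged_type in formatted.items():
        if expected <= tagged_type:
            return tag
    return None
```

Because only set inclusion is consulted, two type descriptors that denote the same type always get the same tag, which is the point of working with types rather than type descriptors here.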

jclark commented 2 years ago

Another problem with extending this to arbitrary user-defined types in arbitrary modules is that if an expression

`xyzzy`

in some module m1 implicitly calls some function f defined in module m2 and m1 has not imported m2, then there will be a problem in correctly computing the dependencies of m1.

jclark commented 2 years ago

I think in an expression

tag`xyzzy`

when tag is unqualified it is better to think of tag as referring to a module and as being short for

tag:fromRawTemplate(E)

where E is a raw template object constructed from xyzzy.

Then

y:T v = `xyzzy`;

would turn into

y:T v = y`xyzzy`;

and then into

y:T v = y:fromRawTemplate(E);

This requires making precise the idea of a type descriptor being associated with a module. This needs to work for something like:

y:T|error v = check `xyzzy`;

One problem is that if module y in fact defines T as

type T z:T;

then one would really want to look up fromRawTemplate in z rather than y. But if we do this we potentially have a problem with module dependencies. So maybe a better solution is to say that y should create an alias for fromRawTemplate along with the alias for the type:

type T z:T;
const fromRawTemplate = z:fromRawTemplate;

(Not sure if the spec allows this to be used for creating aliases for functions, but I think it should.) This would mean that the module(s) for a type descriptor come from the prefixes that are in the type descriptor without resolving any references.
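A Python sketch of this rewriting, working purely on the text of the type descriptor and never resolving references. The function names are hypothetical, and rejecting descriptors with zero or several module prefixes is an assumption of the sketch; the comment above does not say what should happen in those cases.

```python
import re
from typing import List

# A qualified reference looks like prefix:Name.
_QUALIFIED = re.compile(r"\b([A-Za-z_]\w*):[A-Za-z_]\w*")


def descriptor_modules(type_descriptor: str) -> List[str]:
    """Collect module prefixes that appear syntactically in a type descriptor."""
    seen: List[str] = []
    for match in _QUALIFIED.finditer(type_descriptor):
        prefix = match.group(1)
        if prefix not in seen:
            seen.append(prefix)
    return seen


def desugar(type_descriptor: str, template_var: str = "E") -> str:
    """Rewrite an untagged template in the context of `type_descriptor`
    into a call to fromRawTemplate in the descriptor's module."""
    modules = descriptor_modules(type_descriptor)
    if len(modules) != 1:
        # Assumption: only the single-module case is handled in this sketch.
        raise ValueError(f"cannot pick a module for {type_descriptor!r}: {modules}")
    return f"{modules[0]}:fromRawTemplate({template_var})"
```

With this model, both "y:T" and "y:T|error" desugar to y:fromRawTemplate(E), and an alias type T z:T in y is never followed, so y has to re-export fromRawTemplate itself.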

jclark commented 2 years ago

We already use the concept of going from a type to a module when interpreting method call syntax for non-object types. So a solution to this problem for raw templates would potentially allow method call syntax to be transformed into function calls in user-provided modules rather than just for langlib modules, as now.

Edited. I think this is probably not a good direction. With method call, we are dealing with an expression before the dot. The type system will give us a type for this (rather than a type descriptor). So a feature that could handle this would need to work purely in terms of types. But with this feature we need it to work in terms of type descriptors (like record default values). This is because we want a contextually expected type descriptor of e.g. yaml:Yaml, which might be an equivalent type to json, to do a lookup of fromRawTemplate in the yaml module.
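The difference between the two lookups can be shown with a very small Python model. The alias table is hypothetical and simply assumes that yaml:Yaml is defined as json, as in the example above.

```python
from typing import Optional

# Assumed definition for illustration: module yaml declares Yaml as json.
ALIASES = {"yaml:Yaml": "json"}


def resolve(descriptor: str) -> str:
    """Replace a type reference by the type it denotes (one step is enough here)."""
    return ALIASES.get(descriptor, descriptor)


def module_from_descriptor(descriptor: str) -> Optional[str]:
    """Lookup driven by the type descriptor as written."""
    prefix, sep, _ = descriptor.partition(":")
    return prefix if sep else None


def module_from_type(descriptor: str) -> Optional[str]:
    """Lookup driven by the type only: the reference is resolved first."""
    return module_from_descriptor(resolve(descriptor))
```

Here module_from_descriptor("yaml:Yaml") finds the yaml module, while module_from_type("yaml:Yaml") finds none, because once the reference is resolved to json the module prefix is gone.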

ballerina-platform/ballerina-spec issue #1131