KittyCAD / kcl-experiments

KittyCAD Language

General metadata system #2

Open adamchalmers opened 1 year ago

adamchalmers commented 1 year ago

All KCL objects (lines, paths, points, corners, solids, etc) should support metadata. This way users can:

These metadata need to support both statically-typed metadata (which we typecheck), and flexible data (for external devices or clients, whose schema we won't understand).

Right now the fantasy docs describe a Material type, which all 3D objects take as a parameter when they're constructed. In my opinion this could be a special case of the more general and powerful metadata system I described above.

adamchalmers commented 1 year ago

I see two possible approaches. One copies Attributes from C# and Rust. I'm going to describe the Attributes approach now, and think about the other over lunch.

Attributes

In both C# and Rust you can put attributes on most language items. E.g. in Rust you can put #[doc = "..."] on a type or module, and you can put #[serde(default)] on a field.

Both Rust and C# have a set of built-in attributes (e.g. #[doc] and #[cfg]) but also let you create your own custom attributes, which can be consumed by libraries, e.g. the common serde library uses attributes like #[serde(default)] to customize how fields get serialized into JSON.

Attributes are structured: each attribute defines what schema it follows. E.g. the #[doc = "..."] attribute only accepts a string, and #[serde(...)] accepts a list of specific Serde keywords, like #[serde(default, skip_serializing_if = "Vec::is_empty", serialize_with = "my_custom_serialize_function")]. This is great, because we can use structured attributes for:

Eventually other clients could create their own libraries with structured attributes. For example, users might create a library with attributes for the shapeways.com 3D printing service, like #[shapeways(plastic = Shapeways.Plastic.Blend35, color = rgb(20, 200, 33))].

Ideally we'd allow users to put attributes on basically anything. Because KCL is such a simple language, it doesn't have many language constructs -- only functions and expressions. In Rust you cannot currently put an attribute on an expression (see their open RFC for this) -- but you can put them almost anywhere else. If you want to put an attribute on an expression, you can just move it into its own function, though, so it's not hard to work around. Hopefully because KCL is simpler, the KCL compiler would support attributes almost anywhere.
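
As a rough illustration of the "structured attribute" idea (in Python rather than KCL, since KCL's syntax is still undecided), a decorator can play the role of an attribute: its payload is a typed value with a fixed schema, and it gets attached to the item it annotates. The names `attribute`, `Note` and `gear_shaft` are hypothetical and only exist for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """Hypothetical schema for a `note` attribute."""
    text: str
    audience: str = "all"


def attribute(value):
    """Attach a structured metadata value to a function, a bit like #[...] in Rust."""
    def wrap(fn):
        # Build a fresh list so stacked attributes never share state.
        fn.attributes = [*getattr(fn, "attributes", []), value]
        return fn
    return wrap


@attribute(Note(text="this is the main gear shaft", audience="designer"))
def gear_shaft():
    return "shaft geometry"  # stand-in for real geometry


# A tool can filter attributes by type and field without running the model.
notes = [a for a in gear_shaft.attributes if isinstance(a, Note)]
```

Because `Note` is a dataclass, passing a field it does not declare fails immediately, which is the "each attribute defines what schema it follows" property in miniature.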

adamchalmers commented 1 year ago

tagging @Irev-Dev to read this when he wakes up

adamchalmers commented 1 year ago

Databases

On reflection, our language is really outputting:

Does this kinda sound like a database to anyone else?

Semantics

We could "compile" KCL source code into a database:

If we do this, KCL basically becomes a domain-specific language (DSL) for inserting into a database.

This has some big advantages:

Syntax

See below

Irev-Dev commented 1 year ago

I haven't used attributes much because I've used Rust so little, so they don't click with me quite as much as they would if I had more exposure. Would you mind outlining the advantages of attributes over something like a metadata param with key-value pairs?

adamchalmers commented 1 year ago

We totally could use a metadata param, we'd just need to put it in every single function that can create an object. So it would be a somewhat inconvenient user experience.

On the other hand, we could make metadata an implicit first parameter to every function...

Irev-Dev commented 1 year ago

Yeah cool, sorry, it's taking me a bit to wrap my head around things. So we could use params, but they would likely start looking noisy. Maybe a good rubric is that modeling and 3D geometry are first-class citizens and metadata is an important thing to layer in, so modeling data goes in params and metadata is layered on with attributes.

One thing I wasn't sure about either, from your example #[note(audience = "all", text = "this is the main gear shaft", color = rgb(255, 20, 20))]: maybe it's just the example you used, but because this is a general human-readable comment, not structured data (like the shapeways example), is there some conflict with /// style comments? If I were to guess how they differ it would be

I had a very tangential idea about attributes; after writing it up, it became obvious it should probably be in its own issue: https://github.com/KittyCAD/kcl/issues/3.

adamchalmers commented 1 year ago

For background, I think having structured notes could be useful. You can see in my example I imagine different notes having different intended audiences, so the manufacturers could filter only notes relevant to them, or the designers could filter only notes relevant to them.

The way Rust does it, /// comment is actually shorthand for #[doc = "comment"] -- so here's my proposal. In KCL there are two ways to make a note. These two are equivalent:

  1. Attribute syntax, e.g.

    #[note(audience = "all", text = "this is the main gear shaft", color = rgb(255, 20, 20))]
  2. Docstring syntax, e.g.

    /// this is the main gear shaft
    /// @audience all
    /// @color rgb(255, 20, 20)

The former is nice because you can do it all on one line, and it reuses the very general, flexible attribute syntax. The latter is nice because you can write really long comments that span multiple lines, without needing to enable word wrap in your IDE.
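
To make the claimed equivalence concrete, here is a minimal Python sketch of how a tool might desugar the docstring form into the same fields the attribute form carries. The `@key value` tag format comes from the example above; the function name `parse_doc_note` and the choice to keep values as raw strings are assumptions of this sketch, not part of any KCL design.

```python
def parse_doc_note(lines):
    """Turn `///` doc lines into the fields of a note.

    Lines of the form `@key value` become fields; all other lines are
    joined into `text`. Values are kept as raw strings (no evaluation).
    """
    fields = {}
    text = []
    for line in lines:
        body = line.strip().removeprefix("///").strip()
        if body.startswith("@"):
            key, _, value = body[1:].partition(" ")
            fields[key] = value.strip()
        elif body:
            text.append(body)
    return {"text": " ".join(text), **fields}


note = parse_doc_note([
    "/// this is the main gear shaft",
    "/// @audience all",
    "/// @color rgb(255, 20, 20)",
])
```

The result has the same three fields (`text`, `audience`, `color`) as the one-line attribute, so both syntaxes could feed one downstream representation.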

adamchalmers commented 1 year ago

Keyword arguments

(an alternative to attribute syntax)

Motivation

If types like Solid3d and Line are database tables, then to the user, they would probably be structs with

The problem with structs is that you have to initialize all their fields. This would be really inconvenient for users -- you might not care about adding a note to every value, or a surface finish, or whatever. So we should define default values for many (perhaps all) of these fields. If users don't set the value explicitly, we use the default value.

Unfortunately, there's an awkward tension between optional arguments and positional arguments. Because the compiler needs to know which fields are being omitted! Say you have a function call like cube(side_length, material, note, surface, fillet_properties, extra). How do you omit note and fillet_properties (replacing them with a default value) but keep the rest? If you write cube(side_length, material, surface, extra) then the compiler isn't sure which two properties you've omitted.

Python has this problem, but solves it by using keyword arguments (kwargs).

How do kwargs work

For example, the open function in python's standard lib (docs) is defined as

def open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

Here file is a positional argument but the others are all keyword arguments, with their own default values. You call it by saying:

# You have to specify all positional arguments. There's only one here.
# Default values are used for all keyword arguments.
f1 = open("log.txt")
# You can override the defaults for keyword arguments.
f2 = open("log.txt", encoding="UTF-16")

I really like this, because it allows us to:

  1. Define a lot of structured properties without overwhelming the user
  2. Let users rely on our defaults, and opt into just the metadata they care about
  3. Add new metadata fields to KCL without breaking existing source code (i.e. maintain backwards compatibility). If we introduce a new metadata field, we add it as a keyword argument, with a sensible default. This way old programs written before we introduce the field will still compile, using the default
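
Point 3 can be sketched in Python. The two functions below stand in for two releases of a hypothetical stdlib `cube`; the names, fields and default values are made up for illustration.

```python
# Version 1 of a hypothetical stdlib function.
def cube_v1(side_length, material="aluminium"):
    return {"side_length": side_length, "material": material}


# Version 2 adds a `surface_finish` field as a keyword argument with a default.
def cube_v2(side_length, material="aluminium", surface_finish="as-machined"):
    return {
        "side_length": side_length,
        "material": material,
        "surface_finish": surface_finish,
    }


# A call written against version 1 still works unchanged against version 2;
# the new field silently takes its default value.
old_call = cube_v2(13, material="steel")
```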

So, returning to the problem above: how do you invoke the cube function without specifying all properties?

// Only specify one metadata item
myCube = cube(Distance::foot(13), material=Aluminium.ISO5052)

// Metadata items can be the output of other functions.
// Here the `note` metadata is the output of a `note` function.
// The `note` function could be from the stdlib or defined by the user.
fancyDie = cube(Distance::inch(1), note=note("this is for D&D games", color=rgb(200,10,20), audience="designer"))

// Same as above, but using `let-in` syntax to make it a bit more readable.
fancyDie = let
  myNote = note("this is for D&D games", color=rgb(200,10,20), audience="designer")
in cube(Distance::inch(1), note=myNote)

// You can always specify 'extra' properties that just go straight into GLTF
d = cube(Distance::inch(1), extra={"cnc_mode": "slice", "cnc_blade_width": 4.2})

When you define a function with kwargs, you specify the default values.

// A special die where the "1" pip is replaced by some custom text. Customers want to personalize their dice.
customDie = (d: Distance, custom_text: Text = "KittyCAD") =>
    cube(d)
    |> emboss_top(custom_text)

Improving ergonomics

Shorthand

I think it'll be pretty common for users to define values and then pass them into keyword arguments like cube(dist, note=note) or cube(dist, material=material). Here the left-hand side is the name of the keyword parameter, and the right-hand side is the value being passed in as an argument. We could make that more concise by letting users just write cube(dist, note=) as a shorthand for cube(dist, note=note). Or maybe they'd write cube(dist, =note), either one works I guess.

Delegating kwargs

I think we'll need a way for users to define their own functions and "delegate" all the kwargs to the inner functions they're calling. For example, say you're dealing with a lot of really big cubes throughout your code. You want to avoid repeating yourself all over the codebase, so you define a function for the big cube.

reallyBigCube = cube(Distance::metre(100))

Later on, you realize that each cube needs to have a different note on it. So you add a kwarg for notes, with a default value. You reuse the standard library's Note.default() as the default value for your notes too.

reallyBigCube = (note: Note = Note.default()) =>
    cube(Distance::metre(100), note=note)

So far so good. But as your program evolves, you'll probably want to add more and more properties. This gets unwieldy pretty quickly:

reallyBigCube = (note: Note = Note.default(), material: Material = Material.default(), surface_finish: SurfaceFinish = SurfaceFinish.default(), ...) =>
    cube(Distance::metre(100), note=note, material=material, surface_finish=surface_finish, ...)

You can omit the type annotations to make this nicer. KCL can infer the types of all your kwargs here, because it knows the underlying types of the stdlib function cube:

reallyBigCube = (note = Note.default(), material = Material.default(), surface_finish = SurfaceFinish.default(), ...) =>
    cube(Distance::metre(100), note=note, material=material, surface_finish=surface_finish, ...)

But this is still unwieldy. You'll probably want, at some point, to let users of reallyBigCube add any metadata that the underlying cube supports.

So, we should support a syntax like Python's special **kwargs parameter, which collects all keyword args into a dictionary. That'd make it much more concise:

reallyBigCube = (**kwargs) =>
    cube(Distance::metre(100), **kwargs)

This seems much nicer, but I worry that it's hard to compose. What if you have two different shapes? Say you want to put a sphere on top of a cube.

myShape = let
    myCube = cube(Distance::metre(2))
    mySphere = sphere(Distance::metre(0.5)) |> translate(0, 0, 2)
in union(myCube, mySphere)

Now if you delegate metadata, you've got to send the same metadata to each shape... that's not good...

myShape = (**kwargs) => let
    myCube = cube(Distance::metre(2), **kwargs)
    mySphere = sphere(Distance::metre(0.5), **kwargs) |> translate(0, 0, 2)
in union(myCube, mySphere)

This locks you into using the same set of metadata for both cube and sphere. We'd need some syntax for overriding that metadata -- unpacking it into a struct, updating some fields, then packing it back together and passing it into the other function calls.
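
For comparison, this is roughly what the "unpack, update, repack" step looks like in Python, where kwargs are a plain dict. The `cube` and `sphere` functions here are stand-ins that just record their metadata; none of this is proposed KCL syntax.

```python
def cube(side, **metadata):
    return {"shape": "cube", "side": side, **metadata}


def sphere(radius, **metadata):
    return {"shape": "sphere", "radius": radius, **metadata}


def my_shape(**kwargs):
    my_cube = cube(2, **kwargs)
    # Build a new dict: later keys win, so `note` is overridden for the sphere only.
    sphere_kwargs = {**kwargs, "note": "sphere on top"}
    my_sphere = sphere(0.5, **sphere_kwargs)
    return [my_cube, my_sphere]


parts = my_shape(material="PETG", note="base")
```

The cube keeps the caller's note, the sphere gets the overridden one, and both share the material. It works, but only because the metadata is an untyped dict, which is exactly the part a statically typed KCL would have to design around.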

Even worse, how do you delegate kwargs to different underlying types?

myShape = (text, **kwargs) => square(Distance::foot(2))
    |> extrude(Distance::foot(10))
    |> emboss(text)

The kind of kwargs that are needed for the square, the extrude and the emboss functions are all likely to be different. So you can't just pass one **kwargs object into all of them.
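
One way a dynamic language can paper over this is to split the kwargs by inspecting each callee's signature. The Python sketch below does that with the standard `inspect` module; `square`, `extrude` and the helper `accepted` are hypothetical stand-ins, and this is a runtime trick rather than something a typechecked KCL could adopt directly.

```python
import inspect


def square(side, *, note=""):
    return {"op": "square", "side": side, "note": note}


def extrude(sketch, distance, *, material="aluminium"):
    return {"op": "extrude", "of": sketch, "distance": distance, "material": material}


def accepted(fn, kwargs):
    """Keep only the kwargs that `fn` declares as named parameters."""
    names = inspect.signature(fn).parameters
    return {k: v for k, v in kwargs.items() if k in names}


def my_shape(**kwargs):
    # Each inner call receives just the subset of metadata it understands.
    sketch = square(2, **accepted(square, kwargs))
    return extrude(sketch, 10, **accepted(extrude, kwargs))


solid = my_shape(note="base plate", material="PETG")
```

The cost is that a misspelled key (say `materail=`) matches no callee and is silently dropped, which illustrates why a single catch-all `**kwargs` is hard to make safe.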

adamchalmers commented 1 year ago

On the other hand, a function's keyword arguments are basically equivalent to structs. The set of keyword args accepted by a function is equivalent to a struct, with each individual keyword arg corresponding to a field of that struct (the field is optional and has a default value).

So maybe instead of adding kwargs, we should add struct types. Then each function can explicitly declare a metadata struct, with whatever fields make sense for them. If users omit a field, it's set to a default value. This way, we can add new fields in new versions of KCL without breaking users -- we just define a default value.

I like this because users will probably want struct types anyway, and it simplifies how functions work (which simplifies the type system). I can't help but be a little perturbed that no other static, functional languages use keyword args... maybe there's a reason why.

@greg-kcio what do you think?

greg-kcio commented 1 year ago

I like the pythonic style of default args and kwargs (disclaimer: I'm biased bc Python is my daily driver).

When using default arguments in Python, you must declare required arguments before optional arguments:

# this is good
def cube(length: Distance, note: Note=Note.default(), material: Material=Material.default()):
  pass

# valid calls:
cube_01 = cube(Distance(10, 'mm'))
cube_02 = cube(Distance(10, 'mm'), Note("This is Cube 02"))  # kwarg without key, but in order
cube_03 = cube(Distance(10, 'mm'), material=Material("PETG"))  # omit the first kwarg and specify the second
cube_04 = cube(Distance(10, 'mm'), Note("This is Cube 02"), Material("PETG"))  # kwargs ordered, without keys
cube_05 = cube(Distance(10, 'mm'), Note("This is Cube 02"), material=Material("PETG"))  # unkeyed args first, then keyed args
cube_06 = cube(Distance(10, 'mm'), material=Material("PETG"), note=Note("This is Cube 02"))  # kwargs can be out of order when you specify the key
cube_07 = cube(material=Material("PETG"), note=Note("This is Cube 02"), length=Distance(10, 'mm'))  # required args can also be out of order when you specify the key
kwargs = dict(material=Material("PETG"), note=Note("This is Cube 02"), length=Distance(10, 'mm'))
cube_08 = cube(**kwargs)  # this works too and all the same rules apply

# invalid calls:
cube_09 = cube(Distance(10, 'mm'), note=Note("This is Cube 02"), Material("PETG"))  # SyntaxError: positional argument follows keyword argument
cube_10 = cube(note=Note("This is Cube 02"), material=Material("PETG"), Distance(10, 'mm'))  # same SyntaxError, even though we can reason about the args
cube_11 = cube(Distance(10, 'mm'), Material("PETG"), Note("This is Cube 02"))  # kwargs out of order with no keys: runs, but binds the Material to note and the Note to material

# this is illegal because there is a required arg declared after the optional arg
def bad_cube(length: Distance, note: Note=Note.default(), material: Material):
  pass

Personally I prefer leaving no ambiguity for type requirements and would like typed args and kwargs. Python has built-in types that support this (nominally only, there is no runtime type checking this way): TypedDict and @dataclass. Both are essentially structs (nominally!). So whether we call them structs or dataclasses or something else, it would be convenient to support those as kwarg "types" and unroll them into function arguments. Idk why other languages don't have something similar... it is straightforward in Python since the language is dynamically typed and encourages duck typing.

adamchalmers commented 1 year ago

But... if we have structs, then I don't see a need for keyword args anymore. You just have a struct with a field for each kwarg, and if you want them to be optional, you just use Option<String> instead of String. Then your function unwraps the optional with its desired default value.

This solves the composition problem I outlined above. Instead of declaring **kwargs and delegating them to both sphere() and cube(), you can just define sphere_args: Option<...> and cube_args: <...>, or define your own union/intersection of them and pass the values around as you want.
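
A minimal Python sketch of the struct-based alternative, using dataclasses as the "structs" and `Optional` fields as the `Option` values. The type names `CubeArgs` and `SphereArgs`, their fields, and the defaults are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CubeArgs:
    note: Optional[str] = None
    material: Optional[str] = None


@dataclass
class SphereArgs:
    note: Optional[str] = None


def cube(side, args: Optional[CubeArgs] = None):
    args = args or CubeArgs()
    # "Unwrap" each optional field with this function's own default.
    return {
        "side": side,
        "note": args.note if args.note is not None else "",
        "material": args.material if args.material is not None else "aluminium",
    }


def sphere(radius, args: Optional[SphereArgs] = None):
    args = args or SphereArgs()
    return {"radius": radius, "note": args.note if args.note is not None else ""}


def my_shape(cube_args=None, sphere_args=None):
    # Each inner call gets its own metadata struct, so nothing is shared by accident.
    return [cube(2, cube_args), sphere(0.5, sphere_args)]


parts = my_shape(cube_args=CubeArgs(material="PETG"))
```

Here the cube picks up the PETG material while the sphere falls back to its defaults, so the two shapes no longer have to share one bag of metadata.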

adamchalmers commented 1 year ago

So, I discovered that OCaml has "label args" (docs) which are basically just like keyword args. You can define any argument with a label. You can also define optional parameters. Declaring an optional T parameter is just syntactic sugar for a required Option<T> parameter.

They don't have something like **kwargs. So in my above point where I said

But this is still unwieldy. You'll probably want, at some point, to let users of reallyBigCube add any metadata that the underlying cube supports.

OCaml solves this by saying "No, you can't declare the function as supporting any metadata that the underlying cube supports". I guess that's OK. You just declare the kwargs you want, and if the function signature becomes very long, that's OK.

Reading the OCaml docs for kwargs definitely reduces my worries about including them in the language. I know you can accomplish the same things using structs, but kwargs seems like a better DX.

lf94 commented 1 year ago

It appears ECMA-335 / ECMA-334 attributes are something which are non-extensible by users. This is problematic for users with unforeseen needs.

As proposed by @Irev-Dev , key-value pairs (objects) would be much more ideal, and flexible, and type-safe if users are able to define their own types. Sure, there can be a set of standardized keys and values, but there must be a way to extend this by the users themselves.

Some will want attributes to propagate and others not. The only solution is to have syntax that explicitly says whether to propagate. The easiest thing is to have the user repeat the application of an attribute. The best thing is making it easy to combine attributes.

Named arguments are like @adamchalmers said, essentially structures, but in their defense a bit more user-friendly. I'm sure this particular aspect could be bikeshedded a lot, but we're all aware structures are more widespread than kwargs :slightly_smiling_face:

adamchalmers commented 1 year ago

I agree, I think attributes are the wrong way to solve this. These metadata values need to be first-class concepts in the language, so they have to become return values or arguments to functions.

So far in the language design, I've found places where I'd like keyword arguments (here), and other places where they'd help introduce backwards-compatible API changes. On the other hand, I haven't really found places where users would need to design their own structs. I think because this is going to be such a limited, single-purpose language, structs from the stdlib might be enough, i.e. there may be no need for users to define their own types.

If that's so, then avoiding structs would really simplify the language implementation. It frees us from thinking about

So, for now, keyword arguments seem much simpler to implement and design. I suggest we:

lf94 commented 1 year ago

I haven't really found places where users would need to design their own structs.

It's hard to comment without seeing some potential examples. That would further help everyone else understand the direction of the language. I too rarely, if ever, use a structure that includes things other than measurements and positions. So I completely agree.

I don't think metadata values have to be returned by anything, but user-defined functions with user-defined keyword arguments that generate particular values sound super dang useful (maybe this is what you meant: "metadata [derived] values")!

adamchalmers commented 1 year ago

OK, I feel good about this discussion. Keyword args, no **kwargs syntax, and maybe we'll get to structs down the road. Thanks!

Irev-Dev commented 1 year ago

Apologies for being late to this. I think JavaScript has some pretty good patterns, with the spread operator, the rest operator and destructuring.

from your example @adamchalmers

myShape = (**kwargs) => let
    myCube = cube(Distance::metre(2), **kwargs)
    mySphere = sphere(Distance::metre(0.5), **kwargs) |> translate(0, 0, 2)
in union(myCube, mySphere)

If I were to mix in some js syntax here

myShape = ({...args}) => let
    { forCubeOnly, forSphereOnly: radius, ...rest } = args
    myCube = cube({dis: Distance::metre(2), length: forCubeOnly, ...rest})
    mySphere = sphere({dis: Distance::metre(0.5), radius, ...rest}) |> translate(0, 0, 2)
in union(myCube, mySphere)

What happens here is that all the args are collected in args, then we peel off the key-value pairs we're interested in, forCubeOnly and forSphereOnly (renaming the latter to radius), and the rest are thrown into a new bucket, rest. When we call cube, we use a key-value pair for length. In the sphere call the key and variable name match, so the shorthand radius is fine in place of radius: radius. In both cases we also pass along the remaining arguments by spreading rest into the object.

Any decent JS dev would immediately move the destructuring into the function's parameter definition, which is definitely cleaner:

myShape = ({ forCubeOnly, forSphereOnly: radius, ...rest }) => let
    myCube = cube({dis: Distance::metre(2), length: forCubeOnly, ...rest})
    mySphere = sphere({dis: Distance::metre(0.5), radius, ...rest}) |> translate(0, 0, 2)
in union(myCube, mySphere)

To me this seems to have all the benefits of kwargs, with the one exception that ordered params are not part of it. But I actually like that: ordered params are kinda hostile to a new reader.

To hammer this home, imagine you're a mechanical engineer who is very familiar with modelling concepts; every modelling program you've used takes a sketch or similar and extrudes it by some scalar. You're reading someone's example KCL and you see the line myExtrude = extrude(poorlyNameVar, someOtherPoorlyNameVar). You're new to programming, so your first thought isn't "I need to figure out what each of these is", and you get stumped. Compare that with myExtrude = extrude({sketch: poorlyNameVar, distance: someOtherPoorlyNameVar}), which clicks with what you already know about CAD modelling.
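
Python can already enforce this "names only, no ordered params" style: a bare `*` in the parameter list makes every parameter after it keyword-only. A small sketch, with a stand-in `extrude` that just records its inputs:

```python
def extrude(*, sketch, distance):
    """The bare `*` makes every parameter keyword-only."""
    return {"sketch": sketch, "distance": distance}


# The call site always names its arguments, so a new reader sees what each one is.
solid = extrude(sketch="profile", distance=10)

try:
    extrude("profile", 10)  # positional call is rejected
    positional_allowed = True
except TypeError:
    positional_allowed = False
```

With this signature the unlabeled form never reaches the function body, so the readable form is the only one that can appear in example code.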

adamchalmers commented 1 year ago

There are a few programming languages where only keyword arguments are allowed, or at least, the default is keyword arguments. E.g. Swift.

// Declare a function, looks normal.
func greet(person: String) -> String {
    let greeting = "Hello, " + person + "!"
    return greeting
}

// Use a function: note the parameter name on each argument!
print(greet(person: "Anna"))
print(greet(person: "Brian"))

You can also use different labels for the argument (passed by caller) and the parameter (input within the function).

func greet(person: String, from hometown: String) -> String {
    return "Hello \(person)!  Glad you could visit from \(hometown)."
}
print(greet(person: "Bill", from: "Cupertino"))

In this example the second value has the argument label from and the parameter name hometown.

You can opt into positional arguments (i.e. just a parameter with no argument label, see docs) and also give default values for arguments.

I will say that all these features will complicate our compiler, but they definitely will provide a nice flexible UX. I agree that this will make function calls much more readable for new programmers. The circle(400, 300, (200, 100)) readability problem is real!