spy16 / sabre

Sabre is a highly customisable, embeddable LISP engine for Go. :computer:
GNU General Public License v3.0

Re-organize single sabre package into core, reader and sabre #25

Closed spy16 closed 4 years ago

spy16 commented 4 years ago

Still a work in progress. Highlights of the refactor:

  1. The bulk of the core LISP-related functionality now lives in the core package.
  2. The reader is in a reader package.
  3. The sabre root package provides the same interface for using Sabre as before (i.e., New(), ReadEvalStr, ReadEval, ValueOf, etc.).

I was experimenting with the idea discussed in #16, but most likely I will go back to the old Value interface itself.

Feel free to provide feedback and design inputs if any.

Upcoming changes:

  1. Move Set, HashMap and Module to the sabre package instead of core? (Just a thought, since these are not very well implemented at the moment and are not really necessary for pure-LISP-like usage.)
  2. Re-define the special-forms system. Expose Analyse as a public function to allow adding custom special forms? (I don't want to expose it, since either the special form is very generic, in which case users can raise a PR and we can merge it into Sabre, or it can probably be implemented as a macro/function.)
lthibault commented 4 years ago

On a high level, this all looks 👍

I have a few general design notes for the future, which I'll share below. Any specific problems or blocking issues will be mentioned directly in source comments.

Design notes

1. Value interface

I know we've discussed this before, but part of me is tempted to go all-in on the Value interface, with the main goal of reducing type assertions.

The general idea is to define a rich Value interface that contains both required and optional methods. Values that do not implement a given optional method would simply return ErrNotImplemented, allowing the runtime to take appropriate action.

To illustrate, it might look something like this:

var ErrNotImplemented = errors.New("not implemented")

type Value interface {
    Source() string
    Eval(env Env) (Value, error)
    Invoke(env Env, args ...Value) (Value, error)
    Compare(other Value) (bool, error) // note the added error value
    // ...
}

Obviously this adds a bit of boilerplate when implementing new values, but the advantages are:

  1. Flexibility
  2. Performance (because no reflection)
  3. Readability (because "reflection is never clear" and Value functionality is no longer split between multiple interfaces)

Regarding flexibility, a motivating example might be useful. One thing I would like to do is to automatically cast/convert types during operations on heterogeneous types, or during numerical overflows. For example:

(+ 0.00125 22)
;; prints `22.00125`, a float

I don't believe there's a simple way of doing this right now. An approach I like a lot is to define an Add(other Value) (Value, error) method on Value. The + function would then be implemented as follows:

func Add(v0, v1 Value) (Value, error) {
    return v0.Add(v1)
}

And Float64.Add might be implemented as follows:

func (f Float64) Add(other Value) (Value, error) {
    switch v := other.(type) {
    case Int64:
        return f + Float64(v), nil
    case Float64:
        return f + v, nil
    default:
        // N.B.:  this allows the builtin `Float64` to gracefully handle
        // user-defined types.  Addition is commutative, so we can defer
        // to the other value's implementation of Add.
        return other.Add(f)
    }
}

If I someday want to add a BigInt type, I now only need to ensure its Add (and eventually Multiply, Divide, ShiftLeft and ShiftRight) methods correctly handle builtins.

With regards to performance, such an object model has the potential to greatly improve member access expressions (see: https://github.com/spy16/sabre/issues/12). Instead of relying on reflection, Value can define the following methods:

type Value interface {
        // ...

        // Bind a value to an attribute name
        Bind(name string, v Value) error

        // Resolve an attribute by name
        Resolve(name string, defaultValue Value) (Value, error)

        // ...
}

Now, implementations are free to provide member access with e.g. a map[string]Value, although the Any type can still use reflection under the hood.
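
For illustration, a map-backed implementation might look something like the following rough sketch (it assumes the Value interface above; record and its field names are made up, and errors.New comes from the standard errors package):

// record is a hypothetical Value whose members live in a map, so member
// access needs no reflection.
type record struct {
    attrs map[string]Value
}

// Bind stores the attribute under the given name.
func (r *record) Bind(name string, v Value) error {
    if r.attrs == nil {
        r.attrs = make(map[string]Value)
    }
    r.attrs[name] = v
    return nil
}

// Resolve looks the attribute up, falling back to defaultValue when provided.
func (r *record) Resolve(name string, defaultValue Value) (Value, error) {
    if v, ok := r.attrs[name]; ok {
        return v, nil
    }
    if defaultValue != nil {
        return defaultValue, nil
    }
    return nil, errors.New("no such attribute: " + name)
}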

BTW: I also feel quite strongly that things like IsHashable are an anti-pattern. How can Sabre users create their own hashable types and use them in maps? I'm planning on building immutable data structures, so there's no reason a list or map can't be used as a key.

The proposed approach would allow us to specify a Hash() (Value, error) method, such that an implementation can return a hash value that is different from the concrete value.

One last thing: if the Value interface gets too big, we can always fall back on Value.Resolve. For example, a map implementation could call someKey.Resolve("Hash", nil) to check if the key's type implements a custom hashing method. If it does not, it can attempt to hash the concrete value. This would allow us to have many optional methods in our object model, without cluttering the Value interface, and without the drawbacks of reflection.

We don't have to do any of this now, but I'm interested in keeping this discussion going. I'm more and more convinced some form of object model is a must-have.

2. Core utilities

I would really like to see functions like containerString in some sort of package. Sabre's mission is to provide a batteries-included toolchain for building a lisp dialect in Go. Language internals are part of that contract, so I believe this is in-scope.

Same goes for evalValueList, newEvalErr, getPosition, etc. My project requires a lot of custom types, so these are all very useful.

3. Value.Source() should be replaced with fmt.Stringer

Like you, I'm increasingly convinced that Value should provide its string representation. That said, I think the String() string method is preferable to Source() string for two reasons:

First, fmt.Stringer means we can directly print values using the fmt or log packages:

// good
log.Printf("error encountered in %s at %s:%s\n%s", filePath, lineNo, colNo, myValue)

// bad
log.Printf("error encountered in %s at %s:%s\n%s", filePath, lineNo, colNo, myValue.Source())

Second, Source() is arguably incorrect. Lisp is often referred to as a "syntax-free" language, since its primary syntax is the set of data structures itself, not their homoiconic string representations. This is arguably a detail, but it's nonetheless "more correct" to say "I'm returning a string representation of lisp source" than "I'm returning the source code of this memory object". In other words, strings of symbols are serializations of lisp source.

4. Set should be implemented with a map.

I suspect there might be an issue with this suggestion, but I still thought I'd bring it up. I think the Set type should be backed by a hash map. Any reason why you used []Value instead of map[Value]struct{}?

5. Container types should be interfaces.

This is how Clojure does it. Using interfaces makes it much easier to implement custom types, which is in keeping with Sabre's positioning as a "build your own lisp toolkit".

I would like to see an abstract interface for List, Vector, Map, Set and Module, along with an unexported, default implementation for each. Notwithstanding the above object-model suggestions, the interface for the current Map implementation would be:

type Map interface {
    Value
    Get(Value) Value
    Set(Value, Value)
    Keys() Values
    Values() Values
}

We could then have default constructors for each interface (perhaps in a builtin package, or something). For Map, it would look something like this:

package builtin

func NewMap() core.Map {
    return hashMap{Data: make(map[core.Value]core.Value)}
}

Note also: I don't think it's appropriate for the exported type to be called HashMap. The Hash prefix is an implementation detail. Users of Sabre may wish to create maps based on trees, or other non-hash-based approaches. Users should care about behavior, not internals. Can we call it Map, instead?

spy16 commented 4 years ago

This is great. Interesting points. Honestly, this PR is in draft mode exactly because I wanted to get the discussion going before finalising all the changes. I might not even go ahead with this PR.

1. Value Interface

go all-in on the Value interface, with the main goal of reducing type assertions.

This is definitely an interesting idea and perhaps we can combine some of the value type variations. But I am not entirely convinced that going all-in is a good idea.

  1. Flexibility:

    • The example (+ 0.00125 22) is already supported. As long as the function parameters have types to which the arguments are assignable (tested using the AssignableTo() and ConvertibleTo() reflection functions; see the sketch after this list), function invocation takes care of this automatically. For example, this works when you pass an int value to a float parameter, or when you have a function that takes a Go-native bool value but you pass a core.Bool value, etc.

    • It's not possible to have all the possible operations pre-defined on the Value interface, and I feel it wouldn't make sense to force all values to implement all the methods. I count 7 methods from the examples and interface samples you provided alone. This does not include Multiply, Divide, ShiftLeft, ShiftRight, etc.

    • I feel this kind of "generalised" function (like Add, Multiply, etc.) can be added into a standard-library kind of package with well-defined interfaces (Adder, etc.). For example, we could have the Add function as:

      func Add(v0, v1 Value) (Value, error) {
          // Two `if` checks, assuming the commutative property.
          // Languages that let you overload operators for custom types do not
          // assume commutativity, because it can be domain- and order-dependent
          // (for example, [1,2]+[3,4] is not commutative in Python).
          // Also, assume type T1 defines an __add__(self, other) method
          // and type T2 doesn't. Then `t1 + t2` is valid but `t2 + t1` is not.
          if adder, ok := v0.(Adder); ok {
              return adder.Add(v1)
          } else if adder, ok := v1.(Adder); ok {
              return adder.Add(v0)
          }

          if primitiveTypes(v0, v1) {
              return addPrimitive(v0, v1)
          }

          return nil, errors.New("don't know how to add")
      }

      type Adder interface {
          Add(other Value) (Value, error)
      }

      This might look like there are many type assertions here, but if you think of the suggested approach, each Adder implementation will have a type switch like the one you showed for Float64.

    • I have introduced a memberAccessor optional interface in this code. When a qualified symbol is resolved, for example foo.bar.baz, foo is resolved from the current scope, and then the request to resolve bar.baz is dispatched to the value found for foo if it implements memberAccessor, which effectively provides the same flexibility.

    • HashMap, Set, etc. are very naive implementations, and I completely agree that IsHashable is a bad idea. It only exists to prevent panics if you try to write {[1 2 3] "hello"}. I have no objections to having a Hash() method as part of the Value interface itself.

    • I do not see how someKey.Resolve("Hash", nil) would be better than a type assertion. (You used "reflection" to refer to the current approach, but technically the current approach is optional interfaces via type assertions.)

  2. Performance: Mostly agree on this. Although, since we still have to access values through an indirection (the Value interface) everywhere, nothing gets inlined, so I suspect the difference will not be much.

  3. Readability:

    • Type assertion is not as bad as reflection itself (which is what the idiom is about) from a readability standpoint. But I agree that no type assertion would still be better in many cases. On the other hand, using optional interfaces makes it obvious what each Value type is capable of. For example, it is immediately obvious to the reader that Keyword is invokable just by looking at the methods it has. Imagine you are looking at the godoc of sabre. If you go down the ErrNotImplemented approach, the user needs to actually go into the source to figure out whether it really implements it.
    • Also, I do believe that when the size of the Value interface grows, there will be multiple redundant methods (the ones that return ErrNotImplemented) on different value types, which will offset any positive impact of removing type assertions.
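
Referring back to the first point under Flexibility: here is a standalone sketch (not Sabre's actual invocation code; all names are made up) of how AssignableTo/ConvertibleTo can adapt an argument to a parameter type, which is roughly what makes (+ 0.00125 22) work today:

package main

import (
    "fmt"
    "reflect"
)

// convertArg adapts arg to the parameter type `to`, converting when allowed.
func convertArg(arg interface{}, to reflect.Type) (reflect.Value, error) {
    rv := reflect.ValueOf(arg)
    switch {
    case rv.Type().AssignableTo(to):
        return rv, nil
    case rv.Type().ConvertibleTo(to):
        return rv.Convert(to), nil
    default:
        return reflect.Value{}, fmt.Errorf("cannot use %s as %s", rv.Type(), to)
    }
}

func main() {
    add := func(a, b float64) float64 { return a + b }
    fnType := reflect.TypeOf(add)

    // The int argument 22 is converted to the float64 parameter type.
    arg0, _ := convertArg(0.00125, fnType.In(0))
    arg1, _ := convertArg(22, fnType.In(1))

    out := reflect.ValueOf(add).Call([]reflect.Value{arg0, arg1})
    fmt.Println(out[0].Float()) // 22.00125
}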

2. Core Utilities

3. Value.Source() should be replaced

4. Set should be implemented with a map

5. Container types should be interfaces

Ultimately, I see the following ideas as something we should definitely do:

  1. Value interface should have some more methods, but not all of them. Probably something like:

    type Value interface {
        // definitely
        Eval(env Env) (Value, error)
        Hash() uint64 // just an example.
        String() string

        // maybe. But I still feel that since the usage of `Invoke` is limited,
        // it is cleaner to keep it optional.
        Invoke(env Env, args ...Value) (Value, error)
    }
  2. Built-in logic around container types like Set, Map etc. should use interfaces rather than concrete types.

Also, a parallel idea I have been thinking about is the concept of a runtime, which would allow more flexibility and also enable supporting persistent data structures, but it would require significantly more work and will be complex.

type Runtime interface {
  Env

  // Map, List, Vector etc. need to be interfaces here.
  NewMap(pairs ...KVPair) (Map, error)
  NewList(items ...Value) (List, error)
  NewVector(items ...Value) (Vector, error)
  NewSet(items ...Value) (Set, error)
}
lthibault commented 4 years ago

@spy16 Thanks for the thoughtful reply! I'm glad we're on the same page, and I'm truly grateful that my input is appreciated!

Your points are well-taken, and there were indeed a few misunderstandings on my part, so this has been very helpful and interesting.

I think we're quickly converging on concrete solutions, so hopefully this will be the last of my super-long comments!

1. Value Interface

[...] perhaps we can combine some of the value type variations. But I am not entirely convinced that going all-in is a good idea.

100% agreed. I made the case for "all-in" in my previous post, but it is indeed clear that we should take a more nuanced position. In that spirit, here's the part of my proposal that I still find relevant: a documented, concrete (in code) model of the Sabre runtime and its interaction with values. Note how there are two distinct parts to this:

  1. Documented. Where can I get a "big-picture" view of how the Sabre runtime is organized? This can be as simple as docstrings.
  2. Concrete. The public codebase should consist of concrete artifacts -- structs, interfaces, functions, etc. -- that reflect the important parts of the model.

Here's my reality with Sabre right now:

  1. Documentation: I know what each piece does, but not how all the pieces fit together. That requires careful reading of the source. It's not obvious from the parts. I feel like I'm looking at clay rather than Legos.
  2. Concreteness: Runtime contracts are not explicitly specified (except in rare instances, as with Env). When designing custom values, I'm paralyzed by choice. How do I know if I'm "doing it wrong"? What does "doing it right" look like? Is method (*List).Foo part of the list contract with the runtime? Or is it part of some optional interface? Or is it an implementation detail that my own type doesn't need? How can I plan ahead and build a non-trivial architecture if my assumptions may turn out to be invalidated after a few days of coding?

Am I making sense? I think I mistakenly pitched my initial comment as a technical discussion, when it actually relates to cognition, pedagogy, and community.

The solution, I think, lies in providing some sort of scaffolding to restrict developer freedom just enough to guide their thinking. Assuming we agree on this, then Value is a key part of this runtime model. Getting Value right becomes important, and here are the constraints I see:

To respond more specifically to your comment:

Overall, I would like to suggest the following compromise. What do you think?

type Value interface {
    // String returns an s-expression representing the value
    String() string

    // @spy16 - I can't think of a case where a Value isn't an
    //          expression.  Am I overlooking something?
    Eval(env Env) (Value, error)

    // Hash returns a hashable representation of the value.
    // Most Values will simply return themselves, but certain
    // types may require custom hashing, or may not support
    // hashing altogether.  In the latter case, they should return
    // a non-nil error.
    Hash() (Value, error)

    /*
        Member access methods.
        @spy16:  I prefer this nomenclature & these function signatures.
                 Thoughts?
    */

    // GetAttr accesses a member attribute.  If the attribute does
    // not exist AND `fallback` is not nil, `fallback` is returned.
    GetAttr(name string, fallback Value) (Value, error)

    // SetAttr assigns a member attribute.
    SetAttr(name string, v Value) error
}

N.B.: I agree with keeping Invoke optional, at least for now.

2. Core Utilities

containerString can be exposed I guess.

👍

getPosition I haven't exposed yet, because I was considering removing the Position type in favor of a more generic metadata property.

👍 That's fair. When a mature solution is released, I think it should be exported.

evalValueList and newEvalErr are already public from the core package.

👍

3. Value.Source() should be replaced

👍

4. Set should be implemented with a map

👍

5. Container types should be interfaces

Sorry, but I don't see how it can make it easier to implement custom container types. Could you show me a use case maybe?

I was admittedly unclear in my previous comment, but I think we're actually saying (almost) the same thing.

My proposal is basically:

  1. Let's create a Map interface
  2. Let's not export the concrete implementation of Map (i.e.: HashMap).

As before, my reasoning is cognitive, rather than technical. When implementing a custom map, I currently find myself asking questions like "is func (*HashMap) Get(...) part of the map contract?". I need to scour the source code to find the answer.

Also, a parallel idea I have been thinking about is the concept of a runtime, which would allow more flexibility and also enable supporting persistent data structures, but it would require significantly more work and will be complex.

You've got my attention! 😃 How do you see this working?

Apart from enabling persistent datastructures, I like how this groups most (all?) of the Sabre-isms in one place. It provides much of what I'm asking for, namely:

Honestly, I really like this. Runtime provides a global runtime model, and Value provides a local "object" model. The runtime transforms objects. Everything is clear.

spy16 commented 4 years ago

I've been super busy. I have a few things I would like to discuss here. I will get back to you this weekend. :)

lthibault commented 4 years ago

@spy16 No worries, take your time! Happy Friday!

spy16 commented 4 years ago

I see our comments converging, which is definitely a good sign. I think we are on the same page for points 2-5. With respect to containers as interface types, I don't have any objections, and I completely agree with your points on contracts not being clear.

Like you said, I think Documented and Concrete are 2 very important things we can use to guide the discussion here, and they are precisely the reasons why I am not in favor of GetAttr/SetAttr. Before explaining the reasons, here are some assumptions which I think will be true for most use cases.

  1. From a LISP session, reflection is not entirely avoidable regardless of the approach we choose (since eval still needs to make sure the args are compatible with the parameters of the value being invoked).
  2. With 1 being true, most use cases of Value-native methods are in Go code (not from a LISP session).

Why not Value-native GetAttr/SetAttr for most things?

  1. Documented

    • With the proposed Value type, all accesses to things like Hash, for example, are done as GetAttr("Hash", nil), which makes it a string-based contract; it's not possible to document this expectation using any of the usual Go documentation mechanisms, which worsens the problem of good documentation.
    • You can't get a big picture at all, since even if some methods are logically part of the same contract (for example the methods of the Seq interface), they would be distributed all over the place because there is NO language-backed concrete contract definition.
  2. Concreteness

    • v.GetAttr("Hash", nil)

      • This can return an error or a value. So an if check to ensure it didn't error out is necessary.

      • It is not guaranteed that the returned value here is Invokable. So an extra type assertion/conversion is needed to assert that it can be invoked.

      • Only after the above 2 checks will you be able to do Hash().

      attr, err := v.GetAttr("Hash", nil)
      if err != nil {
          return err
      }
      // attr is guaranteed to be a value, but is it callable?
      if invokable, canInvoke := attr.(Invokable); canInvoke {
          hash, err := invokable.Invoke( /* what if it expects some arguments? */ )
          // check this error also
      }
      • We could also make Invoke() part of the Value type, which would eliminate the type assertion above, but the error check and the ambiguity about parameters still remain. It increases ambiguity in other places as well: for example, all primitive types will have to implement Invoke() now. It will not be clear to the user by looking at, say, GoDoc whether Symbol is really invokable or whether the method is just there to satisfy Value. (Careful examination of the entire codebase is required to understand what is really invokable and what is not.) Not only that, what about Next() Seq? We can't move all of them to Value.

      • If you consider the optional-interfaces approach, a single type assertion like the one below guarantees everything works as expected, and it is very explicit to both the user and the compiler (which has more benefits, explained below):

      if h, hashable := v.(Hashable); hashable {
          h.Hash()
      }
      • Also notice how a type assertion cannot be avoided in the proposed approach either, and the total overhead for the proposed approach would be at least a function call + an if for the err check + a type assertion, whereas the optional-interface approach has 1 type assertion.
    • Optional interfaces provide better concreteness in multiple ways:

      1. You can document the interface properly, which shows up in GoDoc and adds to the Documentation aspect as well.
      2. While it looks like we can avoid type assertions by doing GetAttr, we really cannot, as explained in the above examples. An optional interface can provide a compiler-verified contract definition and ensure it stays working.
      3. We can also leave var _ Seq = (*List)(nil) assertions at the top of files, which serves 2 purposes: the compiler can ensure the types are guaranteed to implement the contracts, and it also clearly explains what the value type is capable of (in this example, it says a list is seqable).
    • Performance:

      • I did a trivial benchmark, and the difference between Value-native GetAttr and access through a type assertion turned out to be 7 ns. This value can be significant when a lot of such accesses are done, but as Rob Pike says, n is usually small. :)
      • Even if n is large, the benchmark didn't account for any implementation detail of GetAttr itself. I can see 2 non-reflection approaches here:
        1. a switch case: probably not that costly up to some point, but it can become costly. Besides, it can't provide the dynamic nature you are expecting since it is a compile-time construct. Also, this approach can be bug-prone (string cases etc.)
        2. map[string]Value: not costly since it is amortized constant time, but the cost added by the checks after GetAttr returns, as explained above, still remains.

What I am proposing:

With respect to Hash() (Value, error):

Phew! That's a super long comment. Let me know your thoughts. :)

I am experimenting with a slightly different organization as well (a better contract in a core package, actual implementations separated into a collection package, etc., and interfaces designed with immutable/persistent collections in mind) and will push it or describe it here soon.

I have not really thought through the Runtime approach. If you have any concrete ideas here, do let me know.

lthibault commented 4 years ago

Hello! Let me dive right in, as it feels like we're making serious headway!

Member access semantics

Ok, you've convinced me that Get/SetAttr are neither concrete nor well-documented. In particular, I hadn't considered the need for type assertions after the call to GetAttr.

Another argument in favor of your point: item 5 (containers as interfaces) provides us with much of the concreteness and self-documentation we need.

Regarding performance:

  1. You make a good point that some reflection is inevitable at the Lisp/Go boundary.
  2. I agree that it seems unlikely that we can significantly reduce type assertions through clever interface design (which was my initial hope/belief)
  3. 7ns is indeed much smaller than I expected.

As it sounds like we both agree that performance is important (albeit secondary to other considerations), perhaps we should write a benchmarking suite fairly soon? This way, we can at least guard against significant performance regressions (e.g. an order of magnitude, or more).
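
To make that concrete, here is a rough sketch (hypothetical types, not Sabre's API) of the kind of micro-benchmark such a suite could start with, comparing an optional-interface type assertion against a string-keyed GetAttr-style lookup:

package sabre_test

import "testing"

type Value interface{ Eval() (Value, error) }

type Hashable interface{ Hash() Value }

// attrValue emulates the GetAttr-style contract discussed above.
type attrValue interface {
    GetAttr(name string, fallback Value) (Value, error)
}

type str string

func (s str) Eval() (Value, error) { return s, nil }
func (s str) Hash() Value          { return s }

func (s str) GetAttr(name string, fallback Value) (Value, error) {
    if name == "Hash" {
        return s, nil
    }
    return fallback, nil
}

var sink Value // keeps the compiler from optimising the calls away

func BenchmarkTypeAssertion(b *testing.B) {
    var v Value = str("hello")
    for i := 0; i < b.N; i++ {
        if h, ok := v.(Hashable); ok {
            sink = h.Hash()
        }
    }
}

func BenchmarkGetAttr(b *testing.B) {
    var v attrValue = str("hello")
    for i := 0; i < b.N; i++ {
        attr, _ := v.GetAttr("Hash", nil)
        sink = attr
    }
}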

Regarding your proposal:

I'm on board. My observations:

  1. The optional interface amortizes well in cases of repeated attribute lookups on the same object.
  2. The type-assertion penalty will of course amortize poorly in cases of chained lookups (e.g. foo.bar.baz).
  3. Nevertheless
    • Chained lookups will be O(n), and on the order of nanoseconds. Acceptable.
    • n is likely to remain fairly small (i.e. short chains).

Anyway, 👍

Hash()

There is some ambiguity here. If Hash again returns a Value, how would things depending on a hashable value work?

The idea is that Hash doesn't compute a hash, but instead provides a hashable representation of the value. It is this representation which is then fed to a hash function.

But before going any further, I agree that returning an error is counterproductive. Let's change the signature to Hash() Value.

Now, every type must provide a hashable representation of itself. There are only two possibilities:

  1. Natively-hashable types (e.g. Int64 or String)
    • v.Hash() returns v
  2. Non-hashable types (e.g. Map)
    • v.Hash() returns some guaranteed-hashable value that is, in effect, an encoding of v.

I like this because it means Hash no longer depends on any particular hashing algorithm. It's up to the caller to decide which hashing algorithm is most appropriate depending on the situation (crypto? use sha512. hashmap? use murmur. etc...).
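
To sketch this with made-up types (nothing here is the final API): a natively-hashable value returns itself, a collection returns a hashable encoding of itself (here, its s-expression string), and the caller picks the actual hash algorithm.

package main

import (
    "fmt"
    "hash/fnv"
)

type Value interface {
    String() string
    Hash() Value // returns a hashable representation, not a hash
}

type String string

func (s String) String() string { return string(s) }
func (s String) Hash() Value    { return s } // natively hashable: return self

type Vector []Value

func (v Vector) String() string { return fmt.Sprint([]Value(v)) }
func (v Vector) Hash() Value    { return String(v.String()) } // hashable encoding of the vector

// hashActual is chosen by the caller; a hash-map might use FNV, murmur, etc.
func hashActual(hv Value) uint64 {
    h := fnv.New64a()
    h.Write([]byte(hv.String()))
    return h.Sum64()
}

func main() {
    key := Vector{String("a"), String("b")}
    fmt.Println(hashActual(key.Hash())) // a collection can now be used as a map key
}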

Misc.

I am experimenting with a slightly different organization as well (a better contract in a core package, actual implementations separated into a collection package, etc., and interfaces designed with immutable/persistent collections in mind) and will push it or describe it here soon.

Looking forward to it 😃

I have not really thought through the Runtime approach. If you have any concrete ideas here, do let me know.

If I'm not mistaken, the changes involved should be minimal. The only difference between Runtime and Env is that Runtime also provides factory methods for various types.

What if we instead kept Runtime and Env separate? We would then:

  1. Also pass Runtime as an argument to any function that currently takes Env
  2. Pass Runtime to Reader.

In this way, Reader now knows how to create new values. Same with e.g. Eval. User-defined constructors can be accessed via optional interfaces and type assertion.

Am I overlooking something?

Phew! We're getting there! 😄

spy16 commented 4 years ago

should write a benchmarking suite fairly soon

Agreed.

It's up to the caller to decide which hashing algorithm is most appropriate

Hmm I like this idea. Yea, we can do that. I see the process of hashing working as below:

  1. Something (maybe a specific hashmap implementation) calls v.Hash() and gets back hv.
  2. Now it passes hv to a specific hashing function hashActual(hv), gets back the hash value, and uses it to build the hashmap.

One question on hashActual: what do you think should be the behaviour when hv is non-hashable? For example, let's say a map value m ends up returning itself when m.Hash() is called (this would work since Hash() is expected to return a Value and m is one). But what does hashActual do now?

Looking forward to it 😃

Just to give some overview:

I am currently working on cleaning up the Fn, MultiFn and SpecialForm types and the macro system. Do you think it would be better to have these in a package called invoke or something, or to have them in the sabre root package? (There are cyclic-dependency issues I need to think through to have a separate package, but let me know your preference here.)

Runtime and env separate

Yep, you are right. Also, with the above design, I think the Runtime can remain external as well (for example, from the reader side, VectorReader is specific to an implementation of Vector and is free to figure out how to create a vector instance; but I need to see how to handle initialization of vectors from other functions).

lthibault commented 4 years ago

what do you think should be the behaviour when hv is non-hashable?

I think hashActual should panic.

The reason is that removing the error from Hash() and hashActual assumes:

[...] every type must provide a hashable representation of itself [...]

If a developer violates this constraint, then all bets are off. Let it crash.

This is very much a design choice. We are requiring all types to be hashable, but I would cautiously argue that this is a good thing because:

  1. Clojure does it.
  2. It's simple. No value is special.
  3. It's powerful. You can do useful things that are impossible in pure Go.
  4. The inability to hash certain values is (IMHO) a failing of most languages. It's a wart, not a feature.

Re: package organization

core package defines just the core interfaces in a single file (core.go)

👍 Yup. This gives me the "one obvious point of entry" feature.

Also, core does not use complex reflection logic [...]

This feels right. Reflection is fine, but it's scary for users. I agree it should be tucked away as deeply as possible, and be as generic as possible. Ideally, users would never have to look at it.

collection package which defines Vector, HashMap , HashSet [...]

I would prefer to see factory functions like NewVector, NewMap and NewSet, which return the corresponding interface from core. The concrete implementations shouldn't even be exported, IMHO.

(Let me know if you have better package name)

What if we renamed core to runtime and called this package core instead?

I find this more intuitive. The runtime package contains high-level runtime interfaces. core contains core implementations of runtime interfaces. Bonus: runtime.Runtime should live in ... well ... runtime.

To avoid any confusion, the rest of this comment does not assume this suggested package structure.

sabre package which brings collection and core together to provide a more feature rich setup.

Agreed. This doubles as a canonical example of how to wire everything together. +1 for such self-documentation.

I have defined mutating operations in interfaces similar to this: Map.Assoc(key, val Value) (Map, error) (notice it returns a Map).

Perfect! I have a bit of experience, and as far as I'm aware, the above return-value is the only requirement.

[...] Fn MultiFn and SpecialForm types and the macro system [...] Do you think it would be better to have these in a package called invoke or something or have it in sabre root package?

Intuitively, I'd look for these in the core package. In my mind they are very much core features.

spy16 commented 4 years ago

What if we renamed core to runtime and called this package core instead

Yea, that makes sense. (runtime.Reader is a bit odd now, though. Not sure if we should move this to sabre altogether, but on the other hand, maybe it should be okay considering most usage of the reader might be through sabre.NewReader.)

runtime.Runtime should live in ... well ... runtime.

With the new interfaces designed suitably for persistent structures as well, I am thinking maybe we don't need the fat Runtime interface. But I like the name 🙂, so we could rename Env to Runtime (it is kind of a runtime since Eval also goes through it). Also, runtime.New() Runtime reads well.

Intuitively, I'd look for these in the core package. In my mind they are very much core features.

Yes, agreed. ✔️

I would prefer to see factory functions like NewVector, NewMap and NewSet, which return the corresponding interface from core. The concrete implementations shouldn't even be exported, IMHO.

I usually prefer making zero values of types usable (usage would be var hm core.HashMap), just because it feels more natural for data structures (similar to var a int, var b []byte, etc.). But in this case I am not entirely sure yet, since if we want to keep support for persistent structures, directly doing var hm core.HashMap wouldn't work. So, agreed ✔️

A few doubts:

Do you have some ideas in mind about the Map, Set & Vector interfaces? Also, do you think we need an interface for List as well? How would it be different from Vector? If you do have some ideas, do send the interface definitions.

Another one on collection factories. There are 2 ways a collection type might be constructed:

  1. From the Reader.
  2. From any other Go function call (might be called from the LISP session but ultimately comes down to NewMap somewhere)

Approach 1: Have the factory signatures as NewMap(env), and the factory checks whether the given env supports env.NewMap or something of the sort and dispatches there (the Runtime interface). If not, fall back to initialising the simple hash-map implementation. The problem with this approach is that the reader will have to know about the runtime.

Approach 2: We could change the Env interface to Runtime and make it mandatory for runtimes to implement NewMap, NewVector and NewSet.

Approach 3: For now we just have core.NewMap(), core.NewSet(), etc. which always initialise mutable implementations. When we want to support persistent structures, we add a persistent package, define package-global primordial values (which can be private) for map, set, etc., and persistent.NewMap etc. use those to build from (this is what Clojure does, I believe). Once built, we always do map.Assoc, which returns a new map, and so on.

I like Approach 3 the most since it doesn't complicate the reader, doesn't introduce more interfaces or more type assertions to check what factory methods the env supports, and also doesn't make it mandatory to implement all the factories. A rough sketch is below.
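
A rough sketch of what Approach 3 could look like (all names are hypothetical, and the copy-on-write map is just a stand-in for a real persistent structure):

package persistent

// Value, KVPair and Map stand in for the interfaces that would live in core.
type Value interface{ String() string }

type KVPair struct{ Key, Val Value }

type Map interface {
    Assoc(key, val Value) (Map, error)
}

// persistentMap is a naive copy-on-write stand-in for a real persistent map.
type persistentMap struct{ data map[Value]Value }

func (m *persistentMap) Assoc(key, val Value) (Map, error) {
    next := &persistentMap{data: make(map[Value]Value, len(m.data)+1)}
    for k, v := range m.data {
        next.data[k] = v
    }
    next.data[key] = val
    return next, nil
}

// emptyMap is the private primordial value every map is built from.
var emptyMap = &persistentMap{data: map[Value]Value{}}

// NewMap builds a Map from the primordial value, one Assoc at a time.
func NewMap(pairs ...KVPair) (Map, error) {
    var (
        m   Map = emptyMap
        err error
    )
    for _, p := range pairs {
        if m, err = m.Assoc(p.Key, p.Val); err != nil {
            return nil, err
        }
    }
    return m, nil
}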

lthibault commented 4 years ago

Yea, that makes sense. (runtime.Reader is bit odd now though. not sure if we should move this to sabre altogether.

I'm in favor of putting Reader in the sabre package.

Intuitively, I would expect Reader not to live in the same package as Runtime. The former deals with building an AST, while the latter deals with how an AST is evaluated. The former deals with syntax, the latter with semantics.

This dissociation isn't 100% perfect, but I expect a Reader to turn character strings into runtime values. I don't expect a runtime to have any knowledge about a reader/de-serializer.

But I like the name 🙂, so we could rename Env to Runtime (it is kind of a runtime since Eval also goes through it). Also, runtime.New() Runtime reads well.

You're not alone - "runtime" has a nice ring to it! ✔️

Do you have some ideas in mind about the Map, Set & Vector interfaces ?

In general? Or are you concerned about a specific issue?

Reading between the lines, it seems like you're noticing that Vector and List are very similar. The major difference between a list and a vector is in their intended uses:

In both cases, the core implementation will probably use a slice, but a persistent implementation will likely prefer to implement List with a persistent linked-list. For this reason, I think we should:

  1. define List
  2. differentiate the List and Vector interfaces to reflect their intended usage

As such, I would cautiously suggest restricting what can be done with a list. In particular, disallowing indexed lookups.

What do you think of this, as a starting-point for discussion?

type List interface {
    Value
    Seq

    Head() Value
    Tail() Seq
    Pop() (Value, Seq)
    Append(...Value) Seq
}

type Vector interface {
    List

    Get(int) Value
    Slice(start, stop, step int) Vector
    Insert(int, Value) Vector
}

Another one on collection factories.

I also like Approach 3. +1 for simple factory functions. ✔️

I think hashActual should panic.

You didn't mention anything about this -- does it seem reasonable to you?

spy16 commented 4 years ago

Oh yea, I agree about the panic. I had the same thing in mind. As long as the contract is explicit, an implementation violating the constraint is definitely a cause for panic.

spy16 commented 4 years ago

Reading between the lines, it seems like you're noticing that Vector and List are very similar

Well, I knew the difference :smirk:. Until halfway through this discussion, I was not considering persistent structures in the design, so List and Vector themselves were the implementation details. In the mutable world, List usually means a linked list (fast pop/append at the head) and Vector means a dynamic array (mutable, generally simply a wrapper around a Go slice, with fast iteration and fast index lookup). So I was not really sure if we needed an interface there. But considering we now have 2 possible variations for them (mutable + immutable), I just mentioned it.

List

My confusion was simply because the Seq interface is exactly what defines the List.

  1. Head is simply First()
  2. Tail is simply Next()
  3. Append is Cons (or Conj for multiple)
  4. Pop: In Clojure, pop is the same as Next for Lists (warning: only w.r.t. the return value). A general pop method (remove the first item and return it) doesn't make a lot of sense for persistent structures. I see that you have handled this case by returning both the popped value and the new seq, but the same effect can be obtained using First() (the idea is to peek, same as in Clojure) and then Next() (similar to Clojure's pop). So I'm not sure if we should extend the contract to include this.

So maybe we are better off using Seq as the expected type wherever we are expecting a List (like Clojure does with ISeq)? And Seq is defined as:

type Seq interface {
  Value

  Cons(v Value) Seq
  Conj(vals ...Value) Seq
  First() Value
  Next() Seq
  Count() int
}

// See Map, Set & Vector for usage.
type Seqable interface {
  Value
  Seq() Seq // can even be a lazy sequence for persistent implementations
}

Vector

Vector is probably straightforward since we can consider the index-lookup requirement. (Clojure uses IAssociative for this, and for maps also, but they extend it with Keys() and Values() if I remember correctly.) The idea is that Vector is a keyed collection where the key is the index. For maps, it's obvious.

Maybe we can borrow things from IAssociative as well?

type Vector interface {
  Value

  Seq() Seq
  Get(key int) (Value, error)         // or EntryAt(), error to handle index out of range
  Conj(items ...Value) Vector           // or Append()
  Assoc(key int, val Value) (Vector, error) // error to handle index-out-of-range
}

Why not reuse Seq?: If we reuse Seq, then when the user does vec.Conj(someValue) to add a value to a vector, the returned value is a Seq, which is hard to deal with if the use case requires the return value to also be a vector (which is the case most of the time; this is easy in Java since it supports covariance).

Why not embed Seq and also define vector-specific Cons/Conj?: Simply to reduce the required contract size (while not compromising usability).

Map

Coming back to this: every type must provide a hashable representation of itself. I see a problem here. Some values, for example a Go function implementing the Value interface, or the wrapper type for arbitrary Go values (Any), have nothing they can use to create a hashable value. Initially I was thinking maybe we could use the memory address of the value or something extracted by reflect as a last resort. But after some trial and error, I don't see anything that we can use here. Let me know if you have alternate approaches. (While I really, really want to make all values hashable, I definitely don't want to force users to implement Value types manually whenever they want to use Sabre as just a scripting layer by exposing already-existing Go values; this is one of the use cases I built parens for in the first place.)

type Map interface {
  Value

  // A single:
  Seq() Seq // sequence of key-value pairs. (slightly implicit)
  // OR following 2 methods:
  Keys() Seq
  Vals() Seq

  Get(key Value) Value      // or EntryAt(). return error? default arg? 
  HasKey(key Value) bool   // return error?
  Assoc(key, val Value) (Map, error)
  Dissoc(key Value) (Map, error)
}

Set

type Set interface {
  Value

  Seq() Seq
  HasKey(key Value) bool
  Conj(items ...Value) (Set, error) 
  Disj(item Value) (Set, error)
}

I think these interfaces are concise while providing enough to build on. Let me know if I have missed any cases or if you have other thoughts. :)

lthibault commented 4 years ago

Well, I knew the difference 😏

Oh sorry! I really didn't mean to imply that you didn't!

I was merely going through the exercise of contrasting list/vector in order to set the stage for discussion, but I see how my previous comment might have sounded a bit condescending ... 😬

So may be we are better off using Seq as the expected type wherever we are expecting a List (like Clojure does with ISeq)?

Yes, I meant to ask about Seq (and then forgot) -- thanks for clearing this up. I prefer your approach, and agree with all sub-points. ✔️

Coming back to this: every type must provide a hashable representation of itself. I see a problem

Hmm, this is a good point. I also agree that we should relax this constraint if keeping it means sacrificing the ease of Go-Sabre interop.

Initially I was thinking maybe we could use the memory address of the value or something extracted by reflect as a last resort. But after some trial and error, I don't see anything that we can use here.

It seems we can hash reflect.Value - would that not work?

List/Vector/Map/Set

Map

type Map interface {
  // ...

  /*
    @spy16:  Can we have both?  I like the idea of having Map provide Seqable,
             and I also like having access to just the keys or just the values.

             This is comparable to dicts in Python, which have `keys`, `values`
             and `items` methods -- very convenient!
  */
  Seq() Seq // sequence of key-value pairs. (slightly implicit)
  Keys() Seq
  Vals() Seq

  // ...
}

Cheers 👋

spy16 commented 4 years ago

I really didn't mean to imply that you didn't!

Oh don't worry about it. I knew that's not what you implied. :smile:

It seems we can hash reflect.Value -

Need to check this out. :thinking:

And yes, we could have both Seq() and Keys()+Vals() as well. (I was just concerned about edge cases here: Seq needs to return a sequence of pairs, and we can't ensure that here. Let me know if you think this is worth the risk of an implicit contract.)

This is the last of my questions, and I will probably close this PR since the thread has become really long and hard to follow :sweat_smile:. I will raise a separate PR with the runtime changes.

lthibault commented 4 years ago

Let me know if you think this is worth the risk of implicit contract

I think it's fine. I often see docstrings in the standard library that place additional constraints on typed values (io.Reader comes readily to mind).

This is the last of my questions, and I will probably close this PR since the thread has become really long and hard to follow 😅

No further questions on my end as well. This design is turning out nicely! 😃

What do you think are the next steps?

spy16 commented 4 years ago

Indeed, it is turning out nicely. In parallel, I've been making changes as per the discussion here and will send a PR soon. I'll close this; any further discussion we can do on the other PR.