keean / zenscript

A trait based language that compiles to JavaScript
MIT License

Why typeclasses? #30

Open shelby3 opened 7 years ago

shelby3 commented 7 years ago

Is less verbosity the only main benefit of typeclasses over just inputting a set of functions?

With typeclasses we can write the following, where bar is a member of the typeclass Function:

foo(x: A) where Function<A>
   bar(x)

Without typeclasses we could do:

foo(x: A, y: Function<A>)
   y.bar(x)

I understand @keean proposed other functionality for the where clause, but as for the main functionality of providing pluggable (i.e. dependency injection) interfaces, what is the big benefit that justifies creating a new language?

keean commented 7 years ago

I think the question is why not just pass all the interfaces as arguments? You can of course do this, after all type classes are implemented as implicit dictionary arguments.

The advantage of type classes is that they are implicit, hence you have shorter more readable function signatures. They can also be inferred, so you don't have to explicitly type the constraints. You don't want to have to explicitly thread y through every function call where you pass type A because you might want to call bar at some point. Further when you realise you need to call bar on a value of type A you don't want to have to refactor every call in the program to pass an implementation.
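For readers unfamiliar with dictionary passing, here is a sketch in TypeScript (illustrative names) of how the typeclass version of foo above could desugar:

// The typeclass Function<A> becomes a record of functions (a dictionary).
interface FunctionDict<A> {
  bar(x: A): void;
}

// foo(x: A) where Function<A> desugars to a function taking the dictionary
// as an extra argument, which the compiler threads through implicitly.
function foo<A>(dict: FunctionDict<A>, x: A): void {
  dict.bar(x);
}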

shelby3 commented 7 years ago

Thanks for the reply. I appreciate those advantages. I am just thinking about near-term priorities. I think achieving the "perfect" language is a major endeavor that needs a significant amount of funding and time. I've been thinking that your goal is maximum genericity (i.e. the ability to write some code that can be parametrized over many types and use cases) and minimum explicitness (although in other cases you argue for more explicitness where I argue for less).

My near-term goal is not that. That is far too much for me to think about and digest at this point. I am wanting some improvements on JavaScript so I can go use the same code base both on the client and the (Node.js) server. For example a cryptographic library.

I've been thinking about achieving a much simpler transpiler to TypeScript that can meet those near-term objectives.

Later we can experiment with moving to higher levels of genericity and less explicitness, if warranted by the use cases that are prominent, and with more funding and momentum behind the initial simpler transpiler.

Further when you realise you need to call bar on a value of type A you don't want to have to refactor every call in the program to pass an implementation.

That works only if you allow every data type to be implemented in only one way for each typeclass, but as we discussed in the past (with @skaller) that has limitations also. Otherwise, you could get unintended implicit functionality. So explicit passing has its benefits.


Edit: Dec. 4, 2017

@shelby3 wrote:

@shelby3 wrote:

Perhaps we will eventually think of some low-complexity abstraction that gives us non-marshaled integration for tree data structures between HLL and LLL, and perhaps we can even prove borrowing and lifetimes statically. But I suspect we can not without heading towards the complexity of C++ and Rust.

@keean wrote:

I would not write a web service in 'C' or even 'Rust' ever, this is really ideal ground for a garbage collected language. I would use Node (TypeScript), Django(Python) or 'Go'.

@keean wrote:

My view is there needs to be a high-level-language and a low-level-language that share enough memory-layout and data-structure that data can easily be passed backward and forward.

@shelby3 wrote:

I am wanting to move away from such clusterfucks of never ending aggregation of complexity.

@shelby3 wrote:

My quest in @keean’s repository has been to get the advantage of GC and perhaps some other aspects of FP such as first-class functions and closures in a simple language that also can do some low-level things such as unboxed data structures. To further erode the use cases where C/C++ would be chosen, to relegate that clusterfucked C++ language tsuris (and perhaps Rust also but the jury is still out on that) more and more to the extreme fringe of use cases. For example, it is ridiculous that I must employ a cumbersome Node.js Buffer library to build an unboxed data structure.

@shelby3 wrote:

[…] so that code which would otherwise be written in for example TypeScript and C/Emscripten would not have to be rewritten later in Lucid. More optimum compiled targets could be developed such as WebAssembly which leverage the full power of the design laid out above, including more optimum compilation of the C-like code. One of my test use cases would be to translate my Blake C code to Lucid.

keean commented 7 years ago

Regarding one implementation per class, that is the advantage of type classes :-) If you want more you need to pass an object or function (because you want the implementation to depend on the value of the object not the type).

To me it seems you can do nearly everything you want in typescript, so why not use it? I am working on a project in typescript at the moment, and for the ES6 stuff plus types it's good enough.

Regarding goals, I agree maximum genericity is my goal, and minimum boilerplate. However I also want to prioritise readability and understandability over conciseness, as we spend more time debugging and maintaining code than writing it. However I am happy for explicitness to be optional, which includes type-inference (type annotations optional) and typeclass constraint inference (so constraints/bounds are optional). The only place we need types would be modules, where we want a stable interface that enables separate compilation.

shelby3 commented 7 years ago

Regarding one implementation per class, that is the advantage of type classes :-) If you want more you need to pass an object or function (because you want the implementation to depend on the value of the object not the type).

If one universal implementation for each universal class were an absolute advantage, or even feasible, then we wouldn't need scoping for (type) name conflicts in programming. It is impossible to guarantee one global implementation for the universe, because total orders don't exist.

I am not arguing that it can not be a feature that is desirable in some cases in a partial order.

To me it seems you can do nearly everything you want in typescript, so why not use it?

I guess you didn't read my list above, or didn't realize that the items I listed which TypeScript doesn't do are extremely important for the work I need to do.

However I also want to prioritise readability and understandability over conciseness

Ditto myself. I take this even further than you as I am thinking all types shall be explicitly declared, except in an assignment where either the destination or source is explicit, but not required for both. Since passing function arguments is assignment, then anonymous lambda functions can be implicitly typed by the function's argument type.
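TypeScript's contextual typing already behaves this way for lambdas passed as arguments, for example:

// apply declares the type of its function parameter explicitly...
function apply(f: (n: number) => number, x: number): number {
  return f(x);
}

// ...so the anonymous lambda needs no annotation: n is inferred as number.
apply(n => n * 2, 21); // 42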

keean commented 7 years ago

I thought TypeScript included ES6 features, and has integer types as well as structures (classes). The only one it does not have is "everything is an expression" I think, so it seems close to what you want?

Regarding type classes, it is clear that each function can only have one implementation for any given set of argument types. In other words we can only define one implementation for (+)<Int, Int>.

shelby3 commented 7 years ago

Regarding type classes, it is clear that each function can only have one implementation for any given set of argument types.

Disagree (and my reason was already stated). But I don't want to repeat that argument now. I have said all I want to say about it for now. It isn't a priority issue for me at the moment. We can revisit that later if we ever get to the point where I want to implement a "typeclass" language.

Edit: we had a long discussion about this in another thread in 2016.

shelby3 commented 7 years ago

I thought TypeScript included ES6 features, and has integer types as well as structures (classes).

Afaik, there are no integer types in ES6. Are you thinking of 64-bit math methods in ESNEXT?

http://kangax.github.io/compat-table/esnext/

Interfaces and classes do not pack into ArrayBuffers. Perhaps you were thinking of the following proposal which has gained no traction:

http://wiki.ecmascript.org/doku.php?id=harmony:typed_objects

In 2016, I wrote some very complex (untested) JavaScript code to improve on that concept and provide a polyfill. But it has some drawbacks. I need to revisit that code to refresh my memory.

shelby3 commented 7 years ago

I looked again at Nim, but it is getting too far from JavaScript (it wants to be a systems language competing with C, C++, Rust, and Go). For example, it loses JavaScript's Promises, ArrayBuffers, etc.

And it doesn't offer 64-bit integer support on JavaScript.

And I'd much prefer the JavaScript output be recognizable, so the JavaScript debugger can be used effectively.

shelby3 commented 7 years ago

I think I envision a robust way to transpile from a syntax and type system I want into comprehensible TypeScript. For example, I can use TypeScript's getters and setters to access the fields by name of a DataView on an ArrayBuffer. When not packed into a binary ArrayBuffer, 64-bit integers can be represented with TypeScript's tuples.
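A minimal sketch (the field layout here is hypothetical) of that getter/setter approach over a DataView:

// Named field access over a packed binary buffer.
class PackedPoint {
  private view: DataView;
  constructor(buffer: ArrayBuffer, byteOffset = 0) {
    this.view = new DataView(buffer, byteOffset);
  }
  // 32-bit signed integer at byte offset 0 (little-endian)
  get x(): number { return this.view.getInt32(0, true); }
  set x(v: number) { this.view.setInt32(0, v, true); }
  // 32-bit signed integer at byte offset 4 (little-endian)
  get y(): number { return this.view.getInt32(4, true); }
  set y(v: number) { this.view.setInt32(4, v, true); }
}

const buf = new ArrayBuffer(8);
const p = new PackedPoint(buf);
p.x = 42; // writes directly into the packed buffer
p.x;      // 42, read back out of the buffer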

keean commented 7 years ago

It may be slower than normal JS objects due to packing issues, or it may be faster due to non-string property lookup. It certainly seems the right way to process binary network data, but I would probably use native JS objects everywhere memory layout is not important.

shelby3 commented 7 years ago

but I would probably use native JS objects everywhere memory layout is not important.

Agreed. I didn't intend to imply otherwise. Not only network data, but file data. You know I am working on a blockchain design.

Edit: Also memory data. With a blockchain, you need the UTXO set stored in RAM and this becomes huge.

keean commented 7 years ago

nodejs can mmap files from JavaScript which would work well with this too. You don't have too many choices in-browser; you are limited to Blobs with a streaming interface, and the file IO API is not supported by all browsers.

shelby3 commented 7 years ago

nodejs can mmap files from JavaScript which would work well with this too.

Right agreed. Interacting with binary data in JavaScript on Node.js is via very verbose APIs. I'd prefer a language native solution.

shelby3 commented 7 years ago

You don't have too many choices in-browser

I wrote "client side" but that doesn't mean we are limited to the browser. I am thinking Android and iOS.

shelby3 commented 7 years ago

Most in the blockchain realm use a systems programming language with good binary control, such as C or C++. But this is painful in other aspects: there is a lot of verbosity that isn't always needed, complex libraries such as Boost are needed for asynchrony, there is no GC by default, etc. I think it is overkill.

I am searching for a better balance. Rapid coding with less needless verbosity, yet still be able to handle binary data and integer math without verbose APIs. And also some (explicit) typing for correctness checks and better readability (open source).

shelby3 commented 7 years ago

Others have integrated C/C++ with JavaScript via Emscripten, separating out the code that requires that control from the code that can be done more elegantly in JavaScript, but this IMO isn't ideal.

Most code doesn't need to be that heavily optimized (or at least not the first priority). I am thinking there should be something in between a toy language such as JavaScript and a full blown systems language such as C++.

keean commented 7 years ago

node-webkit could be an option then (npm nw). You can still compile TypeScript (for example: https://basarat.gitbooks.io/typescript/content/docs/quick/nodejs.html )

shelby3 commented 7 years ago

node-webkit could be an option then (npm nw).

Good point and agreed if Node.js APIs are needed client-side.

keean commented 7 years ago

Well if you want to read and write binary files, it's pretty much required. In a browser, FileReader ( http://caniuse.com/#search=FileReader ) is well supported, but FileWriter is not ( http://caniuse.com/#search=FileWriter ).

shelby3 commented 7 years ago

Well if you want to read and write binary files, it's pretty much required.

Obviously Node.js isn't the only way to read and write files on Android and iOS (one can even access them via JavaScript in a WebBrowser instance). Whether one wants to carry all the baggage of node-webkit just for file APIs is not a certainty. But yeah, it is apparently one of the options.

shelby3 commented 7 years ago

My focus is on the notion that we shouldn't be forced to use a systems language just to do integer operations and packed binary structures. That we should be able to combine these capabilities with a rapid-coding paradigm that has GC, simple-as-pie asynchronous programming, and less optimum but less painful ways to, for example, manipulate strings (because we don't always need that to be highly optimized).

C++ is too tedious and overly detailed when in most cases we don't need that. Yet we shouldn't have to revert to C/C++ just to get integer operations and packed binary structures.

Premature optimization often means projects that don't get finished.

shelby3 commented 7 years ago

Yeah, performance is battery life and that is important on mobile. But projects that get to market are more important than projects which don't.

keean commented 7 years ago

For Android and iOS there are things like Cordova, is that the sort of thing you are thinking about?

shelby3 commented 7 years ago

Yeah there are also Nativescript and Tabrisjs.

But I am not sure I would choose those, because I may also want to think about desktop apps as well.

This is a wide open field of experimentation, because we also have convergence of the laptop/desktop and mobile on the horizon perhaps...

keean commented 7 years ago

Using HTML5, JavaScript and IndexedDB you can already write software that runs across mobile, pad, laptop, desktop, and big-TVs. It works pretty well, except for annoying corner cases, like iOS limiting IndexedDB to 50MB of storage. It's definitely fast enough for a lot of applications, and HTML/CSS makes good looking UI reasonably easy.

shelby3 commented 7 years ago

Writing to the least common denominator controlled by gatekeepers who squat on and stall innovation seems to me to be a losing paradigm. What about SD cards, cameras, various multi-touch swipe gestures, etc.? Consumers care about features, not about how you achieved them.

keean commented 7 years ago

Well you have to write your own "drivers" for each platform. On Android you have to write in Java to call all the native APIs, but you can call Java you have written from JavaScript if you create an android application shell in Java. Same for iOS (9+) where you can call native APIs from objective-c or Swift, and you can call your own objective-c or Swift code from JavaScript, so you would need a different objective-c or Swift wrapper app. These wrapper apps need to get into the respective app-stores, which means getting Google/Apple to approve your code. You can distribute outside of the app store on both devices, but it requires the user to jump through hoops (android you need to enable a device setting to allow non-app-store apps to install, iOS you need to "trust" the developer of each non-app-store app in the device settings).

shelby3 commented 7 years ago

Well you have to write your own "drivers" for each platform. On Android you have to write in Java to call all the native APIs, but you can call Java you have written from JavaScript if you create an android application shell in Java. Same for iOS (9+) where you can call native APIs from objective-c or Swift, and you can call your own objective-c or Swift code from JavaScript, so you would need a different objective-c or Swift wrapper app.

Agreed and I was fully aware of that. But good you've summarized for readers.

These wrapper apps need to get into the respective app-stores, which means getting Google/Apple to approve your code. You can distribute outside of the app store on both devices, but it requires the user to jump through hoops

These companies are trying to create walled gardens and be gatekeepers. Me thinks the free market is going to route around these Coasian costs. They are trying to carefully position themselves with "safety" and "convenience" to acquire monopoly control.

In fact, that is one of my other longer-term goals.

If you have an open source app which acts as a base for all other apps and doesn't abuse as a gatekeeper, then it could alleviate this problem. Once everyone has that base app installed, then it takes care of overriding all these toll bridges. Toll because the costs are paid but indirectly.

(android you need to enable a device setting to allow non-app-store apps to install, iOS you need to "trust" the developer of each non-app-store app in the device settings).

Clicking Ok for a confirmation dialog is no problem for users. It is when they make the enabling action modal and hidden that they achieve insidious control.

Popping up a red alert, "many other users have reported issues with this app" (with a link off to details) would be a sufficient free market solution to safety.

Even Google realizes that ads can't sustain the company. So they are going to have to become monopolistic. Open source should slay this, as it did to Microsoft Windows.

Android defeated iOS in global installed units (not dollar weighted) marketshare because it was more open. Ditto to Android if they make it difficult for users to have freedom.

As I assume you know, Android still lags (or perhaps near parity with) iOS on a dollar weighted basis (such as $ spent on apps), because iOS simply worked better. For example the egregious latency with Android audio was unacceptable for some scenarios and many people aren't even aware of issues like this. Apple's cathedral-style engineering integration was superior, but eventually bazaar-style open source catches up. Takes a while.

shelby3 commented 7 years ago

@keean I am not actively coding again yet (only the 19th day of my 8-week intensive 4-drug phase of TB treatment).

I find this comment to be relevant:

Maybe what Rust really needs is fewer programmers writing Rust crates and features, and more programmers who use those features to write end-user programs that won’t be redistributed.

This.

The Rust community looks, from the outside, a bit ingrown and subject to a kind of “can you top this (complexity)” syndrome. I don’t mean this observation in a hostile way; if I were eyeball-deep in the development of a young language I would probably behave the same way – and arguably I did, back in Python’s earlier days when I was briefly on that devteam.

But yes, I think more feedback from people who just want to get stuff done and have a lower complexity tolerance would be good for the language.

Complexity is even more insidious than that, I’m sad to say. It’s not enough to avoid being excessively clever. Even if you restrain yourself, complexity will still creep up on you anyway. You need to actively fight it whenever you can, and periodically review your “complexity budget.”

Rust has already, in my opinion as a production user, overdrawn the complexity budget in one place. This happened accidentally, very early on. Specifically, it’s the interaction between several Rust features: (1) The auto-deref features provided by Deref. (2) The split between “owned” types like String and reference/slice types like &str. (3) The unification-based type inference that allows you to write let i: u32 = "10".parse()? and automatically choose the right version of parse. (4) The Into trait (and related traits), which allows you to write functions that take arguments of type Into<String>, which essentially means, “Oh, just give me any type that advertises conversion to a String.”

Any one of these features individually makes Rust much nicer to program in. But if several of them all gang up in the same piece of code, a novice Rust user will get burnt. If I’m reading between the lines correctly, this might actually be what bit you with bind. The bind function takes an argument of type ToSocketAddrs, which is basically one of those Into-like traits which allow you to pass in several different ways of representing a socket address. It’s cute and clever (yuck), and it works most of the time. But if it gets combined with (1-3) above, it’s going to generate crappy error messages. The fix is relatively simple for an experienced Rust programmer, if annoying: Add explicit type declarations all over the place. But at work, my guideline is “just don’t use Into-style conversion traits in your APIs unless you have a damn good reason. They don’t actually improve API ergonomics.”

If this is the only place where Rust exceeds its complexity budget, well, users will just learn this one bad interaction and go about their lives. C++ has dozens of warts worse than this. But any further expansions of Rust in this area need to be handled very carefully, and production Rust users need to participate in the RFC process.

But let’s dig deeper.

There are two libraries in the Rust space which worry me: Diesel and Tokio. Diesel looks like an ORM, but it’s really not—it’s just a typed version of the relational algebra which can dump output into Rust data types. It results in really concise and efficient code once it’s working. But the error messages are just horrendous (though not yet in modern C++ territory, though that’s nothing to be proud of). Diesel has chosen to push Rust’s type system to its limits in the name of speed and expressiveness. I’m not sure it’s worth it. We had a long debate at work and I paired on Diesel code with one of our other Ruby backend guys, and he said the tradeoffs with Diesel’s error messages were worth it. I’m not 100% sold.

Where I’m more concerned is tokio. As everybody has told you, tokio is central to the Rust async I/O story, and lots of popular crates are moving to tokio-based backends (though most will still export synchronous APIs). And from what I’m hearing, tokio is currently generating bad error messages for some common use cases. In my opinion, this needs to be fixed—and the core team is discussing pushing up a language feature that’s been in the RFC process for a while now, which will hopefully make the error messages much clearer.

Still, I’m holding off on tokio-based async I/O for at least 6 months in production code, to see how this all plays out.

And the comments complaining about the JavaScript ecosystem chaos such as this one:

The situation with JavaScript and Node.js is more complex. At the time of writing, there are 399,773 packages on npmjs.com. And yet, there are really obvious categories with no good alternatives. For example, I needed a CLI option parser the other day, and I needed it to support git-style subcommands. I searched npmjs.com for 30 minutes and spoke to our in-house Node guru, and everybody said, “No, every single Node package for your use case is terrible.” This makes me sad and frustrated. I eventually found something that was, frankly, pretty bad and cudgeled it until it worked.

shelby3 commented 7 years ago

And as I had pointed out on the Rust forum back in 2016, all the complexity doesn't really provide anything that helps us that much, because most programming errors are not constrained to the sort of safety Rust is checking for (yet those safety checks are very debilitating to code in), as reiterated by this comment:

What you described is probably true for a lot of user-space code, but it is not true for the kernel. If you look at Linux kernel CVEs, you will see that they have nothing to do with buffer overflows, stack smashes, use-after-free vulnerabilities. They are more likely to be logical errors, which can be exploited under some conditions. Even when you hear of a data race condition found in the kernel, it is unlikely to be caused by a missing lock. In most cases, it is a logical error of some kind. For example, here is the patch for a recently discovered data race in Linux (CVE-2016-5195): https://lkml.org/lkml/2016/10/19/860. As you see, it was not about locking, but about properly checking different flags.

Moreover, Linux developers actively use a static analyzer to catch the most common mistakes in their code. So Rust can’t help there much. Also, it is completely unrealistic to write a general purpose kernel without unsafe code. Even if you look at the Rust standard library, you will find a lot of unsafe code there (e.g. all collections). When you work with hardware at low-level, you have to do a lot of things that can be potentially unsafe. But let’s suppose that it was possible to write the whole kernel in a safe programming language, would it eliminate security exploits? Of course, not.

You may look at the long list of CVEs for many web-frameworks written in safe languages (such as PHP, Python, Ruby). By its nature, the kernel works at a higher privilege than the user-code, which means many logical errors in the kernel can be exploited to do something that the user should not be allowed to do. So writing a safe kernel is a far more complex task than writing a kernel in a safe language. First of all, you have to have a formal definition of what you mean by “safe” and then decide how you are going to prove that. It is very tricky even for a very small toy kernel.

shelby3 commented 7 years ago

@keean you might find it useful to read these perspectives:

https://blog.ntpsec.org/2017/01/18/rust-vs-go.html

https://blog.ntpsec.org/2017/02/07/grappling-with-go.html

shelby3 commented 7 years ago

I think I could do a transpiler that doesn't even need to compute the types, if I am willing to use special infix operators for the 64-bit integer types, e.g. '+' instead of + because Typescript doesn't have operator overloading. I will make the 64-bit integer types interfaces. Ditto what I stated about employing getters and setters for modeled structured data. Then let Typescript do all the type checking.
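For example, a sketch (names hypothetical) of 64-bit unsigned integers as TypeScript tuples, with the special infix '+' transpiled to a plain function call:

type U64 = [number, number]; // [low 32 bits, high 32 bits]

// The transpiler would rewrite x '+' y in the source to add64(x, y).
function add64(a: U64, b: U64): U64 {
  const lo = (a[0] >>> 0) + (b[0] >>> 0);
  const carry = lo > 0xffffffff ? 1 : 0;
  const hi = ((a[1] >>> 0) + (b[1] >>> 0) + carry) >>> 0;
  return [lo >>> 0, hi];
}

const x: U64 = [0xffffffff, 0];
add64(x, [1, 0]); // [0, 1], i.e. 2^32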

But I will probably need a type checker to do everything well, including ideas about abstracting concurrency and parallelism boilerplate from the other thread.

keean commented 7 years ago

The transpiler does not need to check types, but the idea of checking types is to not accept any input that will break the transpiler. In other words the types are used to validate the assumptions on which the transpilation works.

So if you make no additional assumptions over TypeScript you can do without type checking, however the error messages would relate to the target generated code, not the source code. It is probably better for the transpiler to type check and give error messages referencing the source code line and position, rather than letting the program transpile and then have TypeScript throw an error referring to a line and code the author has never seen, because it's the output of transpilation.

shelby3 commented 7 years ago

I completely agree it is better to type check in the transpiler. As I said, it would also enable the use of + instead of some hokey '+' for an overloaded infix operator on 64-bit integer types. However, the '+' could be replaced with + in the future with a simple search and replace. And agreed, it would enable better error messages.

Note I am also factoring in that I want to be coding with this on a blockchain project within a month, if possible. So I don't think that is enough time to build a type checker. Also I have a lot of commitments from the community to fund my blockchain project if I can deliver it asap. So then from that funding, we could possibly fund the development of the type checker and possibly even the other features we've been considering.

Tangentially, I recently asked Eric S. Raymond (the creator of the term "open source") if he was interested in paid work improving the JavaScript ecosystem and he replied it might be interesting. He seems to eat architectural complexity with ease.

I think perhaps an incremental approach is most realistic. But perhaps I am mistaken. Perhaps I should just start coding in Typescript instead.

shelby3 commented 7 years ago

@keean wrote:

Is less verbosity the only main benefit of typeclasses over just inputting a set of functions? ... I understand @keean proposed other functionality for the where clause, but as for the main functionality of providing pluggable (i.e. dependency injection) interfaces, what is the big benefit that justifies creating a new language?

I think the question is why not just pass all the interfaces as arguments? You can of course do this, after all type classes are implemented as implicit dictionary arguments.

The advantage of type classes is that they are implicit, hence you have shorter more readable function signatures. They can also be inferred, so you don't have to explicitly type the constraints. You don't want to have to explicitly thread y through every function call where you pass type A because you might want to call bar at some point. Further when you realise you need to call bar on a value of type A you don't want to have to refactor every call in the program to pass an implementation.

Another benefit is that instead of the function taking an input argument reference to the interface, the function can be optionally monomorphised to an optimized variant of the function for each interface dependency injection.

So instead of the caller creating an object which has the structure of the interface and passing it as an argument, the compiler can create a variant of the called function which calls the interface methods directly, i.e. monomorphized.

You may have intended to imply this where you wrote ‘implicit’. I am stating the benefit more explicitly.
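A sketch of the difference (illustrative names):

// Dictionary-passing: the interface is a runtime argument.
interface Show<A> {
  show(a: A): string;
}

function greet<A>(dict: Show<A>, x: A): string {
  return "hello " + dict.show(x);
}

// A monomorphised variant the compiler could emit for A = number:
// the dictionary argument disappears and the method call is direct.
function greet_number(x: number): string {
  return "hello " + x.toString();
}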

P.S. I am conflicted whether to use British or American spelling, such as the ‘s’ or ‘z’ in monomorphised.

shelby3 commented 7 years ago

@shelby3 wrote:

Optional implicit parameters (which Scala has) provide a way of unifying runtime interfaces (passed by value) and monomorphisable interfaces (passed implicitly by type).

I like that summary (in the context of what I had written). We should make sure we frame that and use it in our documentation.

@shelby3 wrote:

Regarding type classes, it is clear that each function can only have one implementation for any given set of argument types.

Disagree (and my reason was already stated). But I don't want to repeat that argument now. I have said all I want to say about it for now. It isn't a priority issue for me at the moment. We can revisit that later if we ever get to the point where I want to implement a "typeclass" language.

Edit: we had a long discussion about this in another thread in 2016.

https://www.reddit.com/r/programming/comments/1ptiy2/martin_odersky_the_trouble_with_types/cd6dni7/

shelby3 commented 7 years ago

@keean wrote on Jan 31:

It may be slower than normal JS objects due to packing issues, or it may be faster due to non-string property lookup. It certainly seems the right way to process binary network data, but I would probably use native JS objects everywhere memory layout is not important.

I just wrote:

Of course this means extra overhead when for example adding and removing elements from an array, except for the binary packed structures (i.e. stored by value in place instead of by reference) I proposed (and note that store by value is better for cache locality performance).

shelby3 commented 7 years ago

@keean wrote:

Is less verbosity the only main benefit of typeclasses over just inputting a set of functions? ... I understand @keean proposed other functionality for the where clause, but as for the main functionality of providing pluggable (i.e. dependency injection) interfaces, what is the big benefit that justifies creating a new language?

I think the question is why not just pass all the interfaces as arguments? You can of course do this, after all type classes are implemented as implicit dictionary arguments.

The advantage of type classes is that they are implicit, hence you have shorter more readable function signatures.

The function definitions aren’t shorter, because the required interfaces would be in the where clause.

The function call site would be more concise but lack local reasoning.

They can also be inferred, so you don't have to explicitly type the constraints.

I have explained that I want to move away from principal typings inference.

You don't want to have to explicitly thread y through every function call where you pass type A because you might want to call bar at some point. Further when you realise you need to call bar on a value of type A you don't want to have to refactor every call in the program to pass an implementation.

Correct and agreed. Even if not inferring the where clause, explicitness would require refactoring all the call sites turtles all the way up.

But maybe that is a good thing. It is more explicit when changes are made, which call sites are affected. Otherwise this is obscured from version control systems by the implicit capability of the compiler.

And the function definition sites do need to be refactored anyway, unless the where clauses were somehow inferred.

@shelby3 wrote:

Another benefit is that instead of the function taking an input argument reference to the interface, the function can be optionally monomorphised to an optimized variant of the function for each interface dependency injection.

So instead of the caller creating an object which has the structure of the interface and passing it as an argument, the compiler can create a variant of the called function which calls the interface methods directly, i.e. monomorphized.

On further thought, the smart compiler could still make the monomorphized optimization with explicit interface object arguments, especially if the construction of the object arguments is done inline in the function call and the objects are never assigned to any reference.

Also monomorphization is not my near-term nor anywhere near my highest priority.

shelby3 commented 7 years ago

Another advantage of explicitly passing the interfaces instead of the implicit action of typeclasses is that it eliminates the proposed forall B (aka for<B>) syntax (and thus the obtuse implicit action of that syntax):

So instead of needing to understand how for<B> implicitly selects the interfaces (either at the call site scope or in the scope of the function being called), the required interfaces will be explicitly written in the function definition signatures and the call site arguments. IMO, this will improve the readability by not having obtuse, implicit action and instead explicitly written action. Thus afaics, it would eliminate that issue of whether to use first-class polymorphism.

Edit: HRT (higher-ranked types) have been clarified and apparently the forall<B> annotation can be inferred.

keean commented 7 years ago

I abandoned Ada because of how it handles interfaces (modules, as ML calls them): all the explicit instantiations were just too much boilerplate for me. Have a look at the 'Go' board with monte-carlo simulation in Ada on my GitHub. I would not want to create another language like that.

On a side note, the Rust guys have realised their ad-hoc approach to type system design is getting them into trouble, and have started implementing a logic engine (Chalk) to help them tidy and understand it. One of the ideas central to my design is to start from a clean logic, so you avoid all that mess.

shelby3 commented 7 years ago

I can’t read Ada, so it is meaningless to me when I looked at your Ada code.

If we have special syntax for interface objects being explicitly passed in, then the IDE could optionally hide these on toggle.

Again my point about not obscuring changes from version control systems may be an important one. With implicit selection of interfaces at the call sites (with typeclasses instead of explicitly passed interface objects), a change which causes cascaded changes turtling up through the call hierarchy will not be seen by the version control system at the call sites even though the function definitions have changed. Although some might argue this improves the version control reasoning as the changesets are only at the S.P.O.T. of the function definitions (on the where clauses) instead of also all the “noise” of the matching explicit call site changes, the fact is those call sites have changed and those files must be recompiled. Implicit changes are obscuring information. I am contemplating that the main priority of version control is not brevity but rather completeness and transparency of changes. We are aiming for correctness of understanding of changes here, not reducing the number of files with changes.

With explicitness, if some call site was not updated, there would be a compiler error. This forces the programmer to manually contemplate the effect of these changes on the call site logic. Whereas with implicits, things change that the programmer is not forced to approve and check. Implicit changes seem more dangerous.

I haven’t decided. I am brainstorming.

shelby3 commented 6 years ago

@shelby3 wrote:

The function definitions aren’t shorter, because the required interfaces would be in the where clause.

The function call site would be more concise but lack local reasoning.

You don't want to have to explicitly thread y through every function call where you pass type A because you might want to call bar at some point. Further when you realise you need to call bar on a value of type A you don't want to have to refactor every call in the program to pass an implementation.

Correct and agreed. Even if not inferring the where clause, explicitness would require refactoring all the call sites turtles all the way up.

But maybe that is a good thing. It is more explicit when changes are made, which call sites are affected. Otherwise this is obscured from version control systems by the implicit capability of the compiler.

I am thinking there is no conceptual difference between the syntax of the required interfaces as typeclass bounds in where clauses versus explicit arguments that have the type of type parametrised interfaces with static methods. Note TypeScript’s handling of static methods fits perfectly (and note the restriction that static methods may not employ the class type parameters is irrelevant because we can parametrise the interface ClockConstructor<A> for the static methods).

The programmer can explicitly supply these arguments at the call site or the compiler can implicitly supply them at the call site. The former provides a way to get rolling quicker with an initially simpler compiler and that code would not be incompatible with an improvement in the compiler to optionally do the latter. The compiler can detect these arguments are optionally implicit because the interfaces are parametrised only by type parameters which are quantified at the call site.

It will be a boilerplate pita to always manually (explicitly) supply the interfaces at the call site, especially when the data type is a union and thus the interface has to be constructed as type-case dispatch for each interface defined for each data type in the union.
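A sketch in TypeScript of this proposal (names hypothetical): the typeclass becomes an explicit argument typed as a parametrised interface, which a later compiler could supply implicitly:

interface Eq<A> {
  equals(x: A, y: A): boolean;
}

// Explicit dictionary argument in place of a where Eq<A> clause.
function contains<A>(eq: Eq<A>, xs: A[], y: A): boolean {
  return xs.some(x => eq.equals(x, y));
}

// Today the caller supplies the instance explicitly...
const eqNumber: Eq<number> = { equals: (x, y) => x === y };
contains(eqNumber, [1, 2, 3], 2); // true
// ...and an upgraded compiler could infer eqNumber at this call site.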

The above proposal addresses @skaller’s criticism about importance of local reasoning:

Shelby3's choosing by selecting instances by providing candidates in a particular context, and different instances in other contexts, mixes both models, so the selection is partly implicit and partly explicit. This is a bad idea because it makes it hard to reason about what is going on: when you look at a call site, you also have to look at the top of the file to see which instances are made available. The ability to reason about code is dependent on local knowledge.

keean commented 6 years ago

The problem with putting the interface where the type should go is that you lose information. Consider:

f : A -> A where Window[A]

vs

f : Window -> Window

The first is a function that is passed a type 'A' and returns a type 'A' where 'A' is a Window. The second is a function that is passed some unknown type that is a Window, and returns another possibly different type which is also a Window. Consider:

let x : SquareWindow = new SquareWindow()
let y = f(x)

With the first signature we know 'y' has the type 'SquareWindow', but with the second we do not know the type of 'y' only that it implements the Window interface, in other words we have turned 'y' into an existential type (or a trait-object as Rust calls it). Once something is an existential (trait-object) there is no way for the type system to ever recover the static type.
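The same distinction can be illustrated in TypeScript (illustrative names; IWindow is a plain interface, not the DOM Window):

interface IWindow { draw(): void; }

class SquareWindow implements IWindow {
  side = 1;
  draw() {}
}

// First signature: the concrete type flows through.
function f1<A extends IWindow>(x: A): A { return x; }
const a = f1(new SquareWindow());
a.side; // ok: a is statically a SquareWindow

// Second signature: the result is only known to be an IWindow.
function f2(x: IWindow): IWindow { return x; }
const b = f2(new SquareWindow());
// b.side; // type error: the static type SquareWindow has been lost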

shelby3 commented 6 years ago
f : Window -> Window

I am not proposing that syntax. Instead it would be:

f : A → Window[A] → A

The syntax you presumed is incongruent with what I wrote as follows:

… versus explicit arguments that have the type of type parametrised interfaces with static methods. Note TypeScript’s handling of static methods fits perfectly.

The compiler can detect these arguments are optionally implicit because the interfaces are parametrised only by type parameters which are quantified at the call site.

P.S. I understand you were presuming the incorrect syntax because it was formerly a proposal of mine from a long time ago before I started this thread.

keean commented 6 years ago

I guess I don't understand this:

f : A → Window[A] → A

because a one-argument function has turned into a two-argument function?

shelby3 commented 6 years ago

because a one-argument function has turned into a two-argument function?

With an implicit 2nd argument, meaning the compiler can supply it if the call site does not. Scala’s implicits are an example of this.

I wrote before:

The programmer can explicitly supply these arguments at the call site or the compiler can implicitly supply them at the call site. The former provides a way to get rolling quicker with an initially simpler compiler and that code would not be incompatible with an improvement in the compiler to optionally do the latter. The compiler can detect these arguments are optionally implicit because the interfaces are parametrised only by type parameters which are quantified at the call site.

keean commented 6 years ago

Don't implicit arguments have to come first? Normally they have a special annotation like:

f : ?Window[A] -> A -> A

Actually I'm not totally opposed to implicit arguments, as long as their scoping is clear.

shelby3 commented 6 years ago

Don't implicit arguments have to come first?

In Scala no, because afair due to curried partial application you need to have quantified the type parameters first.

as long as their scoping is clear

That is one of the big issues with implicits: they are too implicit (which was one of my arguments upthread for being explicit). But that is also the case for typeclass bounds unless one total order for all open modularity is presumed (which I argued is not possible, but let us not re-argue that now). In any case, it should not be any worse. And I am proposing the caller can optionally be explicit when they want to.

I am not proposing an implicit keyword such as in Scala, but rather limiting them to this one use case which I posit can be inferred without a keyword. Point being not to create a proliferation of use cases of implicits and that implicit scoping confusion.

keean commented 6 years ago

I don't think implicits are a bad solution, but I prefer type classes because they are simpler, and express the requirements of a function in a clear way. This is also why I prefer 'requires' as a keyword rather than 'where'.

Implicits have the advantage that a single definition can be used in two different ways, which increases generality, however I don't think its useful generality. If type classes represent interfaces, you define functions and algorithms in terms of an interface, why do I need anything more? For example, if I specify in my function requirements that certain objects must be 'openable' then they must support the 'open' interface and I can call 'open' on those objects. Why would I ever need to explicitly pass a dictionary?

shelby3 commented 6 years ago

@keean regarding your prior comment in this thread, I finally figured out the holistic meaning, context, and implications of what you were writing last year.

This exemplifies how much special effort and care must be put into communication, else the reader’s eyes may gloss over due to technobabble and they may not really have comprehended the writer’s point. Communication needs to be worded in a way that would make sense to someone who is not inside the writer’s head. For example, yesterday I employed private messaging with you to figure out what you meant by “inhabited types” in the context you employed (it required considering that if the function was pure then the callback was useless— a detail which you did not mention). When I re-read my own prose, I often find similar lapses in clear communication. Even when we write down many details as you did last year, we fail to communicate the key essence of the point cogently for the bewildered reader.

Apparently you intend to convey that if an algorithm is reduced to its algebraic essence, then it is fully generalized without any need for overloading typeclass instances (i.e. without multiple different implementations of the same typeclass for the same data type), i.e. you wish to have a total ordering for typeclass implementations. Any generality that can not be expressed in the typeclass total ordering is separated into non-typeclass dependency injection (i.e. function arguments, aka callbacks). In your sort example, I remember your algorithm input a relation function for ordering the sort instead of a monolithic sort method on a typeclass; see the sketch below. In other words, you choose to never employ typeclasses wherein a data type could be implemented in more than one way for a typeclass. It is a (probably advantageous) design choice to separate the concerns of typeclasses, which by your criteria should (being a total ordering) only ever be implemented one way for each data type, from more generalized dependency injection. I have read some snippets of Alexander Stepanov's books and I want to eventually buy them and read them entirely, as they seem very astute and an interesting mathematical perspective.
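A sketch of that separation of concerns (names hypothetical): the ordering relation is an explicit function argument rather than a typeclass instance, so one data type can be sorted many ways without multiple typeclass implementations:

// Insertion sort parametrised by an explicit ordering relation.
function insertionSort<A>(xs: A[], lessThan: (a: A, b: A) => boolean): A[] {
  const out: A[] = [];
  for (const x of xs) {
    let i = out.length;
    while (i > 0 && lessThan(x, out[i - 1])) i--;
    out.splice(i, 0, x); // insert x at its sorted position
  }
  return out;
}

insertionSort([3, 1, 2], (a, b) => a < b); // ascending: [1, 2, 3]
insertionSort([3, 1, 2], (a, b) => a > b); // descending: [3, 2, 1]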

So with that new understanding, I agree with you that explicitly providing typeclasses enables the programmer to violate the total ordering design criteria for typeclasses. However, as I had explained to you in the past, with open modularity there is no way to enforce a total ordering on implementations for data types of a typeclass. For example, imagine a request made over the Internet wherein the data types and dictionaries are dynamically injected by the remote host making the request (and dynamic open modularity is why I am trying to design a language such that script injection attacks are not possible). It’s impossible to guarantee statically (by compiler or linker) that the remote host is employing the same implementation (for a specific data type and typeclass) as any other remote host. A dynamic check could be made based on a cryptographic hash of a canonical implementation, but this is useless because remote hosts can inject new data types of their own, so it is ambiguous which implementation of a data type for a typeclass is the canonical reference implementation. This is why total orderings do not exist in unbounded non-determinism (i.e. do not exist in our universe). Since I have been working on blockchain consensus research for the past years, I want to help others understand the relevance of relativistic concepts in the new world order of decentralized programming and computer systems.

Therefore, an explicit option for supplying typeclass instances does not introduce any new loopholes that do not already exist with open modularity.

And as I already pointed out, the explicit option enables a way to code with typeclasses without needing a special compiler, yet the compiler can still be upgraded later to optional implicit functionality while remaining compatible with the legacy code. And the explicit option is available for cases where, for version control justifications, the programmer wishes to be more explicit.

Explicitness also provides one way to work around the problem with two typeclasses containing a function with the same name. This could of course also be handled in other ways such as allowing a renaming syntax in a where clause and/or a name scoping prefix (e.g. MyTypeClass::function(…)).
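For example (names hypothetical), explicit dictionaries disambiguate two interfaces that both declare open:

interface FileOps<A> { open(x: A): string; }
interface DoorOps<A> { open(x: A): string; }

// Each call names its dictionary, so there is no ambiguity even though
// both interfaces declare a method with the same name.
function demo<A>(file: FileOps<A>, door: DoorOps<A>, x: A): string {
  return file.open(x) + door.open(x);
}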


P.S. regarding trust in running external code: