Fast track transpiling to TypeScript?

keean / zenscript

A trait based language that compiles to JavaScript

MIT License

42 stars 7 forks source link

Fast track transpiling to TypeScript? #13

Open shelby3 opened 8 years ago

shelby3 commented 8 years ago

Building off the reasoning and justification in my self-rejected issue Fast track transpiling to PureScript?, I have a new idea of how we might be able to attain some of the main features proposed for ZenScript, without builiding a type checker by transpiling to TypeScript.

If we can for the time being while using this hack, presume that no modules will employ differing implementations of a specific data type for any specific typeclass, i.e. that all implementations of each data type are the same globally for each typeclass implemented (which we can check at run-time and throw an exception otherwise), then the module can at load/import insure that all implementations it employs are set on the prototype chain of all the respective classes' construction functions. In other words, my original point was that JavaScript has global interface injection (a form of monkey patching) via the prototype chain of the construction function, and @svieira pointed out the potential for global naming (implementation) conflicts.

So the rest of the hack I have in mind, is that in the emitted TypeScript we declare typeclasses as interfaces and in each module we declare the implemented data types as classes with all the implemented interfaces in the hierarchy. So these classes then have the proper type where ever they are stated nominally in the module. We compile the modules separately in TypeScript, thus each module can have differing declarations of the same class (because there is no type checking linker), so that every module will type check independently and the global prototype chain is assured to contain the interfaces that the TypeScript type system checks.

So each function argument that has a typeclass bound in our syntax, will have the corresponding interface type in the emitted TypeScript code. Ditto typeclass objects will simply be an interface type.

This appears to be a clever way of hacking through the type system to get the type checking we want along with the ability to have modules add implementations to existing data types with our typeclass syntax. And this hack requires no type checking in ZenScript. We need only a simple transformation for emitting from the AST.

As for the first-class inferred structural unions, TypeScript already has them, so no type checking we need to do.

It can't support my complex solution to the Expression Problem, but that is okay for starting point hack.

I think this is way we can have a working language in a matter of weeks, if we can agree?

That will give us valuable experimentation feedback, while we can work on our own type checker and fully functional compiler.

TypeScript's bivariance unsoundness should be avoided, since ZenScript semantics is to not allow implicit subsumption of typeclass bounds, but this won't be checked so it is possible that bivariance unsoundness could creep in if we allow typeclasses to extend other typeclasses. Seems those bivariance cases don't impact my hack negatively though:

When comparing the types of function parameters, assignment succeeds if either the source parameter is assignable to the target parameter, or vice versa.

Not a problem because of course an interface argument type can never to be assigned to any subclass.

When comparing functions for compatibility, optional and required parameters are interchangeable. Extra optional parameters of the source type are not an error, and optional parameters of the target type without corresponding parameters in the target type are not an error.

When a function has a rest parameter, it is treated as if it were an infinite series of optional parameters.

We should be able to design around this.

Also a compile flag can turn off TypeScript's unsound treatment of the any type.

TypeScript is structural but there is a hack to force it to emulate nominal typing. We could consider instead transpiling to N4JS which has nominal typing and soundness, but it is not as mature in other areas.

shelby3 commented 7 years ago

@keean wrote:

A tag specifically has to be a value to keep everything in the 'simple types' corner of the lambda cube.

Once again prioritizing global inference (which requires wrapping and boilerplate), which is what all the research you cite seems to be hold as a high priority and @skaller and I do not.

(Edit: for readers let me explain that by not allowing the wrapping constructor types (e.g. Left and Right) to be used as first-class types, i.e. restricting them to being values of the single type (e.g. Either), this keeps the design in the origin on the Lambda Cube which is apparently necessary in order to attain principal typings global inference)

I don't understand how you can have any discussions with other computer scientists if you don't keep to agreed definitions.

I suppose you will accuse @skaller of not being a computer scientist as well, even though he is your elder and was on the C++ design committee.

You do not seem to grasp that some schools of thought may prioritize some facets, but that does not make the other schools of thought wrong. Novelty and diversity are a positive thing.

So note in the formal definition e, x and y are values, and there is no possibility they can be types.

Again using types as the tag does not create an untagged union. You are simply citing custom and not logic. Appeals to custom (as an authority) are for barbaric tribes, not logicians.

And I even showed how to use the types to wrap and create a custom first-class name, although I pointed out that is not desired because we are favoring typeclasses.

You seem to have the quality of the German hacker archetype as my many experiences with you are you do not handle novel and out-of-the-box thinking well and always derail the entire discussion when I introduce strange new ideas which differ from your rigid customs:

Eric S. Raymond wrote:

Good at planned architecture too, but doesn’t deal with novelty well and is easily disoriented by rapidly changing requirements. Rude when cornered.

Although you possess a huge amount of expertise, it is very difficult to justify the interaction due to the stress that it creates and massive time loss on dealing with your pedantic nature and inability to engage with positive vigor on novel ideas and support the goals.

I am just telling you publicly why I am probably going to go it alone now. I like your expertise, but with recovering from my horrific 6 year battle with disseminated Tuberculosis and all that, I really do not need this stress.

But I do not harbor any bad feelings nor judgments. I wish you always the best. And I will recognize all your important contributions in our past interactions.

shelby3 commented 7 years ago

So now you’ve finally told me why your cohorts choose to force wrapping. It is once again that global inference prioritization. That seems to infect everything. We already had this discussion about FCP. And when I mentioned FCP at the start of this discussion, you tried to slander me about that too. And yet I was 100% spot on in relating to FCP because all FCP does is force boilerplate in order to get global inference.

Edit: it also applies to local inference too because when the values only populate a single type, then we can infer the type from the assignment of a value. But we are creating contrived wrapping names in order to be able to not write the type signature in some cases.

let x = Some(value)

Writing the type signature is documentation and improves readability.

let x: Maybe = Some(value)

With my idea:

let x: Maybe = value

And is necessary any way in some cases with your way or my way:

let x: Maybe[String] = Nothing

keean commented 7 years ago

I am showing you the definition of a sum type, how can you argue with the definition? What you are taking about are type-indexed-coproducts not sum types.

Now by arguing about the definition, which is well established, you are missing the opportunity to discuss the usefulness of type-indexed-coproducts which we wrote about in the HList paper.

shelby3 commented 7 years ago

I am showing you the definition of a sum type, how can you argue with the definition?

Language is a living phenomenon that changes to match logical usage by those who use it.
Your definition was some pigeon-holed thing by some academics who prioritized global inference, but does not encompass the generality of a co-product nor the terms tagged and union.
My proposal meets the definition of a co-product and the logic of the terms tagged and union.
I pondering if you were spanked too much in a very strict school in your youth. And now you obey very well. And now you want to spank those who do not do what they are told, even if it what I am doing is entirely logical.

And more importantly, there was very simple way you could have phrased your objection which would have not derailed into ad hominem and spanking.

Your definition of a 'sum' type is wrong.

The only definition I provide was “disjoint tagged union”. So that is already a false accusation because that is the definition of a sum type.

Also even if your intent was to say that my proposal was incongruent with what you think the established meaning of a sum type is, you could have simply been more amicable and open-minded by saying:

“The established use that I am aware of prefers that the options of a sum type are values. You are proposing that the options be types. This does not agree with the established tradition.”

To which I could have responded, “Oic, yes I remember that from our Subtyping thread discussion, but I am proposing another way to achieve a disjointed tagged union aka co-product type.”.

That changes it from a confrontation and spanking contest into a civil and even friendly dialogue. That is if your goal is for us to achieve something in teamwork.

The damn goal here is not to waste time arguing about minutia. We are here to accomplish a goal and stay focused on the technical ramifications of designs.

But if you’re goal is to lecture others and demean them, then you should just continue as you did.

This is not the first time. So I know already this pattern. I tried my best recently to avoid any confrontational miscues with my language.

keean commented 7 years ago

I think the distinction between sum types where the index is a tag (value), and TICs where the index is a type is a useful distinction. When I talk about sum types people know I am talking about the sum type as they exist in current languages like Haskell and ML. When I talk about TICs people know they are something different from what they are familiar with, and will go and look for the difference and find that the index is a type not a tag (value).

shelby3 commented 7 years ago

When I talk about sum types people know I am talking about the sum type as they exist in current languages like Haskell and ML.

As if your tiny world is the world.

If you think I would know what a damn TIC is, then your are not very realistic. I think you are just trying to turn this discussion into a clusterfuck. If you asked 1000 random Java programmers what a TIC is, maybe 1 or 2 would know what it is.

You are just side-tracking the discussion in ways that accomplish nothing but arguments.

I am trying to rush to accomplish a goal. I don't give a shit about all this noise. All I care about is understanding the ramifications of the designs. So that I can reach my goal.

keean commented 7 years ago

The only definition I provide was “disjoint tagged union”. So that is already a false accusation.

It is not because your definition of a disjoint tagged union is not congruent. There is an implicit assumption that a tag is a value that it seems you were not previously aware of.

“The established use that I am aware of prefers that the options of a sum type are values. You are proposing that the options be types. This does not agree with the established tradition.”

It is not a preference, it is a definition. Just like it is not a preference that '1 + 1 = 2'.

But if you’re goal is to lecture others and demean them, then you should just continue as you did.

I have no intent to demean you, and I think TICs can be very useful in a language, however like I said, the definition of a sum type is not a preference.

shelby3 commented 7 years ago

It is not a preference, it is a definition. Just like it is not a preference that '1 + 1 = 2'.

It is an adopted preference so as to prioritize global inference. Stop lying.

A sum type is generally a co-product, but apparently the tradition amongst your cohorts is to narrow it to constructions which enable global inference. Sorry but you are the one who is wrong per se (or at the minimum you do not have enough high ground to stand on to be insisting that I was far off in left field).

Types could also be tags and form a disjoint tagged union. That is congruent logic. It just will not achieve global inference. And note I proposed that my disjointed tagged union type is not a first-class nominal type (i.e. the use of type instead of data), so as to avoid recursion which I contemplated could possibly make unification diverge.

Just as our math axioms are preferences for certain attributes. And there are alternative math systems.

And here you are wasting this entire thread on arguing. You will never stop. I know you very well already. You are very obstinate. You will burn it all down and never realize how you could adjust to make things go smoothly.

It is not because your definition of a disjoint tagged union is not congruent. There is an implicit assumption that a tag is a value that it seems you were not previously aware of.

I am aware they are values from the discussion we had in the Subtyping thread last year.

I offered a new proposal which is congruent with the logic of a disjointed tagged thing which expresses a union and meets the requirements for a co-product type in the language. I even stated that I was not wrapping them as values and instead allowing them as first-class types and I asked if we could discuss the ramifications.

All you want to do is derail the discussion so you can prove something about what I know or do not know.

I was not prioritizing global inference as you know from what I have told you in many threads.

I think the distinction between sum types where the index is a tag (value), and TICs where the index is a type is a useful distinction.

And we could have been discussing that. But instead you wanted to argue. Because you are so damn certain that your definitions are absolute. But I just blew an agape hole in your arrogance.

keean commented 7 years ago

But I just blew an agape hole in your arrogance.

Really? All you seem to be doing is compounding your error to me. It's interesting how you changed your attack from claiming I had been beaten as a child to that I am arrogant after I messaged you about not being intimidated.

Still it could be I am wrong. I suggest you go and talk to some other people about sum types and see what they think. I will do the same, and then we can reconvene and agree to accept the common definition of 'sum type'.

shelby3 commented 7 years ago

after I messaged you about not being intimidated.

Did you? I guess in LinkedIn which I have not opened.

claiming I had been beaten as a child

I did not claim that. I said I am pondering that as a possible explanation for your behavior wherein you desire to clusterfuck things with spankings. I can not know why you do what you do. It is quite perplexing. I originally was under the assumption that you were an A-lister and would prioritize production instead of spanking.

Should I ask this as a question on LtU and see if there is any kind of consensus?

As if appeal to authority has any relevance whatsoever to the issue.

I do not care about what they think. All I care about is if the logic is correct. Fact is I have presented a design for disjointed tagged thing which expresses a union. It fulfills a co-product type. It will not remain at the origin of the Lambda Cube. Thus it will not enable principal typings inference.

The fact that you do not acknowledge that and move on, is indicative of B-lister clusterfucking.

Take care @keean. I am out of here. Wish me luck trying to do this without peer review (will be difficult as everyone makes mistakes). We simply can not work together. It is quite clear. We lose more time miscuing than we can justify from the insights we share. Well in the early stages, the insights were too numerous and valuable for me to walk away, but now reaching lower ROI and the Meta noise is just too much.

keean commented 7 years ago

I have looked up the type-theory, it seems I had forgotten that the type-dependent co-product is called a 'Simga Type'. So there is a short catchy name for you. You can look up the exact difference between a sum-type and a sigma-type:

https://en.wikipedia.org/wiki/Dependent_type

The opposite of the dependent product type is the dependent pair type, dependent sum type or sigma-type. It is analogous to the coproduct or disjoint union. Sigma-types can also be understood as existential quantifiers. Notationally, it is written as...

So you can call it a "dependent sum" type or a "Sigma" type.

Edit: actually that is wrong (see I can admit I am wrong) - the dependent sum is where the type of the term depends on a value, which is something that is not allowed without dependent types. What you are talking about is somewhere between a sum type, where the type does not change, and a dependent sum where it can depend on the value.

shelby3 commented 7 years ago

Okay @keean thank you for the exact terminology from the literature.

As that document says, they are all analogous. You could be more supportive to people instead of cutting them down with your words. You can simply say to someone that there might be a more precise term that is a better match, without insinuating that they are wrong. There are people who read who do not understand when you say “wrong”, what is the scope of your allegation. They may think “oh Shelby does not know WTF he is doing”, which is very far from the truth (well depends on what the task is … as obviously if the task is writing a research paper on type theory then I do not know WTF I am doing … but if the task is designing a simplified programming language, I have some knowledge).

Avoiding the negative words is very important for human interaction. There is usually a way to make a point without putting someone in an unnecessary negative light.

keean commented 7 years ago

I am happy to go on record as saying "oh Shelby does not know WTF he is doing” is wrong.

Yes they are all analogous, but exist at different places on the lambda cube. The best I can come up with is "Sum type with type level tags" but that's not very catchy...

shelby3 commented 7 years ago

Okay so your priority was to classify my idea in terms of pre-existing type theory, so you could reference the literature on it. Saying that and then stating that you want to get more to the precise term, would be a positive and constructive dialog. There is no way I would have reacted badly to such.

It is the flippant negativity that is so depressing.

keean commented 7 years ago

Here's an interesting paper on type-indexed types, and their uses in generic programming (something I am very keen on):

https://www.andres-loeh.de/tidata.pdf http://www.cs.ox.ac.uk/jeremy.gibbons/publications/typecase.pdf

shelby3 commented 7 years ago

Sitting here thinking a moment, maybe what I proposed is not disjoint. It depends what it is supposed to be disjoint. Since I proposed allowing types to be reused in unions (~and intersection~), then the unions are not disjoint from each other. I was thinking the disjointedness that matters is that the options of the union are all disjoint from each other within that specific union for which my proposal is congruent (well unless we allow intersection types to be first-class unwrapped as options in unions, but I was not proposing that although perhaps TypeScript supports it). Whereas, the values of a sum type can not be reused in another sum type (because they are second-class values and not first-class types), so even the sum types are disjoint from each other. One place that matters is for local inference. However it seems to be good in that it allows us to reuse typeclass implementations more.

So that would be an incongruence from the term disjointed tagged thing which expresses a union, but only if the disjointedness that matters is between the union types (not the options of the union type).

What you are talking about is somewhere between a sum type, where the type does not change, and a dependent sum where it can depend on the value.

Okay so making the types on tagged unions first-class has implications and it is not entirely disjointed. Now I get another hint as to why first-class intersection types can do strange things to a type system.

Edit: actually that is wrong (see I can admit I am wrong)

The literature apparently does not have a well known term for what I am describing so sum type is the closest well known term that is analogous. It behooves me to employ a well known term, so readers might have some clue what I am talking about. I would be happy to admit I was wrong, if I had been claiming that my proposal was the exact same as the sum types in Haskell, but I specifically said my proposal was different. Since I did not have more precise term to describe my proposal, my proposal falls into the general analog of a sum type as being co-products. I used the term that approximates what my proposal is for, i.e. for a co-product type.

The problem here is not admitting when we are wrong, but focusing too much on determining who is wrong instead of focusing our production on determining what is right, i.e. a positive use of our time instead of a negative one. Blame and such is really only needed for significant cases of damage, not for where some well intended guys are trying to relate their thoughts and produce something.

What I meant about the spanking is I had read or heard long time ago (maybe it was 100 years ago) that private, upscale British schools had a very strict discipline so I was thinking maybe some teachers had berated you for being wrong. Because I like to focus on building things, not tearing things down.

You are often quick to quip "that is wrong" as if I hear some teacher grading you. I see a lot of programmers do that these days, sort of like some ego contest. It is destroying production. I preferred the old times when we will helped each other and it was fun and supportive. At least that was what I felt when I was into computers in the 1980s or 1990s. Well maybe because I was working alone mostly. When I first landed on Lambda-the-ultimate, all they did was scold me.

Ditto nearly every forum I have participated in on the Internet. I must be a complete idiot right.

shelby3 commented 7 years ago

http://www.cs.ox.ac.uk/jeremy.gibbons/publications/typecase.pdf

That is probably analogous to my solution to the Expression Problem (extensible functions late binded to a type-indexed union) which I explained for the first time on the Rust thread and then clarified more in your Issues threads. So thus my idea had prior art.

keean commented 7 years ago

I was thinking the disjointedness is that the options of the union are all disjoint from each other.

I think the idea is in a disjoint union, all the tags must be different, therefore there can be no overlap between the options. It is the disjointedness that prevents flattening.

keean commented 7 years ago

That is my solution to the Expression Problem which I explained for the first time on the Rust thread and then clarified more in your Issues threads. So thus my idea had prior art.

Indeed, I met Jeremy at a workshop on "Datatype Generic Programming" at Oxford University back around 2004 around when the HList paper was published. The HList paper was all about type-indexed-types, and I could see how this enabled some really elegant programming, however trying to implement this in Haskell was fighting an uphill battle, because it was not the focus of the language nor its design committee.

shelby3 commented 7 years ago

@keean I want to work with you. But I am afraid we are going to repeat the same problem in our language to each other. We always do. It is difficult for you or I to change our personality or mannerisms. Maybe I could try a new tactic? When ever you write something that offends me, I could quote it and write "could you please rephrase that for me?". But if you reply with some obstinate insistence that "wrong is wrong" then we would not have made any progress on resolving it. If I instead try to just ignore it, it does not work because it accumulates. I really think a person's personality is take it or leave it. You are very knowledgeable. It is a shame if our personalities clash.

I do not have enough precision in my work on this to possibly satisfy your precision. I gain precision over time. When you talk with those other experts, they have specialized in this field with PhDs, and they speak your language with high precision. We are perhaps mismatched.

shelby3 commented 7 years ago

I could see how this enabled some really elegant programming, however trying to implement this in Haskell was fighting an uphill battle, because it was not the focus of the language nor its design committee.

If I do complete a simplified language, it will not hit every aspect of what you were thinking of doing, at least not at the outset. But overtime it is likely to trend towards something closer to what you want. In any case, it will at least it will have some things you want (e.g. typeclasses) so it would be a first step towards experimentation and see what works well and does not work well.

As you know my priority right now is on compatibility with transpiling to TypeScript and simplifying as much as possible.

I want to start the LL(k) grammar tomorrow. If I can't make this happen quickly, then I really need to abandon and put it on the back burner. So my available time for talking about design decisions has expired. If there are any more points, they need to be made immediately.

shelby3 commented 7 years ago

It is the disjointedness that prevents flattening.

The nesting of Either in Left and Right can not be flattened to the union of the wrapped types because of the wrapping.

I am do not yet clearly see how to reconcile that with your statement, but we may just have different conceptualization of terminology, so I am not sure if what I am thinking about is the same as what you are thinking about. Afaics, the disjointedness seems unrelated to flattening, i.e. (A & B) | (B & C) does not prevent any logical flattening. Whereas Either[Either[A,B], Either[B,C]] can not be flattened (from a 2 options with 2 options for each parent option) to the 3 options choice A | B | C.

@shelby3 wrote:

Now I get another hint as to why first-class intersection types can do strange things to a type system.

I do not see how to implement typeclasses for intersection types without having to implement every permutation manually (which is ridiculous and unacceptable). Thus I think I am pretty much decided to not allow intersection types (the programmer can use a nominal product type instead). For unions we can dispatch on the tagged option dynamically, i.e. all possibilities are covered by implementation of typeclasses for each possible option. Optionally the programmer can choose to implement a set of options, e.g. for Branch | Leaf, and the compiler will of course choose the best fit implementation.

For Sum types with options as values, those values can not be implemented for typeclasses individually because they are not types. So instead we implement the entire Sum type for the typeclass, then we must always use type-case logic, versus the aforementioned optional design pattern for tags-as-types. Thus the options as types (tags-as-types aka type-indexed-types) seems to interopt better with typeclasses.

The type-indexed-types seem to have more flexibility than the tags-as-values (aka Sum types) form of co-products. We lose the origin on the Lambda Cube but we gain the ability to not double-tag JavaScript types and the other advantages enumerated.

I am still wondering if there are any tradeoffs other than the loss of global and local inferencing already mentioned.

keean commented 7 years ago

Have we come to agreement on nomenclature? It seems you are looking at 'tagged unions' as opposed to 'disjoint tagged unions' (which are also known as sum types). This suggests 'non-disjoint sums' as a name that is consistent with the body of computer science publishing.

Regarding precision and working, I think it is important to focus on what is wrong. Your whole idea of 'non-disjoint sums' is not wrong, it was the naming. I suggest you focus on what specific part of what you said is wrong, Likewise I will try and be more precise about what is wrong. However I only have a small brain, and if I am half focused on not saying "you're wrong", I won't have enough left to solve the problems, but I will do my best to avoid it.

I would appreciate if you could cut down on the ad-hominem attacks in return, which don't help further the discussion, and probably undermine your credibility with other readers.

keean commented 7 years ago

For Sum types with options as values, those values can not be implemented for typeclasses individually because they are not types. So instead we implement the entire Sum type for the typeclass, then we must use type-case logic. Thus the options as types seems to interopt better with typeclasses.

We don't need to use type-classes because the 'case' statement, and normal pattern matching work.

In the cases were we do, Haskell cannot do this, but Datatype generic programming allows type-classes to be declared using the structure of the types.

Using type-indexed-coproducts might be a better solution if it was built into the language, this is what we were investigating in the HList paper. What I don't yet know is if this mechanism is sufficient to write elegant boiler-plate free programs on its own, or if the traditional sum types are needed as well. Really the only way to determine this is to create an experimental language with just the experimental feature (as design is more about what you leave out) and see what it is like to program with without the other features.

keean commented 7 years ago

I do not see how to implement typeclasses for intersection types without having to implement every permutation manually (which is ridiculous and unacceptable). For unions we can dispatch on the tagged option dynamically, i.e. all possibilities are covered by implementation of typeclasses for each possible option.

An intersection type represents a function is a combination of other functions like Int -> Int /\ Float -> Float as such it is an alternative representation of a type class, or a module, but as a first-class value. So given:

f(x) => (x(3), x('a'))

we can easily infer the type: (Int -> A /\ String -> B) -> (A \/ B)

What if we passed the function as a dictionary (record):

f(dict) => (dict.x(3), dict.x('a'))

Now we can make dict an implicit argument:

f(implicit dict) => (dict.x(3), dict.x('a'))

So a type-class is kind of an implicit module, and a module is a nominally typed intersection type.

What does it mean to have a type-class of type classes? It seems like nonsense, but in that case if our type system allows type-classes of types, and permits intersection types, does that mean it admits nonsense?

One argument to avoid intersection types is that if a 'typeclass' is something other than a 'type' then you cannot create nonsense, because typeclasses only range over types, and not over themselves.

This seems vaguely similar to set theory, and the Russel paradox. If we use intersection types, we are collapsing everything into one level, like set theory, and that probably means there are problems (infact we know it is incomplete and some unifications never terminate). Type classes have a stratification that prevents this, and also limits us to one meta-level (that is programs that create programs, not an infinite stack of programs creating programs).

keean commented 7 years ago

I think to work out what we want regarding intersection type, you have to consider the bottom level, what is the data and what is the memory layout.

Simple values like 'Int' and 'Float' are easy. Structs make sense.

Union types do not make sense, because we cannot interpret a word from memory if we do not know whether it is a Float or an Int. We need some 'program' to interpret the data based on some 'tag' that is stored elsewhere in memory. In other words 'unions' are not a primitive type to the CPU.

Intersection types do not make sense either. You do not have machine code functions that can cope with Ints or Floats. You can have generic machine code, for example 'memcpy' can copy an object of any type as long as we know its length. In other words pointers are better modelled as universally quantified types than intersection types.

Structs (objects with properties) each of which can be typed make sense.

Type-classes make sense, where we know the type of something we can select the correct method to use.

All this assumes an 'unboxed' world like that of the CPU.

Another way to think about this is that the computer memory is filled with bits like

11001010101011001100101101000011

To interpret these bits we need to know how they are encoded. Static types represent the implicit knowledge of how they are encoded based on their location (the static refers to lack of motion, hence fixed location).

To decode dynamic types, we need to know how to interpret the encoding of the type, so we run into the question, what is the type of a type. To avoid the Russel paradox we cannot answer "the type of a type is type", so we need something else. My answer to this is that 'tags' encode runtime type information, and the 'sum' datatype is the type of the tag + its data.

keean commented 7 years ago

To make union types make sense we would need a standard encoding for all types. This means we must have at least partial structural typing.

The primitive types we need to start with would be:

unsigned int 8, 16, 32, 64
signed int 8, 16, 32, 64
float 32, 64

Maybe some bigger ones for future-proofing, and then some vector types for SIMD:

unsigned 2x64, 4x32, 8x16, 16x8 (possibly others)
signed 2x64, 4x32, 8x16, 16x8 (possibly others)
float 2x64, 4x32

That would do for basic types. Unlike static typing where we can represent these in the compiler, we need these to actually be written to memory for use are runtime.

We then need an encoding for objects (which are tagged product types) and unions.

For your solution to work you will need to work all these codings out and include them in the language runtime.

My solution above, only allows static types based on location, and the implementation of unions would be layered on top of this, implemented as DSL library in the new language.

shelby3 commented 7 years ago

@keean wrote:

This suggests 'non-disjoint sums' as a name that is consistent with the body of computer science publishing.

Well that suggests my original understanding that in math a ‘sum’ is just a co-product (as I cited Filinski in 1989 wrote about sums versus products in the field of computer science). That is why I just assumed that the popular term Sum type, applied as a general name for any co-product type. And I was thinking that Haskell’s algebraic types are disjoint tagged unions as one possible variant of a sum type. The mainstream sources do not seem to clarify this unambiguously. You cited some literature which was also somewhat vague on the historic delineation of the use of the terminology.

That is why I have stated to you in private that I think “this is wrong” is not very productive. It is more productive to try to go develop a very well thought out statement which recognizes what your colleague has correct and builds upon it, in way that is not just some quip but rather imparts information for your colleague and readers. In that way, your colleague will not view it as unproductive ego contests.

I would like to cite my recent post as an example of such a statement which does not bother with bluntly saying “you are wrong”:

@shelby3 wrote:

It is the disjointedness that prevents flattening.

The nesting of Either in Left and Right can not be flattened to the union of the wrapped types because of the wrapping.

I am do not yet clearly see how to reconcile that with your statement, but we may just have different conceptualization of terminology, so I am not sure if what I am thinking about is the same as what you are thinking about. Afaics, the disjointedness seems unrelated to flattening, i.e. (A & B) | (B & C) does not prevent any logical flattening. Whereas Either[Either[A,B], Either[B,C]] can not be flattened (from a 2 options with 2 options for each parent option) to the 3 options choice A | B | C.

Regarding precision and working, I think it is important to focus on what is wrong.

Disagree. As I have stated in private, I think it is important to focus on stating what is correct. It is more than just a subtle difference. I am not saying to ignore correcting what is wrong. I am saying that it is very lazy to quip “that is wrong”. To well articulate what is correct and also recognize what your colleague has stated correctly, requires production and effort. I prefer to see production and than cutting each other down. I get depressed when I see negative activity and I end up lashing out in return. Because I sort of adjust to the way the people are who are around me. If people want to be negative, then I after giving them every chance, and they insist then I let them have a boatload of negativity. But I have decided that in the future, I will just put such people on Ignore. I realize that is a flaw in my personality to argue with people who are negative.

I am a Cancer zodiac sign. This means the mood and ambiance of the workplace is very important for me to be productive. I like uplifting, positive people (but not in lack of substance way, i.e. I am not an air zodiac sign). I am an earth sign, meaning I need warmth of relations. I do not co-exist well with cold people. People do have different personalities and we just have to accept that not all personalities can mesh.

but I will do my best to avoid it.

I also have committed negative communication at times. Most recently I have been trying to apply the effort to be more careful about what I write in public. I will quote what I wrote to you in private:

@shelby3 wrote in private:

IMO, whereas if we explain something very well and very fairly explicitly recognizing the areas where the other is correct, and that other individual insists with their wrong claims and refusing to recognize or clearly refute with a better explanation, then that is the time we can put that troll on Ignore.

@keean wrote:

I would appreciate if you could cut down on the ad-hominem attacks in return, which don't help further the discussion, and probably undermine your credibility with other readers.

Yeah I agree not to let you or anyone else bait me into calling out and then lashing out when my pleas fail. As I explain below, I will learn to just walk away from situations that do not mesh.

I do not know if I can ignore if you continue to think that blunting pointing our wrongs without well developing statements of what is correct with fairness, and thus we would slide back into the same flame wars.

As I wrote above as quoted from private, I decided recently that I just have to learn to Ignore such people entirely, meaning ending all communication with them. It does not mean that they are incapable of having a conversation with others. Different people have different levels of tolerance for communication styles.

Having said that, if I felt someone had a treasure map (i.e. offer me extremely valuable information), I would probably bite my tongue and be exceedingly nice while they said anything they want to say about me or my ideas.

tl;dr: I do not think we solved the problem. So I will drastically reduce my interaction and try to be very judicious about the topics I discuss going forward with you in public (private communication is okay you can say whatever you want there, lol). But I want to make it clear that in no way is this a statement of judgement about you. I am not accusing you of being wrong for your style of communication. Diversity of personalities makes the world a more fertile soil.

keean commented 7 years ago

I do not know if I can ignore if you continue to think that blunting pointing our wrongs without well developing statements of what is correct with fairness, and thus we would slide back into the same flame wars.

If I say an idea you post is wrong, there is no other useful response apart from to try and convince me you are right. I have not insulted you, am am merely stating an opinion about the correctness of an idea. You have no right to object to my opinion about your idea, or to be 'offended' by it. It is after all just my opinion. In fact I should be flattered you are offended by it, because that means you hold me in such high esteem my opinion seems like a fact to you.

If you then escalate into ad-hominem attacks it looks like you have no convincing arguments that the idea is infact correct, and are trying to derail the discussion.

shelby3 commented 7 years ago

@keean your reply is seems to indicate to me (IMO) why the incongruence in our personalities and communication styles will never be solved. I do not think you understand (or you disagree with) my point about how to be positive versus negative. But that is okay. We can move on. I will be very, very judicious about the topics I participate in from here forward.

Thank you for taking the time to respond on all issues including the meta ones.

shelby3 commented 7 years ago

Really the only way to determine this is to create an experimental language with just the experimental feature (as design is more about what you leave out) and see what it is like to program with without the other features.

Good so if I proceed with that feature, then it will help you also by being the test bed. That is encouraging and inspiring because I know if you were inspired to write libraries, you have a lot of knowledge about Alexander Stepanov’s Elements of Programming models.

shelby3 commented 7 years ago

My solution above, only allows static types based on location, and the implementation of unions would be layered on top of this, implemented as DSL library in the new language.

I visualize some potential tradeoffs of your suggested design choice:

The DSL will add boilerplate. Perhaps you can hide most of it in libraries, I dunno.
Unless your language exposes low-level details, your DSL will not be able to make some optimizations such as employing the free space of tagged pointers for the tags. Analogously, for targeting JavaScript, unless your language expose instanceof and typeof, then your DSL can’t avail of the built-in tags.
To make certain optimizations, your DSL may be output target specific thus not portable. This indicates that the DSL logic might be in the wrong abstraction layer.

There might be some advantages as well, i.e. the analysis above may not be complete.

For your solution to work you will need to work all these codings out and include them in the language runtime.

For targeting JavaScript, the reified type tags are always there any way with instanceof and typeof, unless one is creating anonymous (i.e. non-nominal, structurally typed) objects.

Or even as I had proposed before upthread for JavaScript, since typeclasses should only be implemented one way for each type, these typeclass methods could be placed on the Object.prototype. When modules load, they have to add their typeclass implementations on the relevant prototypes.

keean commented 7 years ago

The DSL will add boilerplate. Perhaps you can hide most of it in libraries, I dunno.

If it does add excessive boilerplate, I will have failed. The compiler should do the work, although there might be a keyword involved to control when this happens.

The way I see this happening is that accessing the contents of a container should be a type-class based operation, that allows user defined containers to override the normal behaviour, so a user defined container is just as short to use as a built in type like an array.

Unless your language exposes low-level details, your DSL will not be able to make some optimizations such as employing the free space of tagged pointers for the tags. Analogously, for targeting JavaScript, unless your language expose instanceof and typeof, then your DSL can’t avail of the built-in tags.

Again I see type-classes facilitating this. Dereferencing would be overridable, so you can write a container where the type is encoded in the pointer if that is what you really want.

To make certain optimizations, your DSL may be output target specific thus not portable. This indicates that the DSL logic might be in the wrong abstraction layer.

If you put the encoding inside the compiler, it will require patching the compiler itself to support a new target. This seems worse than just having to edit a library.

It does make me wonder if WebAssembly (asm.js) would be a better target, because it would make it more like native targets and less dependent on the strangeness of JavaScript.

Maybe you are right and it would be better to include native boxing in the language. I want to keep the core language as small as possible, hence why I was thinking this should be in a library, but maybe its important enough to be in the core.

shelby3 commented 7 years ago

@keean wrote:

Dereferencing would be overridable, so you can write a container where the type is encoded in the pointer if that is what you really want.

But as I wrote before, that forces you to expose low-level details in the language, e.g. pointers. I do not want any pointers in my language. K.I.S.S. principle and also encapsulation of details which for example are sandbox security holes and many other reasons.

If you put the encoding inside the compiler, it will require patching the compiler itself to support a new target. This seems worse than just having to edit a library.

The compiler could be modular so that post processing of the AST to different targets can be swapped. Essentially it is akin to that the compiler can have libraries. If you put the libraries above the language instead of below, it seems to me to be in the incorrect abstraction layer.

It does make me wonder if WebAssembly (asm.js) would be a better target, because it would make it more like native targets and less dependent on the strangeness of JavaScript.

I presume you know that WASM is not the same as ASM.js. Afaik, WASM does not run everywhere JavaScript does yet, and it is not a mature technology yet. ASM.js is low-level and afaik a nightmare to debug in the browser, although perhaps source maps could help with that somewhat, it still would not be as high-level intuitive as debugging in JavaScript. Also we need JavaScript’s GC.

Every output target has some strangeness.

Maybe you are right and it would be better to include native boxing in the language. I want to keep the core language as small as possible, hence why I was thinking this should be in a library, but maybe its important enough to be in the core.

I agree of course that separation-of-concerns and modularity are excellent design concepts. Yet exposing the low-level details necessary to optimize boxing in a library above the language seems to be conflating abstraction layers and thus not achieving optimal abstraction. I could also for example envision issues about compiler selected optimized binary unboxed data structures versus boxed members of data structures. I am planning to have a language feature to map between them, because otherwise we need to expose low-level JavaScript details such as ArrayBuffer and TypeArray above our language.

Just because we can put/hide details in a library above the language, does not prevent that the low-level details exposed above the language from allowing complexity to seep into userland code. The users will take advantage of any primitives exposed in the language, as flies are to honey. So attaining simplicity is not just about what is left out of the compiler, but what is also left out of the language. Libraries (aka modularity) can be above or below the language. It is all about the abstraction layer.

keean commented 7 years ago

The other point of view is that you don't want to lock the language into some type encoding that will prevent future extensibility.

Using bits in pointers to encode things limits porting to platforms that have different alignment requirements, and is probably a bad idea, and it is not enough space to encode all the types, so there would be some types that just don't benefit. Remember structural types are of unlimited length (as they have to encode the structure) whereas nominal types can be encoded as an integer.

So much approach would probably not use fancy pointer encodings, and from my experience with optimization, I don't think it will cost much performance, as CPUs are optimised for integer word performance. It's going to mess up pre-fetch and caching too.

My approach would be to use static typing wherever possible, so that the use of dynamic types is restricted to where it is really needed.

shelby3 commented 7 years ago

The other point of view is that you don't want to lock the language into some type encoding that will prevent future extensibility.

Who proposed that? I certainly did not. I proposed there is no specification of these low-level details “above the language” and thus the language compiler is free to optimize for each target.

My point is do not expose complex, low-level details above the language, so the compiler can optimize for each output target and so the programmers are not given access to complexity that can make the code like Scala or C++ with its 1500 page manual and abundance of corner cases.

Using bits in pointers to encode things limits porting

That criticism thus does not apply.

So much approach would probably not use fancy pointer encodings, and from my experience with optimization, I don't think it will cost much performance, as CPUs are optimised for integer word performance. It's going to mess up pre-fetch and caching too

Compiler will be free to optimize whatever is tested to be most optimum. Btw, I am not 100% sure that unused bits of 64-bit pointers affect pre-fetch and caching.

My approach would be to use static typing wherever possible, so that the use of dynamic types is restricted to where it is really needed.

I do not know what caused you to mention this (?), as it seems so broad and not specifically related to the discussion we were having.

A union or sum-like type is not statically dispatched, although its bounds are statically typed.

Of course we statically type what we can that makes sense in terms of the priorities, but when you need union then you need it.

I hope we do not go on and on, just for sake of seeing who can be the last one to reply. What is your cogent point overall?

keean commented 7 years ago

Who proposed that? I certainly did not. I proposed there is no specification of these low-level details “above the language” and thus the language compiler is free to optimize for each target.

But you would lose binary compatability, and the ability to transfer data between machines (as they may have different versions of the compiler, or have different CPU and therefore type representations differ).

keean commented 7 years ago

Of course we statically type what we can that makes sense in terms of the priorities, but when you need union then you need it

Not necessarily, it is common in languages to use dynamics types everywhere, even when not needed, for example Java does this (every method is automatically virtual).

JavaScript also does this, and then tried to optimise it all away in the JIT compiler. The problem is it is all too easy to prevent the JIT compiler from being able to optimise by using a dynamic feature (like changing the type of a property) when you do not need to, you could for example use a separate property.

The C++ "virtual" keyword may seem like boilerplate, but it serves a useful purpose, that is it makes it clear when you are paying the cost. So a non-virtual method always dispatches at the fastest speed, and cannot be slowed down or break the optimiser, but it also cannot be dynamically dispatched. By making it virtual there is still the possibility it could be optimised, but you are allowing it to use the slower mechanism because you need dynamic dispatch.

This follows the principle of only paying for what you use, and making the cost visible in the source code. I think this is an important principle that languages I like follow.

shelby3 commented 7 years ago

But you would lose binary compatability, and the ability to transfer data between machines

Such communication must be serialized to a standard protocol/format.

Not necessarily, it is common in languages to use dynamics types everywhere, even when not needed

What specifically does this have to do with the discussion we were having?

I agree that one of the design priorities can be to minimize accidental dynamic typing. Do you see a specific instance that applies to our discussion? Afaics, as I already wrote in my prior comment post, we were not discussing whether union types can be statically dispatched (because they can not in any design other than dependent typing), but rather about the other aspects of the design of a “sum-like” type.

shelby3 commented 7 years ago

@keean wrote:

I have found a simple way to write single parameter typeclasses in TypeScript, which you can do by hand, or as a transpilation target. You can extend the interface of a class like this:
// declare the datatype:
class X {
  someproperty : number
}

// declare the typeclass:
interface Showable {
  show() : string
}

// create an instance:
interface X extends Showable {}
X.prototype.show = function() {
  return someproperty.toString();
}
This works because it merges the interface declaration for X with the class declaration for X, we then need to provide the implementation on the prototype for X so it is inherited by all instances of X. It only works for single parameter typeclasses, but it sort of solved the expression problem as we can declare a new instance to add a new type to an existing typeclass, and we can create a new typeclass for an existing type. Because TypeScript is dynamically typed, we don't need to worry about heterogenious collections, but of course there is no static guarantee that all objects in a collection implement any given typeclass, so its easy to get a runtime crash when you insert an ‘un-Showable’ object into a collection that gets ‘shown’ at some point.

That is similar to the idea I had proposed originally for using the prototype chain to simulate typeclasses. You’re showing it is possible to get TypeScript’s typing system to somewhat integrate with my idea. Thanks!

shelby3 commented 6 years ago

TypeScript may be close to get HKT typing:

https://github.com/Microsoft/TypeScript/issues/1213

shelby3 commented 6 years ago

@keean replied:

I replied:
@keean replied:
I wrote:
Btw, when a zero argument method returns a value and we are not discarding the result, do we want to allow as Scala does?
f(x: ILength) => x.length
Instead of:
f(x: ILength) => x.length()
I think leaving off the () on zero argument functions is not a good idea, because those functions execute. How do you pass a zero argument function as an argument to another function if you do not indicate when it is to be executed. You effectively lose control over the order in which side-effects happen.
Based on the type it is being assigned to, although there is a corner case where the result type is the same as the type being assigned to.

I think Scala allows the length(_) in that corner case. We could require this always for passing the function instead of its result value.

On the positive side, it does help to get rid of parenthesis more times than it will cause confusion.
That's worse :-( I think function application.needs to be consistent not some backwards special case, we seem to be introducing corner cases into the grammar. Scala is not a good model to follow.

Revisiting this I realized that instead of length(_), Zer0 could instead not allow discarding the () for function application when the return (aka output) type of the function is the same type as function.

Also application of non-pure functions will not allow discarding the ().

The advantage of discarding the () for pure functions is that it reduces the symbol soup of parentheses and it’s more sensible because a pure function has no side-effects and thus is analogous to reading a value. I propose it should be required that the () be discarded for pure functions so these are accessed consistent indicating they’re values.

Consider:

if (array.isEmpty)

Instead of:

if (array.isEmpty())

if (isEmpty(array))

Even if we have block indenting:

if array.isEmpty
   block
if array.isEmpty()
   block
if isEmpty(array)
   block

The symbol soup reduction still applies when applying functions:

foo(array.isEmpty)

foo(array.isEmpty())

foo(isEmpty(array))