derive4j / hkt

Higher Kinded Type machinery for Java
BSD 3-Clause "New" or "Revised" License

HigherKind machinery à la highj with associated annotation processor #1

Closed jbgi closed 8 years ago

jbgi commented 8 years ago

This means the _ classes at the root of https://github.com/DanielGronau/highj/tree/master/src should be renamed to __ or some other name (hk?) to avoid a javac warning.

The annotation processor should check what @gneuvill described in https://github.com/functionaljava/functionaljava/issues/126#issuecomment-201042557 to ensure acceptable type-safety.

Additionally it would also check that a narrow/coerce method exists with the right type signature.

@gneuvill Would you be up to the task?

gneuvill commented 8 years ago

renamed to __ or other name (hk?)

I actually quite like __ as a name ; it reads nicely in, for example :

interface Functor<f> {
  <A, B> __<f, B> fmap(__<f, A> fa, F<A, B> f);
}

h or hk would do, but they are a bit more noisy to my eyes... For multi-parameter type constructors, I would suggest __2, __3 and so on, rather than ___, ____ which seem unwieldy. What do you think ?

Note that I am also rather attached to the convention of denoting a type constructor by a lower case letter. What's your take on this ? Do you think it's important ? Should the apt enforce it, or at least emit a warning in case it is violated ?
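
To make the Functor signature above concrete, here is a minimal sketch of one instance. It assumes a bare marker interface named __ (a stand-in for the one discussed here), uses java.util.function.Function in place of functionaljava's F, and invents a hypothetical Box type with a witness class named h; none of these names come from the project itself.

```java
import java.util.function.Function;

// Hypothetical stand-in for the hkt interface: f is the witness, T the element type.
interface __<f, T> {}

// Type class over a type constructor f (lower case, per the convention above).
interface Functor<f> {
  <A, B> __<f, B> fmap(__<f, A> fa, Function<A, B> fn);
}

// Illustrative one-slot container; Box.h is its witness type.
final class Box<A> implements __<Box.h, A> {
  static final class h {}

  final A value;

  Box(A value) { this.value = value; }

  // Safe only as long as Box is the sole implementor of __<Box.h, ?>.
  static <A> Box<A> narrow(__<Box.h, A> fa) { return (Box<A>) fa; }
}

// The Functor instance only ever mentions the witness, never Box<A> in its signature.
final class BoxFunctor implements Functor<Box.h> {
  @Override
  public <A, B> __<Box.h, B> fmap(__<Box.h, A> fa, Function<A, B> fn) {
    return new Box<>(fn.apply(Box.narrow(fa).value));
  }
}
```

The cast in narrow is the one place where the encoding leaves the type system, which is why the thread asks the processor to police who may implement __<Box.h, ?>.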

Additionally it would also check that a narrow/coerce method exists with the right type signature.

By the way, what about the generation of it ? Should it be left to derive4j ? In that case, the check would be disabled for @data annotated types ?

Would you be up to the task?

Sure ! I'll try to put something together next week. I'll be on vacation after that, so I won't be able to get back at it until middle april. Is that okay with you ?

jbgi commented 8 years ago

By the way, what about the generation of it ? Should it be left to derive4j ? In that case, the check would be disabled for @data annotated types ?

=> yes, derive4j will eventually generate it. To avoid special-casing and also avoid compile error when the narrow method is not yet written, I think the annotation processor should not mandate the presence of the narrow method. It should only check that existing static methods have a compliant signature regarding the hkt encoding and parametricity.

For multi-parameter type constructors, I would suggest __2, __3 and so on, rather than ___, ____ which seem unwieldy. What do you think ?

I think I agree but I'd rather avoid using these classes for now, and introduce them only if necessary. E.g. for now, the Either hkt can be encoded with __ only:

class Either<A, B> extends __<__<Either.µ, A>, B> { ... }

WDYT?

Thanks for working on this! of course, take the time you want! (and I will also be away for three weeks in April).

gneuvill commented 8 years ago

avoid using these classes for now, and introduce them only if necessary

Fine by me. But you said classes ; I take you meant interfaces, didn't you ? (so that would be implements in your example)

jbgi commented 8 years ago

yes, interfaces.
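
A sketch of what the nested encoding of Either could look like once "extends" is read as "implements". The __ and Functor interfaces, the witness name h (ASCII stand-in for µ) and the tiny Either implementation are all assumptions made for illustration, not code from this repository.

```java
import java.util.function.Function;

// Hypothetical stand-ins for the interfaces discussed in this thread.
interface __<f, T> {}

interface Functor<f> {
  <A, B> __<f, B> fmap(__<f, A> fa, Function<A, B> fn);
}

// Either encoded with nested __ only: the inner __<Either.h, A> is "Either applied to A".
final class Either<A, B> implements __<__<Either.h, A>, B> {
  static final class h {}

  private final A l;
  private final B r;
  private final boolean rightSide;

  private Either(A l, B r, boolean rightSide) {
    this.l = l;
    this.r = r;
    this.rightSide = rightSide;
  }

  static <A, B> Either<A, B> left(A a) { return new Either<>(a, null, false); }

  static <A, B> Either<A, B> right(B b) { return new Either<>(null, b, true); }

  static <A, B> Either<A, B> narrow(__<__<Either.h, A>, B> e) { return (Either<A, B>) e; }

  <C> Either<A, C> map(Function<B, C> fn) {
    return rightSide ? Either.<A, C>right(fn.apply(r)) : Either.<A, C>left(l);
  }

  boolean isRight() { return rightSide; }

  B rightOrNull() { return r; }
}

// The partially applied constructor __<Either.h, L> is usable as a Functor witness.
final class EitherFunctor<L> implements Functor<__<Either.h, L>> {
  @Override
  public <A, B> __<__<Either.h, L>, B> fmap(__<__<Either.h, L>, A> fa, Function<A, B> fn) {
    return Either.narrow(fa).map(fn);
  }
}
```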

gneuvill commented 8 years ago

Hi Jean-Baptiste,

master is now in a workable state so if you want to give it a try, I'd be glad to know what you think of it !

The implementation went smoothly, apart from one problem : I had to depend on com.sun.source.tree (javac) API. Indeed, jsr 269 doesn't provide access to constructor or method bodies ; yet, one could still declare a type in such a local element that could implement one of our __ interfaces : we thus have to check those and the only way I found was by using an existing bridge between jsr 269 and javac.

Concretely, this means that this very project must be compiled on oracle|open jdk only, but also any library that makes use of it. Note that we clearly speak of compilation ; runtime is unaffected.
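
The kind of declaration that motivates looking inside method bodies can be sketched as follows. The Maybe type, its witness h and the Sneaky class are hypothetical; the point is only that a local class can claim another type's witness, after which narrow fails at runtime.

```java
// Hypothetical stand-in for the hkt interface.
interface __<f, T> {}

final class Maybe<A> implements __<Maybe.h, A> {
  static final class h {}

  final A valueOrNull;

  Maybe(A valueOrNull) { this.valueOrNull = valueOrNull; }

  // Sound only if nothing else implements __<Maybe.h, ?>.
  static <A> Maybe<A> narrow(__<Maybe.h, A> fa) { return (Maybe<A>) fa; }
}

final class Sneaky {
  static __<Maybe.h, String> forge() {
    // Local class declared inside a method body: it reuses Maybe's witness
    // without being a Maybe, so it type-checks but breaks narrow.
    class Impostor implements __<Maybe.h, String> {}
    return new Impostor();
  }
}
```

Calling Maybe.narrow(Sneaky.forge()) compiles and then throws a ClassCastException, which is exactly what a compile-time check on local classes is meant to rule out.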

Lastly, and I hope you don't mind, I introduced types __2..__5 in the project. I did this because... well because it made implementing the type checking and error reporting easier to me ; it could certainly be reworked if need be though.

That's it, let me know what you think !

jbgi commented 8 years ago

Thanks, Great work! I will test it tonight!

jbgi commented 8 years ago

As for __2...__5 they are perfect. What I wanted to avoid is using an inner class as witness in __2 like in https://github.com/DanielGronau/highj/blob/master/src/main/java/org/highj/__.java

jbgi commented 8 years ago

As you mention, javac is currently mandatory. I think we should also support ecj, if necessary by disabling checks on local classes. After that the road to HKT in Java is wide open. (It would be interesting to port highj to use derive4j-hkt and see if all is good). Great job!

gneuvill commented 8 years ago

As for __2...__5 they are perfect. what I wanted to avoid is using a inner class as witness in __2 like in https://github.com/DanielGronau/highj/blob/master/src/main/java/org/highj/__.java

Yep. It seemed more logical to thread the same "witness" type all along. But there may be reasons why @DanielGronau did it the way he did... We'll see once we are ready to implement a type class hierarchy for Java !

As you mention, javac is currently mandatory. I think we should also support ecj, if necessary by disabling checks on local classes

Hmm... I forgot about those poor eclipse users. Well, that would mean a separate artifact for them, don't you think ?

Great job!

Thanks ! Hope it will actually be used !

gneuvill commented 8 years ago

And about the "witness" type : note that since hkt's apt does not actually rely on any annotation (it is declared as @SupportedAnnotationTypes("*")), we could mandate anything as such a witness, in particular a wildcard as in :

class Maybe<A> implements __<Maybe<?>, A>

but the µ inner type seems preferable in the case of multi-parameters types (where wildcards parametrised types become hard to parse visually)

So I'd rather stick with the µ type... Thoughts ?
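
For readers unfamiliar with the remark about @SupportedAnnotationTypes("*"): a processor declared this way is offered every compilation unit, annotated or not. The skeleton below is a generic sketch using only the standard javax.annotation.processing API; the class name and the counter are hypothetical, and it is not the actual hkt processor.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;

// "*" means: run on all sources, whether or not they carry any annotation.
@SupportedAnnotationTypes("*")
final class EveryTypeProcessor extends AbstractProcessor {
  // Hypothetical counter, only here to show where per-element checks would go.
  int visited = 0;

  @Override
  public SourceVersion getSupportedSourceVersion() {
    return SourceVersion.latestSupported();
  }

  @Override
  public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
    for (Element root : roundEnv.getRootElements()) {
      visited++; // a real checker would inspect root's supertypes here
    }
    return false; // claim nothing, so other processors still see every annotation
  }
}
```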

jbgi commented 8 years ago

I think the processor should be permissive, to ensure wide adoption from people with various conventions. Most of the time, the actual witness will not be present in type signatures: only a generic type parameter will be used, so being easily visually parsable is not that important.

So I think both approaches should be valid:

jbgi commented 8 years ago

As for ecj, support should be done in the same processor, as it would otherwise lead to a complicated setup for end-users. I think all compiler-specific code should go into a static inner class. By reflection or other means we test which compiler is currently running before using the corresponding code of a static inner class (or separate class) of the processor.
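
One way the "which compiler is this?" test could be sketched is by looking at the class name of the ProcessingEnvironment handed to the processor. This is a heuristic and an assumption, not the project's implementation: it relies on javac's environment living under the com.sun.tools.javac package, and a build tool that wraps the environment in its own class would defeat it. The class and method names are hypothetical.

```java
import javax.annotation.processing.ProcessingEnvironment;

final class CompilerDetection {
  private CompilerDetection() {}

  // Heuristic (assumption): javac's ProcessingEnvironment implementation
  // lives in a com.sun.tools.javac.* package.
  static boolean looksLikeJavac(String processingEnvClassName) {
    return processingEnvClassName.startsWith("com.sun.tools.javac.");
  }

  // Convenience overload for use inside Processor.init(...).
  static boolean isJavac(ProcessingEnvironment env) {
    return looksLikeJavac(env.getClass().getName());
  }
}
```

A processor could call isJavac once during initialisation and only then touch the class that holds the javac-specific tree checks.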

jbgi commented 8 years ago

Also I'm wondering if the processor should be shipped with the hkt types, to simplify dependency management and to minimize the risk of forgetting to add the processor artefact as dependency. Of course it would be an unused runtime dependency but maybe it's worth it (given that the processor is relatively light). WDYT?

DanielGronau commented 8 years ago

But there may be reasons why @DanielGronau did the way he did...

Probably not. At the time I realized that I need a witness class that could hold another type parameter, and I put it as inner class in __. I don't remember if I even tried the recursive approach, using _ as its own witness was probably too much for my brain. I changed the implementation in highJ, which looks much nicer and seems to make no difference.

BTW: In the current context I see no real advantages of the "inner class" solution over the F-bounded-like approach. The story behind µ is quite trivial, there was no "theory" involved: I started out with a separate "wrapper" class like ListTC, with the methods static <A> _<ListTC, A> wrap(List<A> list) and static <A> List<A> unwrap(_<ListTC, A> listTC). The advantage was that the wrapped class could be completely unaware that it was used as HKT, with the obvious disadvantage that both conversions had to be done manually. Later I realized that I don't need the wrap method with the right signature of the wrapped class. So unwrap became the narrow method in the wrapped class, and the ListTC class was just "slurped inside" and degraded to a humble µ.
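
The original "separate wrapper" design described above might have looked roughly like this. It is a reconstruction under assumptions: the marker interface is spelled __ rather than _, and the private Wrapped holder is invented to give wrap something to return.

```java
import java.util.List;

// Hypothetical stand-in for highj's marker interface.
interface __<f, T> {}

// ListTC is both the witness type and the home of the two conversions,
// so java.util.List itself stays unaware of the encoding.
final class ListTC {
  private ListTC() {}

  // Invented holder: the only implementor of __<ListTC, ?>.
  private static final class Wrapped<A> implements __<ListTC, A> {
    final List<A> list;

    Wrapped(List<A> list) { this.list = list; }
  }

  static <A> __<ListTC, A> wrap(List<A> list) { return new Wrapped<>(list); }

  static <A> List<A> unwrap(__<ListTC, A> listTC) { return ((Wrapped<A>) listTC).list; }
}
```

Once the wrapped class implements the interface directly, wrap disappears (a plain upcast suffices), unwrap becomes narrow, and ListTC shrinks to the nested witness.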

I would really love to see an implementation of HKT in Java which is safe at compile time.

jbgi commented 8 years ago

@DanielGronau thanks for your input! I see that your last highj commit proves that the recursive approach does not induce problems. By the way, @DanielGronau, would you prefer this module to be under org.highj? (as the original idea is from your project). I think it would be nice but it would then require to:

while all of this is already present under org.derive4j. In any case, it would be great if highj could eventually use the work done by @gneuvill to ensure type safety.

clinuxrulz commented 8 years ago

I'll repeat my question here, so others can view.

How do people feel about using the raw type as a witness?

E.g. Either<String, Integer> becomes __2<Either, String, Integer>.

DanielGronau commented 8 years ago

I don't think the module should move. HKTs have so many applications, and highJ is just one example. My code and my ideas are free to use, so I feel appreciated by mentioning highJ, and I'm excited to see something really interesting evolving from it.

@clinuxrulz Personally, I would prefer the version with wildcards, but I have no strong opinion about this. However, maybe in a distant future raw types will be banned from Java?

jbgi commented 8 years ago

@DanielGronau thanks.

@clinuxrulz I think __2<Either, String, Integer> could totally be an alternative to __2<Either<?,?>, String, Integer> and should be allowed by the processor, as long as the eventual warnings are accepted/handled by the developer.

danieldietrich commented 8 years ago

Hi @all!

my 2 cents:

I looked at the comments. My impression is that you discuss about the notion of things but it is not 100% clear to me what options we have and what are the pros and cons of each.

Also a concrete example would help. E.g. the code of an annotated class. Then the generated code. And the code that uses the generated code.

the F-bounded-like approach (<Maybe<?>, A>) in particular because some libs (javaslang) already use that convention and it would be great if we allow, e.g. https://github.com/javaslang/javaslang/blob/master/javaslang/src/main/java/javaslang/Kind1.java to extend our interface without impact on their code.

You know, Javaslang currently has no dependencies other than the JDK...

One thing came to my mind: Do you have a solution for expressing kinds in a type hierarchy?

Traversable<T> // __<Traversable<?>, T>
      ^
      |
    Seq<T> // __<Seq<?>, T>
      ^
      |
    List<T> // __<List<?>, T>
     /  \
Cons<T> Nil<T>

jbgi commented 8 years ago

@danieldietrich the scope of this module is only to provide the __*<> interfaces and an associated annotation processor whose only role is to ensure type-safety: no generated code at all. It should thus be a very light and acceptable dependency for various libs (given the interoperability gains).

And no, only one "kind" can be defined for a given type hierarchy. But I consider this not to be a problem given that a main usage of HKT is to use type classes as an alternative to subtyping. So in your example only List is "kinded" (I see this is what is currently done in javaslang).

danieldietrich commented 8 years ago

Thanks for the clarification :) I understand now. I'm offline for one week but will come back after vacation...

gneuvill commented 8 years ago

Hi all,

Nice to see there's some interest in this little attempt of ours ! Now, to the points :

I think the processor should be permissive, to ensure wide adoption from people with various conventions.

That very word, 'permissive', frightened me the first time I read it, but I assume we all agree that a compiler should be anything but 'permissive' (and as it stands, the hkt apt processor behaves like a poor man's compiler plugin). So I would like to stress the fact that the more representations of type constructors we allow, the more complicated the 'type checking' logic will get and thus, the more room for errors and unsafety there will be.

That being said, I understand the contention, but I would like to submit an alternative way to resolve it : why not gather as many opinions as possible on what an ideal encoding should be (given java's possibilities, of course), settle on it, and then enforce it as strongly and mercilessly as possible with the apt processor ?

Most of the time, the actual witness will not be present in type signature: only a generic type parameter will be used, so being easily visually parsable is not that important

I'm not sure about that... As a matter of fact, having tried to implement type classes instances for IndexedStateT[F[_], -S1, S2, A] (from scalaz) with the wildcard based encoding, I can assuredly say that I don't like it ; but it's just me.

gneuvill commented 8 years ago

Several encodings have been presented in this thread to designate type constructors (or more precisely, the 'witness' type of the type to be lifted to a type constructor). I reference them below and give them names to facilitate further discussions :

  1. the 'µ' encoding (a) : class Maybe<A> implements __<Maybe.µ, A>
  2. the 'wildcard' encoding : class Maybe<A> implements __<Maybe<?>, A>
  3. the 'raw type' encoding : class Maybe<A> implements __<Maybe, A>
  4. the 'F-bounded' encoding (b) : like the wildcard encoding I guess but with __ defined as __<f extends __<f, ?>, A>.

(a) or any other name than 'µ'

(b) If the decision were made to not settle on a canonical representation of the witness type, I could rather easily rally to it but on the other hand, the form of the __ interfaces should be carved in stone, shouldn't it ? So either F-bounded or not. What benefits do recursive types provide in our context ?
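
The four encodings listed above can be put side by side in compilable form. In this sketch the witness is named h instead of µ to stay ASCII-only, the class names are invented, and the F-bounded variant needs its own interface (called __F here, a hypothetical name) because its bound changes the declaration of __ itself.

```java
// Unbounded marker interface shared by encodings 1 to 3 (hypothetical stand-in).
interface __<f, T> {}

// 1. 'µ' encoding: a dedicated nested class serves as witness.
final class MuMaybe<A> implements __<MuMaybe.h, A> {
  static final class h {}
}

// 2. 'wildcard' encoding: the wildcard-parameterised type is its own witness.
final class WildMaybe<A> implements __<WildMaybe<?>, A> {}

// 3. 'raw type' encoding: the raw type is the witness (triggers a rawtypes lint warning).
@SuppressWarnings("rawtypes")
final class RawMaybe<A> implements __<RawMaybe, A> {}

// 4. 'F-bounded' encoding: the bound lives on the interface, so the witness
//    must itself implement it. A plain nested class like MuMaybe.h would not qualify.
interface __F<f extends __F<f, ?>, T> {}

final class BoundedMaybe<A> implements __F<BoundedMaybe<?>, A> {}
```

Encodings 1 to 3 coexist under one interface declaration, whereas 4 fixes the shape of the interface for everyone.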

gneuvill commented 8 years ago

Also I'm wondering if the processor should be shipped with the hkt types

In my view, it imperatively should be, indeed ! (why the distinct sub-modules then ? Well, because my Gradle-fu is next to none and I blindly followed the structure of the derive4j project. But if need be (that is, if gradle is unable to merge them at publication time), we could reunite the types and the processor in one module, let's call it 'core' ?)

gneuvill commented 8 years ago

As for ecj, support should be done in the same processor, as it would otherwise lead to a complicated setup for end-users. I think all compiler-specific code should go into a static inner class. By reflection or other means we test which compiler is currently running before using the corresponding code of a static inner class (or separate class) of the processor.

Yep. Because of import statements, I guess we'll be forced to use separate classes. Do you happen to have any pointer to ecj api documentation ? (I've never been able to find my way through the gazillions of eclipse sub-sites, wiki, whatever which are as well organised as their IDE)

jbgi commented 8 years ago

  4. the 'F-bounded' encoding (b) : like the wildcard encoding I guess but with __ defined as __<f extends __<f, ?>, A>.

Clearly, this one is out. As you say, it would change the __* interfaces in a way that both prevents inheritance between them and forbids other representations of the type constructor witness.

And when I say permissive it should of course not compromise type safety. The thing is that it will be difficult to agree on a particular style for the witness. But yeah, I think 1. with allowing other identifiers than µ is the more "scalable" representation. We will look into adding support for 2. and 3. in a future release if interest is confirmed.

we could reunite the types and the processor in one module, let's call it 'core' ?)

How about simply hkt?

Because of import statements, I guess we'll be forced to use separate classes.

I think import statements do not come into play when deciding class loading: it is only when a class is actually needed by running code (including static initialization code) that it is loaded.
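
The point about imports can be illustrated with a small experiment. All names below are hypothetical, and strictly speaking the sketch demonstrates lazy class initialisation (which the Java language specification guarantees) rather than loading itself; loading on first use is the usual JVM behaviour but is not what the flag observes.

```java
final class LazyLoadingDemo {
  // Flipped by JavacSupport's static initialiser, so we can observe when it runs.
  static boolean javacSupportInitialized = false;

  // Stands in for a nested class holding compiler-specific code and imports.
  static final class JavacSupport {
    static {
      javacSupportInitialized = true;
    }

    static String describe() { return "javac-specific checks"; }
  }

  // JavacSupport is only touched on the branch that actually needs it.
  static String run(boolean useJavac) {
    return useJavac ? JavacSupport.describe() : "generic checks only";
  }
}
```

After run(false) the flag is still false; it becomes true only once run(true) has executed the branch that refers to JavacSupport.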

Do you happen to have any pointer to ecj api documentation ?

I think that skipping hkt type-safety checks for local classes is ok when ecj is used: In the immense majority of the setups, the code will be compiled with javac under CI anyway. So for the first release I think we should just skip those checks when a non javac compiler is detected. WDYT?

gneuvill commented 8 years ago

Hi Jean-Baptiste,

clearly, this one is out.

Ok, so the __ interfaces stay as they are.

And when I say permissive it should of course not compromise type safety. The thing is that it will be difficult to agree on a particular style for the witness. But yeah, I think 1. with allowing other identifiers than µ is the more "scalable" representation. We will look into adding support for 2. and 3. in a future release if interest is confirmed.

We're on the same page so, that's good. How do you consider the 'configurability' of the processor ? Annotations I suppose ? As in :

@HKT("h")
class Maybe<A> implements __<Maybe.h, A> {
  public static final class h {}
}

No annotation would mean that the default value (µ ?) would be used as the witness type name ?

I think that skipping hkt type-safety checks for local classes is ok when ecj is used: In the immense majority of the setups, the code will be compiled with javac under CI anyway. So for the first release I think we should just skip those checks when a non javac compiler is detected.

Sounds like a good plan.

As I said, I won't be able to tackle this for two weeks ; of course, if you want to have a crack at it before I do, go ahead ! But you said you also would be off for a while, so have nice holidays !

clinuxrulz commented 8 years ago

I'm tilting towards the 'µ' encoding, with the ability to use something other than 'µ'. As it is a pain on US-keyboards.

Can the annotation processor be pushed further to maximize the convenience to the end user? Like so:

@HKT("h")
class Maybe<A> {
}

Then let the apt generate the inheritance and the witness class automatically.

jbgi commented 8 years ago

How do you consider the 'configurability' of the processor ?

I'm not sure it needs configuration: what about accepting any (first level) inner class as witness, as long as the inner class is not implementing a __* interface? But I could see how configuration can be useful: if a project wants to ensure uniformity of witness naming, project-wide. But then the annotation cannot be on each specific class, so not sure how it would work.
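
The proposed rule (any first-level inner class may be the witness unless it implements a __* interface itself) can be expressed as a small predicate. The sketch below uses runtime reflection purely for illustration; an annotation processor would apply the same test to javax.lang.model elements instead. WitnessRule, Maybe, h and Bad are hypothetical names.

```java
// Hypothetical stand-in for the hkt interface.
interface __<f, T> {}

final class Maybe<A> implements __<Maybe.h, A> {
  // Acceptable witness: declared directly inside Maybe, implements no __ interface.
  static final class h {}

  // Not acceptable as a witness: it is itself a __ implementor.
  static final class Bad implements __<Maybe.h, String> {}
}

final class WitnessRule {
  private WitnessRule() {}

  // Reflection analogue of the rule: declared directly in host, and not a __ implementor.
  static boolean isAcceptableWitness(Class<?> host, Class<?> candidate) {
    return candidate.getDeclaringClass() == host && !__.class.isAssignableFrom(candidate);
  }
}
```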

Anyway, have nice holidays too!

let the apt generate the inheritance and the witness class automatically.

In the way of Lombok? I'd rather not modify the AST (it would not be compliant Java anymore and it would need both javac and ecj support), but it could be an alternative to manually specifying the hkt boilerplate. To be noted for a potential future development.

clinuxrulz commented 8 years ago

Go to at least arity 8?

Implementing Proxy from Pipes packages on hackage requires 6 type variables.

jbgi commented 8 years ago

I think the basis of the processor is done. Closing in favor of more focused issues.