devonfw / cobigen

Code-based Incremental Generator
Apache License 2.0

language agnostic templates #158

Closed hohwille closed 4 years ago

hohwille commented 8 years ago

Note: This is a visionary feature request that will cause a lot of effort and refactoring. However, I think that it would cause a huge benefit.

It would be awesome if the templates could be compatible with the language they produce. The template-engine could have a grammar configuration for each language that defines how the language syntax is used to express templating logic. E.g. for Java this would mean:

UPDATE: I initially proposed $_varname_$ or __varname__, but due to strange problems with tools (Eclipse, git) we changed the syntax to prevent problems.

The new variable syntax with underscores could also express the case-transformation (assuming variable names are then treated case-insensitive):

(Please note that I am intentionally mixing concepts of velocity and freemarker, because I think both have pros and cons but we need to create something new and better).

Supertype /* <#replace-left varType> */ $_var_$ = ...;
$_var_$.getFoo().getBar();

If the type is easy to extend you could also create a subclass X_VarType_X in the CobiGen templates that extends from Supertype and mark it as not being a template so the class itself will be ignored on generation. It would be nice to make some experiments and prototyping in this area to get more insights. The same concept would easily work with JS, .NET, PHP, python, ruby, scala, etc. Using this strange v_var_v approach might look ugly but it would be a very simple way to solve big problems. Assume you can refactor your CobiGen Templates with Java in a typesafe way...

What I like very much about velocity and miss a lot in freemarker is direct access to java calls:

hohwille commented 8 years ago

Please note that CobiGen could ship a minimal library for Java intentionally containing a class V with public static constants such as pojo, etc. Further, it could ship a class Macro with public static methods such as out(Object o) and so forth. This would mean that inside a method block you could write something like:

int sum = 0;
for (Property property : V.pojo.properties) {
  if(property.getter != null) {
    Macro.out("sum +=" + property.getter.name + "();");
  }
}
return sum;

Then the CobiGen Java Plugin that parses the template could understand this as a template macro and interpret it, while the template can be processed by any Java compiler and gets full IDE support with completion etc. So each statement block operating on CobiGen-specific types would automatically be interpreted by the template engine, executed, and replaced with its output. Maybe there could also be a way to mark code inside as output, as currently "sum" is a string (not typesafe), and mixing variables from non-macro code with macro code needs some clear rules and syntax.
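To make the idea more concrete, here is a minimal sketch of what such a library could look like so that the snippet above compiles with a plain Java compiler. Only the names V, Macro, out, pojo, properties, getter and name come from the comment above; the classes Pojo, Property and Getter, the buffering inside Macro, and the helper methods drain() and expand() are assumptions made up for illustration, not the actual CobiGen API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: stub types that let macro-style template code compile.
class MacroSketch {

  static class Getter {
    final String name;
    Getter(String name) { this.name = name; }
  }

  static class Property {
    final Getter getter; // null if the property has no getter
    Property(Getter getter) { this.getter = getter; }
  }

  static class Pojo {
    final List<Property> properties = new ArrayList<>();
  }

  // "V" holds the template variables as public static constants.
  static class V {
    static final Pojo pojo = new Pojo();
  }

  // "Macro" collects generated output; a real engine would replace the
  // macro block in the template with this text.
  static class Macro {
    private static final StringBuilder BUFFER = new StringBuilder();

    static void out(Object o) { BUFFER.append(o).append('\n'); }

    static String drain() {
      String result = BUFFER.toString();
      BUFFER.setLength(0);
      return result;
    }
  }

  // Fills V.pojo with test data and runs the loop from the template.
  static String expand(List<String> getterNames) {
    V.pojo.properties.clear();
    for (String getterName : getterNames) {
      V.pojo.properties.add(new Property(getterName == null ? null : new Getter(getterName)));
    }
    for (Property property : V.pojo.properties) {
      if (property.getter != null) {
        Macro.out("sum += " + property.getter.name + "();");
      }
    }
    return Macro.drain();
  }
}
```

In this sketch the macro block runs as ordinary Java, which is what would give the template author completion and compiler feedback; how the engine detects and replaces the block in the template is left open here.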

hohwille commented 8 years ago

This is so damn cool. Give me some time and a room with food and drinks and let me implement this. I am dying for all this :) Death to freemarker!

hohwille commented 8 years ago

I did design the API. Have a look at this cool example template to get what I am dreaming of: https://github.com/hohwille/tools-cobigen/blob/dev_core/cobigen/cobigen-api/src/test/java/__rootpackage__/__component__/common/api/__detail__/to/__EntityName__Eto.java

This is fully compliant Java Code that compiles and can be refactored. It is IMHO quite expressive and readable. I would assume that you avoid 80% of the mistakes when writing templates for Java code and boost your productivity accordingly.

hohwille commented 8 years ago

A new challenge when implementing this in CobiGen will be that the templates defined as Java have to be unloaded and reloaded in their own classloader realm, as they can change frequently. However, I assume that this already works today when CobiGen executes custom Java code from freemarker.

hohwille commented 8 years ago

One thing to clarify for the __VarName__ syntax is how to intuitively express that the original case should be preserved. I guess that __varnamE__ or something like this is rather sick. Using single underscores will not work, as single underscores occur often in regular Java code, e.g. in CONSTANT_NAME_VARIABLES, and we do not want to accidentally get _NAME_ resolved here. Using double vs. triple underscores is a pain to read and distinguish. __-varname-__ is also strange. For occurrences outside of Java member names, keeping the original case will also be the regular case. Therefore we need to find an intuitive, short and simple syntax for that.

Currently I am out of ideas and maybe better go to bed...

hohwille commented 8 years ago

Just discovered that the following chars can be used in java identifiers: $ _ ¢ £ ¤ ¥ ؋ ৲ ৳ ৻ ૱ ௹ ฿ ៛ ‿ ⁀ ⁔ ₠ ₡ ₢ ₣ ₤ ₥ ₦ ₧ ₨ ₩ ₪ ₫ € ₭ ₮ ₯ ₰ ₱ ₲ ₳ ₴ ₵ ₶ ₷ ₸ ₹ ꠸ ﷼ ︳ ︴ ﹍ ﹎ ﹏ ﹩ $ _ ¢ £ ¥ ₩
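A list like this can be derived with the standard java.lang.Character API. The following is a small sketch (the class and method names are made up for illustration) that collects all characters of the Basic Multilingual Plane that may start a Java identifier without being letters. Besides currency symbols and connector punctuation, the result also contains letter numbers such as Roman numerals, so it is a superset of the characters shown above.

```java
// Sketch: find non-letter characters that are legal at the start of a Java identifier.
class IdentifierChars {

  /** Returns all BMP characters accepted by isJavaIdentifierStart that are not letters. */
  static String nonLetterIdentifierStarts() {
    StringBuilder result = new StringBuilder();
    for (char c = 0; c < Character.MAX_VALUE; c++) {
      if (Character.isJavaIdentifierStart(c) && !Character.isLetter(c)) {
        result.append(c);
      }
    }
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println(nonLetterIdentifierStarts());
  }
}
```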

hohwille commented 8 years ago

$varname$ - but how do you explain to all the users how to enter such a character via keyboard...

hohwille commented 7 years ago

We discussed the design of this issue and came to the following conclusion:

hohwille commented 7 years ago

As my initial proposal code was unfortunately lost, I did a little coding from scratch, now with a hopefully perfect design: https://github.com/m-m-m/code/commit/f35282616f237e78831148634a75002008f281c9 Based on this we will be able to do source- and byte-code reflection as well as modification of the code, including merging and writing the result to a defined location. The same API can also be implemented for TypeScript. I am also considering dropping qdox and simply generating a parser from the language grammar. Am I going too big here? Or is this cool? What do you think?

hohwille commented 6 years ago

Here is an update to the latest status.

I created a GitHub project as a sandbox for the vision of how templates could look with this new feature:

https://github.com/oasp-forge/cobigen-language-agnostic-templates
https://github.com/oasp-forge/cobigen-language-agnostic-templates/blob/master/cobigen-templates/src/main/java/x_rootpackage_x/x_component_x/common/api/x_detail_x/to/X_EntityName_XEto.java
https://github.com/oasp-forge/cobigen-language-agnostic-templates/blob/master/cobigen-templates/src/main/java/x_rootpackage_x/x_component_x/logic/impl/X_Component_XImpl.java
https://github.com/oasp-forge/cobigen-language-agnostic-templates/blob/master/cobigen-templates/src/main/java/x_rootpackage_x/x_component_x/logic/impl/x_detail_x/usecase/UcFindX_EntityName_XImpl.java
https://github.com/oasp-forge/cobigen-language-agnostic-templates/blob/master/cobigen-templates/src/main/java/x_rootpackage_x/x_component_x/dataaccess/impl/x_detail_x/dao/X_EntityName_XDaoImpl.java

You are invited to have a look, give feedback ("cool, when can I get this" or "OMG, I will stay with freemarker"), fork and improve, etc. This is the most important issue to get this design right, and therefore I would love to get the best from the community.

BTW: I had to switch from $_EntityName_$ to X_EntityName_X, as otherwise several tools and languages cause problems. It technically works with $ as well, so the code compiles, but completion and refactoring then do not work properly (edge cases that are not tested by Eclipse & Co.). We do not want to fight against tools; we want to utilize them. Hence it is IMHO the best solution to be pragmatic and use an even more "strange" syntax, but this way we are 100% compliant, as ASCII letters and underscores are accepted pretty much everywhere as identifiers.

Still, the symmetry of X_ and _X gives the minimum readability required to spot the variable with a sharp eye, and it is also exotic enough not to collide with regular code. (If someone has a regular class in their customer project that would match the pattern "X_[a-zA-Z]+_X", please let me know, but I doubt we need to support cases where such a conflict is realistic.)

For the AST lib (mmm-code) I made great progress and it is already working quite well (see linked JUnits and Types as demo):

Review feedback is most welcome. Crucial aspects therefore are:

hohwille commented 5 years ago

In our latest workshop we came to the conclusion to define an interface for macros that will be implemented. In templates you just provide the class references to the template annotation. CobiGen would provide ready-to-use macros for the generation of getters, setters, equals, hashCode, etc., but users could write their own macros for advanced use cases. Templates themselves will remain rather simple and therefore easy to read.

hohwille commented 5 years ago

After some more hacking I came to the conclusion that we should create an abstract class CobiGenMacro that macros need to extend. We should design this class to be stateful such that it is instantiated per use and then thrown away. This would give us the best design for simplicity and also for future enhancements without breaking compatibility. So all parameters would simply be set via setters of the abstract parent class CobiGenMacro. The macro developer only needs to implement a single method

public abstract void generate();

The "parameters" are set beforehand by CobiGen via the setters of the parent class and are thereby stored in protected members and hence easily accessible (alternatively via getters, to avoid any collision with user-defined members). With this design we can also avoid mistakes caused by accidentally stateful macros, as we make them stateful and therefore simple by design.
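A rough sketch of this design, under stated assumptions: only the class name CobiGenMacro and the method generate() are taken from the comments above. The parameters (properties, output), the example macro GetterMacro and the run() helper that stands in for the engine are hypothetical and only illustrate the "instantiate per use, inject via setters, call generate(), throw away" lifecycle.

```java
import java.util.List;

// Hypothetical sketch of the stateful macro design.
class MacroDemo {

  /** Base class: one instance per use, parameters injected via setters. */
  public abstract static class CobiGenMacro {
    protected List<String> properties; // hypothetical input parameter
    protected StringBuilder output;    // hypothetical output sink

    public void setProperties(List<String> properties) { this.properties = properties; }

    public void setOutput(StringBuilder output) { this.output = output; }

    public abstract void generate();
  }

  /** Example macro: the developer only implements generate(). */
  public static class GetterMacro extends CobiGenMacro {
    @Override
    public void generate() {
      for (String property : this.properties) {
        String capitalized = Character.toUpperCase(property.charAt(0)) + property.substring(1);
        this.output.append("public Object get").append(capitalized)
            .append("() { return this.").append(property).append("; }\n");
      }
    }
  }

  /** Stand-in for the engine: creates a fresh macro instance for every use. */
  public static String run(Class<? extends CobiGenMacro> macroClass, List<String> properties) {
    try {
      CobiGenMacro macro = macroClass.getDeclaredConstructor().newInstance();
      StringBuilder out = new StringBuilder();
      macro.setProperties(properties);
      macro.setOutput(out);
      macro.generate();
      return out.toString();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not instantiate macro " + macroClass.getName(), e);
    }
  }
}
```

Because the instance is discarded after generate(), new setters can be added to the parent class later without changing the signature that macro developers implement.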

hohwille commented 5 years ago

To take this even further, imagine that we also provide UML models or potentially even OpenAPI contracts via our AST. Generating from an input type such as a data model class from Java code, C# code, TypeScript code, UML, or OpenAPI could then work using the same templates. Nice...

hohwille commented 5 years ago

During my experiments I came back to the point why it is better to introduce a variable type for dynamic type expressions instead of using a concrete type with an annotation containing the expression: Even if you put the actual expression in a constant you need to repeat the combination of annotation and type many times.

@CobiGenTemplate(...)
public class UcFind$_EntityName_$Impl implements UcFind$_EntityName_$ {
  @CobiGenDynamicType(condition="$_dataaccesstype_$ == 'dao'", type = $_EntityName_$Dao.class) $_EntityName_$Repository $_entityName_$Dao;

  @Inject
  public void set$_EntityName_$Dao(@CobiGenDynamicType(condition="$_dataaccesstype_$ == 'dao'", type = $_EntityName_$Dao.class) $_EntityName_$Repository $_entityName_$Dao) {
    this.$_entityName_$Dao = $_entityName_$Dao;
  }
  ...
}

If you do not always use the exact same combination it will not work as expected. With my suggestion the expression and type hierarchy is bundled in a single point of information.

@CobiGenTemplate(...)
public class UcFind$_EntityName_$Impl implements UcFind$_EntityName_$ {
  $$EntityNameDaoOrRepository $_entityName_$Dao;

  @Inject
  public void set$_EntityName_$Dao($$EntityNameDaoOrRepository $_entityName_$Dao) {
    this.$_entityName_$Dao = $_entityName_$Dao;
  }
  ...
}

and:

@CobiGenTypeExpression(...)
public interface $$EntityNameDaoOrRepository extends @CobiGenCondition("$_dataaccesstype_$ == 'dao'") $_EntityName_$Dao.class, @CobiGenCondition("$_dataaccesstype_$ != 'dao'") DefaultRepository<$_EntityName_$Entity> {
}

or maybe we can even combine both to allow the planned reuse but also having what you suggested:

@CobiGenTypeExpression(...)
public interface $$EntityNameDaoOrRepository extends @CobiGenDynamicType(condition="$_dataaccesstype_$ == 'dao'", type = $_EntityName_$Dao.class) DefaultRepository<$_EntityName_$Entity> {
}

Please also note that @CobiGenDynamicType could also have a parameter method suffix that could be set to "<$_EntityName_$Entity>" but if omitted it would default to the generic declaration of the substituted type.

I suggested to use a syntax like $$EntityNameDaoOrRepository instead of $_EntityName_$DaoOrRepository to better distinguish templates and type expressions and also simplify the elimination of imports. However, with mmm-code we can even go for a clean approach and remove all imports and then re-generate them after the template is fully resolved. The feature is already implemented: https://github.com/m-m-m/code/issues/10

hohwille commented 5 years ago

Still, I am not fully convinced regarding our variable syntax. IMHO it is readable; however, IDE support does not work well this way. Eclipse often messes up imports or fails with auto-completion. Maybe we should come back to the original suggestions like __EntityName__ or X_EntityName_X. Maybe that would be even more agnostic, as I did not test $ in all occurrences of XML, JSON, C#, Kotlin, etc. We can change now or never... WDYT?

hohwille commented 5 years ago

I still like the __EntityName__ syntax and tested it on my Mac with git, having no issues. Can someone test this again on Windows, as we once had issues with files and folders of such names in git on Windows (a long time ago)?

hohwille commented 5 years ago

This is the outcome of Eclipse auto-completion import with current syntax:

import $
import ._rootpackage_..general.common.api.ApplicationEntity;

should have been

import $_rootpackage_$.general.common.api.ApplicationEntity;

We could raise a bug in Eclipse but chances that this gets fixed in the next 20 years are very low.

jdiazgon commented 5 years ago

Can someone test this again on windows

I have changed a folder name to _utils_, and a file name to _JavaUtil_. You can see the file here. Seems to work well.

In my opinion, $EntityName$ is preferable to __EntityName__, as the first syntax looks more like a template language and is also more readable (in my first test I didn't notice that it uses two underscores _ rather than one).

However, if $ doesn't work well, then for me it's okay to change the syntax. Completely understandable.

hohwille commented 5 years ago

Definitely we have to get away from the $_rootpackage_$ syntax. Refactoring and code completion are simply broken in Eclipse when using dollars (screenshot 2019-05-17 at 12 52 47).

hohwille commented 5 years ago

The bad thing about __VariableName__ is that the prefix and suffix are symmetric: __VariableName____VariableName2__ is not too great to read. Do I have 4 underscores in the middle (correct) or just 3 or maybe 5 (error)?

jdiazgon commented 5 years ago

is not too great to read.

It seems a better idea to use another naming convention like x_variableName_x

maybeec commented 5 years ago

I would even put a vote for xvariableName

Cases:

  1. xvariablename
  2. x_var1_x_var2
  3. x_var1__whatEver
  4. Something_x_var1

I agree it's not the most readable, but anyhow, for a first go it could be imaginable to do it with x_var_x or even just with the first x as an indicator. We might even want to provide something like the dollar way by configuration, as it might be just an issue with Eclipse. If anybody writes their templates in another editor, the compiler might not be an issue anymore.

From my point of view, the language agnostic templates simply have a lack of readability out of the box. It would be nice, if the placeholders could be highlighted in an editor in future.

hohwille commented 5 years ago

Thanks for your feedback! IMHO we should agree on one syntax. Of course it is easy to technically make this configurable but we need to maintain templates and want to standardize with devon. Allowing every project to use its own syntax would IMHO cause more harm than value.

I would even put a vote for xvariableName

This does not work for scenarios where underscore is part of the case syntax as then the variable syntax is not properly terminated and we do not know where is the end of the variable name:

public static String X_VARIABLE_NAME_SUFFIX = "bla";

So is _SUFFIX a static suffix or part of the variable to substitute here?

So can we finally agree:

The really great thing is that this syntax is perfectly aligned with any language including XML and JSON as it only contains common characters allowed in regular places. There is a very low but still existing possibility of a collision that someone wants to have a literal sex_and_xylophone for whatever reason in his templates but if we also agree that the variable syntax remains untouched if the variable is not defined we should be fine.
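As an illustration, here is a minimal sketch of how such placeholders could be resolved. It assumes the x_name_x / X_Name_X form discussed in this thread, treats variable names case-insensitively, and leaves a match untouched if the variable is not defined. The case rules in applyCase (all-lowercase name gives a lowercase value, all-uppercase gives uppercase, otherwise the first letter of the value follows the first letter of the name) are my assumption for this sketch and not the agreed specification; the class name is made up as well.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a resolver for x_name_x / X_Name_X placeholders.
class PlaceholderResolver {

  // Prefix and suffix must use the same case; the name itself contains no underscore.
  private static final Pattern PLACEHOLDER =
      Pattern.compile("x_([A-Za-z0-9]+)_x|X_([A-Za-z0-9]+)_X");

  /** Resolves placeholders; the keys of the variables map must be lower case. */
  static String resolve(String template, Map<String, String> variables) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      String name = matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
      String value = variables.get(name.toLowerCase(Locale.ROOT));
      // Undefined variables stay untouched, e.g. the "x_and_x" in "sex_and_xylophone".
      String replacement = (value == null) ? matcher.group() : applyCase(name, value);
      matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  /** Assumed rule: the case of the placeholder name drives the case of the value. */
  static String applyCase(String name, String value) {
    if (value.isEmpty()) {
      return value;
    }
    if (name.equals(name.toLowerCase(Locale.ROOT))) {
      return value.toLowerCase(Locale.ROOT);
    }
    if (name.equals(name.toUpperCase(Locale.ROOT))) {
      return value.toUpperCase(Locale.ROOT);
    }
    char first = Character.isUpperCase(name.charAt(0))
        ? Character.toUpperCase(value.charAt(0))
        : Character.toLowerCase(value.charAt(0));
    return first + value.substring(1);
  }
}
```

With a variable entityname set to "Customer", this sketch would turn the file name X_EntityName_XEto into CustomerEto and the field x_entityName_xDao into customerDao, while text with an accidental match and no such variable passes through unchanged.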

hohwille commented 5 years ago

Seems as if I knew it before:

hohwille commented 5 years ago

Here is the code in my fork: https://github.com/hohwille/tools-cobigen/tree/dev_core/cobigen/cobigen-language-agnostic

There is one remaining case I did not yet solve and where some discussion is needed:

When we need dynamic code inside a method body that can not be generated by a macro. Example: https://github.com/hohwille/tools-cobigen/blob/dev_core/cobigen/cobigen-language-agnostic/la-templates-devon4j-crud/src/main/java/x_rootpackage_x/x_component_x/dataaccess/api/x_detail_x/repo/X_EntityName_XRepository.java

I see the following options:

  1. We define a syntax like freemarker that is text base (using comments) with loop, if, etc.
  2. We define such syntax in the target language like this:

    for (CodeType $pojo : Collections.<CodeType> emptyList()) {
      for (CodeProperty $property : $pojo.getProperties()) {
    
        @CobiGenOutput
        Object type = $property.getType();
        Object propertyName = $property.getName();
        {
          $Type $propertyName = criteria.get$PropertyName();
        }
      }
    }
  3. We can go for @CobiGenLoop annotation and loop the following block/statement, etc.
  4. We can entirely avoid solving this. In the example we can go for #858 and simply get rid of this problem.
  5. We can go back to my initial proposal and allow something similar to a macro that has a method for generation that returns either a String (any lines of code) or a CodeStatement and can dynamically generate the output. By putting CobiGenGeneratorMySpecialCase.do(); in the template, this generator will be instantiated and evaluated, and this statement line is then replaced with the generated output.

I am fine with 4., but this is just one example of a situation that will occur elsewhere, so we still need a real solution for the general problem. I vote against 2. and 3., as they are kind of ugly and require a lot of hard work to parse all the details of the language syntax inside the implementation bodies (which we otherwise do not need). We could do that, but it will be several months of extra work, and maintaining this for all programming languages might be a nightmare. So I would go for 1. or 5. As 1. is where we are coming from and is again not type-safe and falls back to text-editor level, I would vote for 5.

WDYT? Maybe you have other ideas I am not seeing or new convincing arguments...

jdiazgon commented 5 years ago

I have been looking into the templates, they look very good. Seems that we have a fancy templates language :)

Just some remarks:

Related to dynamic code:

Again here I'm a little bit afraid that Macros will hide too much the implementation, and that a CobiGen developer would miss seeing the direct generated output.

In my opinion, I would go for option 1. Maybe I'm not being quite objective with that because I'm already used to FreeMarker and for me it is easy to implement that kind of cases.

Also we need to remember that these kinds of cases will be infrequent, so type-safety won't be such a big deal.

Ideally we could go for option 5 if we find a way to define Macros more clear.

hohwille commented 5 years ago

@jdiazgon thanks for your feedback.

The difference between x_rootpackage_x and X_Rootpackage_X is that the variable will be printed with the first letter capitalized?

Exactly. See the top-level comment of this issue.

I have been looking into the Macros and so far I have seen that some of them hide too much the implementation. For example, Macro CobiGenMacroGetterSetters just contains one line getOutputType().createGettersAndSetters();

I actually thought it is great to have as little code as required. If the name is obvious why should there be more code? We can of course redundantly implement what mmm-code already offers with this method inside the CobiGenMacro but what for?

I agree that this makes the code look much cleaner, but at the same time it seems to me, as a CobiGen templates developer, that I'm not able to change the way getters and setters are generated.

Surely you can. Have you seen the more specific variants that can generate the getters/setters for ETO/BO-Interface? This is exactly demonstrating how the template developer can customize. IMHO, we would need good documentation and examples for Macros instead of questioning if one line of code is too little.

A dummy use case could be, how can I change the order in which getters and setters get generated, so that setters get generated before getters? Would I need to modify the mmm-code library? In this case, FreeMarker seems more flexible.

  1. IMHO code style should not be addressed by the template developer. That is currently actually a big design flaw of CobiGen: to change the code style you need to customize the templates. With Eclipse there is somehow a workaround, but this will become more of an issue with CLI generation. The idea of mmm-code is that the code style can be configured when the code is written to String/Writer/OutputStream/File. Even though it is not yet greatly supported, you can already configure indentation, etc.

  2. Without documentation some questions like this may indeed show up, so thanks for the feedback. However, how can anything be more flexible than Java? You can do whatever crazy stuff you want with macros.

Again here I'm a little bit afraid that Macros will hide too much the implementation, and that a CobiGen developer would miss seeing the direct generated output.

Ever tried to debug freemarker? Set a breakpoint in a freemarker template? (I know you can with some exotic custom plugins, but it sucks). With language agnostic templates and cobigen CLI you can OOTB. Just set a breakpoint in your macro and run cobigen in debug mode.

In my opinion, I would go for option 1. Maybe I'm not being quite objective with that because I'm already used to FreeMarker and for me it is easy to implement that kind of cases.

I am fine with additionally supporting something like this, which could be helpful for generating XML, JSON, Properties, or exotic languages we do not support in such a structured way. However, I might also be biased, but I would really propose to focus on "real programming for the tricky stuff". In 20 years of IT experience I have learned that we always failed with approaches where we left the language ecosystem:

Just to give you some examples. How do you write JUnits? How do you debug and set breakpoints? Where is the documentation for that "new language"? How do I refactor and maintain that when the software is evolving? Where is code-completion / content-assist? Where is the compiler feedback to see if my code is actually syntactically correct? All these approaches died exactly because of such problems.

Also we need to remember that these kinds of cases will be infrequent, so type-safety won't be such a big deal.

Yes and no. Have a look at the current devon4j templates. They are badly maintained and partially even buggy. Ask some newbie to improve the freemarker macros or fix something there. I did this with many developers outside the CobiGen ecosystem. All of them gave up immediately. If we stay in the Java ecosystem and offer JUnits for macros, etc., we can go to the next level. Of course I am aware that I argue from a Java developer's PoV. A C# developer would say the exact opposite. Instead of learning Java they would probably prefer writing in a limited templating language. So what I am saying is that we should keep developing the options further and compare pros and cons. We might come to a point where we take a final decision because the pros are convincing for one option, or we might want to provide two alternatives in parallel (which I typically dislike, but let's see). Also, what is already certain: we will not drop backwards compatibility with CobiGen. So you can still use freemarker and ignore language-agnostic templates if you do not like the approach.

hohwille commented 5 years ago

I observed another topic that we are poorly solving with our current CobiGen features:

We very often have the necessity to derive some complex detail from the input type or one of its properties.

To solve this we use tricky expressions like some of these:

In the worst case we repeat the same expression many times within the same templates or macros. In some better cases we assign the result to a variable and then use that variable instead.

My suggestion would be a new construct that allows extending the variables based on the existing ones. So, like a macro, a specific class could be implemented and registered via the templates configuration that can calculate new variables based on the existing ones. Ideally the CobiGen API would allow doing that lazily, so the new variables are only evaluated once they are resolved for the first time for a template. This way I could not only refer to ${entityName} or x_entityName_x but also e.g. to ${tableName} (X_TABLE_NAME_X). IMHO this would make templates much more readable and maintainable, as currently this seems to be a big drawback (esp. whenever I have to modify functions.ftl, which seems like a nightmare to me).
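A minimal sketch of such lazily derived variables, assuming a simple string-based variable model: the class LazyVariables, its methods and the camel-case to table-name rule in toTableName are all hypothetical and only illustrate the "evaluate on first access, then cache" idea; they are not part of the CobiGen API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: variables that are derived lazily from existing ones.
class LazyVariables {

  private final Map<String, String> values = new HashMap<>();
  private final Map<String, Function<LazyVariables, String>> derived = new HashMap<>();
  private int evaluations = 0;

  /** Sets a plain variable as provided by the input. */
  LazyVariables set(String name, String value) {
    this.values.put(name, value);
    return this;
  }

  /** Registers a derived variable; nothing is computed yet. */
  LazyVariables derive(String name, Function<LazyVariables, String> function) {
    this.derived.put(name, function);
    return this;
  }

  /** Resolves a variable; derived ones are computed on first access and then cached. */
  String get(String name) {
    String value = this.values.get(name);
    if (value == null && this.derived.containsKey(name)) {
      value = this.derived.get(name).apply(this);
      this.evaluations++;
      this.values.put(name, value);
    }
    return value;
  }

  /** Number of derived-variable computations so far (for demonstration only). */
  int getEvaluations() {
    return this.evaluations;
  }

  /** Assumed naming rule: CustomerOrder becomes CUSTOMER_ORDER. */
  static String toTableName(String entityName) {
    return entityName.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toUpperCase(Locale.ROOT);
  }
}
```

A template would then refer to tableName like any other variable, and the expression behind it would live in one place instead of being repeated across templates and macros.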

Doing the same for "iterated variables" would be more tricky. However, like already shown this could be solved with macros in this language-agnostic-templates experiments. Also with the classical approach it would be possible to provide a new variable etoFields derived from pojo.fields.

github-actions[bot] commented 5 years ago

Stale topic. Please negotiate closing or discussing the relevance of this ticket.