SeelabFhdo / lemma

Home of the Language Ecosystem for Modeling Microservice Architecture (LEMMA)
MIT License

Use maven-dependency-plugin to download "library" dependencies required for Eclipse plugin development at build time #45

Closed josor001 closed 10 months ago

josor001 commented 2 years ago

Hi folks,

as you might know, because LEMMA is built from various Eclipse Plug-Ins, we have a problem with libraries/dependencies that are not available as Eclipse Plug-Ins. A typical example can be found in the Avro Plug-In ( de.fhdo.lemma.data.avro ). The Plug-In uses avro, jackson-core, jackson-annotations, and jackson-databind as external libraries.

Unfortunately, we cannot use Maven's dependency management directly because LEMMA projects are Eclipse Plug-Ins. Therefore, we currently download those libraries manually and add them, e.g., to a lib folder. However, and I guess we all agree here, this is super tedious. For example, the openapi library, which I use in the OpenAPI branch to enable openapi-2-LEMMA transformations, needs 20+ libraries.

Fortunately, I think I found a better solution that gets rid of the tedious parts. We can use the maven-dependency-plugin and its copy-dependencies goal to automatically download all dependencies during the Maven initialize phase of our Tycho build. Then, we add the location of the downloaded libs to the bundle classpath in the manifest file. To deal with new versions of a library, we can even strip the version number from the downloaded files, i.e., avro 1.10.0 and avro 1.11.0 would both be stored as avro.jar. By default, the lib jars are downloaded to the folder target/dependencies, but we could also configure a different folder.
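A minimal sketch of the plugin configuration described above, assuming the output folder named in this comment. The execution id is made up for illustration, and the actual pom in the mvnDependencyTest branch may look different:

```xml
<!-- Fragment for the <build><plugins> section of a plug-in's pom.xml -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <!-- Hypothetical id, chosen only for this sketch -->
      <id>copy-external-libs</id>
      <!-- Run early so the jars exist before Tycho compiles the bundle -->
      <phase>initialize</phase>
      <goals>
        <goal>copy-dependencies</goal>
      </goals>
      <configuration>
        <!-- Folder taken from the comment above; any other folder would work too -->
        <outputDirectory>${project.build.directory}/dependencies</outputDirectory>
        <!-- Store avro-1.10.0.jar and avro-1.11.0.jar alike as avro.jar -->
        <stripVersion>true</stripVersion>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With stripVersion enabled, the Bundle-ClassPath entry in the manifest could point at a stable name such as target/dependencies/avro.jar and would not need to change on a library update.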

The only drawback is that when you initially clone the source code and import it into your IDE without executing the Maven initialize phase, your project will have an error because the lib jar that the classpath points to is not found. Also, this only happens when we use the default target/dependencies location. When we store the lib files in a lib folder and include them in the commit, we can even avoid the initial errors in the IDE. However, then we still have to commit all lib files ...not sure if we want that, because those files blow up the repository size by quite a lot (e.g., my OpenAPI module is 100+ MB).

I already did a test build in the mvnDependencyTest branch. The interesting project is de.fhdo.lemma.technology.technologydsl.experimental . As you can see, I added Apache Commons Math3 as an example dependency in the pom file.
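For readers without the branch at hand, declaring such an example dependency might look like the following. The version is an assumption, since the comment does not say which one was used:

```xml
<!-- Fragment for the <dependencies> section of the plug-in's pom.xml -->
<dependency>
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-math3</artifactId>
  <!-- Assumed version, not taken from the branch -->
  <version>3.6.1</version>
</dependency>
```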

What do you guys think? Feasible approach?

frademacher commented 2 years ago

Thanks for the suggestion and its elaborate explanation!

I agree with the suggested approach and would like to see it for the OpenAPI module as sort of a "testbed" for all other LEMMA modules (like the Avro plugin). Indeed, we should refrain from blowing up the repository even more, and I don't think that it is a huge problem that there might be errors in the Eclipse workspaces of new people. In fact, we should just document that they have to invoke an initial LEMMA build first to set up their development hardware. So no objections concerning this issue from my side. Even more, it would give us a good reason to finally remove generated code (like Java files originating from Xtend files) from the repository.

I just have a few things to consider and maybe you can sort them out when integrating the OpenAPI module with the maven-dependency-plugin:

josor001 commented 2 years ago

The first build for the commit failed because I shortened the build stages (wanted quick results, skipped everything for actual publishing) and forgot to get rid of some parts of the updatesite, i.e., the failure was not related to the maven-dependency-plugin . Regarding the lib folder: in the Avro plugin we had the lib folder because we needed/wanted to commit all the libs. If we aim to NOT upload those jar files, we need to either add the lib folder to the gitignore file or put the lib folder in the build folder (target/lib), right?

frademacher commented 2 years ago

Concerning the failed build: I see, that makes sense. And I guess we will see possible build failures related to the dependency plugin (or the lack of them; in general, I would expect them to not occur) as you push the OpenAPI module branch.

Concerning the lib folder: As far as I understood we need to make sure that the folder either exists in the repository (and is empty, i.e., it should contain only a .gitkeep file and .gitignore should be taught to ignore *.jar files in lib folders) or that the dependency plugin creates it automatically when it does not exist. In any case, Eclipse build.properties files must point to the dependencies in the lib folder, even if they don't exist in the repository. If we download dependencies to Maven's target folder, a Maven clean may accidentally delete them. Moreover, the CI pipeline must be able to find them (which should be guaranteed as you wrote that the dependency plugin runs in Maven's initialize phase).

Additional remarks that just came to my mind:

josor001 commented 2 years ago

I like the idea of an external-lib folder. As far as I understand, the maven-dependency-plugin has various configuration options. There are options to address transitive dependencies, to exclude group IDs, and also for scopes. Therefore, I think it should be possible to come up with an efficient way to manage dependencies. As you proposed, I will use the OpenAPI module as a prototype for that kind of management. Regarding build failures... I am certain you will see some more of them coming from the new OpenAPI branch in the near future ;)
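As a rough illustration of the options mentioned, the copy-dependencies goal accepts parameters such as excludeTransitive, excludeGroupIds, and includeScope. The group ID below is a placeholder, not a real LEMMA coordinate, and the combination shown is only one possible setup:

```xml
<!-- Fragment for the <configuration> of a copy-dependencies execution -->
<configuration>
  <!-- false (the default) also copies transitive dependencies -->
  <excludeTransitive>false</excludeTransitive>
  <!-- Comma-separated group IDs to skip; placeholder value for this sketch -->
  <excludeGroupIds>org.example.internal</excludeGroupIds>
  <!-- Copy only what is needed at runtime (runtime scope also covers compile) -->
  <includeScope>runtime</includeScope>
</configuration>
```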

I will report back when I have good or bad news 👍

josor001 commented 2 years ago

Took me a while to come back to this and start the development on the openapi transformation. Just wanted to keep you guys updated because I imagine that Maven dependency support in the build is desired by some of you. Turns out my initial approach using the maven-dependency-plugin was not feasible. It worked great for the small toy example that I provided. However, in reality, e.g., in the Avro or the upcoming OpenAPI transformation plugin projects, we do not use the <dependency> section in the pom file exclusively for external dependencies but also for internal ones, e.g., the Avro project depends on LEMMA's datadsl projects. Using the copy-dependencies goal resulted in a copy of ALL dependencies, i.e., the folder also contained another copy of all LEMMA libs as jar files. It is possible to include only explicitly mentioned artifacts; however, the plugin then does not copy their transitive dependencies.

Long story short, I had to look for another solution. Fortunately, I discovered another Maven plugin called maven-assembly-plugin, and after some configuration issues it does the trick :) My local build works just fine, and you guys can already check out the magic in the openapi-refactor branch (will push in a couple of minutes; doing another local build just to be sure). In summary, it works but ofc still feels like a workaround.

frademacher commented 2 years ago

Thanks a lot for further tinkering with this and it seems like a sensible solution to me. However, I propose to also remove the dependencies' JAR files from the repository when they are automatically retrievable by maven-assembly-plugin, and actually relevant only to the build and deployment processes of LEMMA's CI/CD chain.

frademacher commented 2 years ago

In fact, the whole lib folder should probably be added to .gitignore.

frademacher commented 1 year ago

I guess this issue was solved by the OpenAPI-to-LEMMA transformation that just landed in main (cf. PR https://github.com/SeelabFhdo/lemma/pull/65). Would be nice @josor001 if you could briefly elaborate how you finally tackled the download of library deps before closing this issue.

josor001 commented 10 months ago

Sorry everybody, time flew by and I totally forgot to actually close this issue. Just a quick summary on how the OpenAPI build now actually works.

The problem that started all of this was resolving Maven-defined dependencies (pom file) within the manifest-first Eclipse build that we use with LEMMA. Many dependencies that are available through Maven are not, and were not, available as plug-ins in the Eclipse cosmos. The "traditional" way of solving this would be to just provide the library in the lib folder and actually add the jars to the project, as it is done within the LEMMA Avro plug-in (see here). While this is a totally fine and very robust solution, we have to check in every dependency, which blows up the size of LEMMA, and updating the libs also becomes a hassle. This becomes especially bothersome when plug-ins such as the OpenAPI Transformation require a lot of external libraries, e.g., to load and parse OpenAPI documents.

The solution I came up with uses the maven-assembly-plugin. What I did was basically define the dependencies in the pom file as you would normally do with Maven-first builds (see here). I then configured the assembly plug-in to hook into the Maven validate phase (see here) and to download and copy into a lib folder every dependency named in an additionally created assembly/assembly.xml file. The complete lib folder is added to the classpath. In conclusion, Maven dependencies need to be added to the pom file as well as to the assembly file. They are automatically downloaded and added to the lib folder during the Maven validate phase. The content of the lib folder is added to .gitignore, i.e., we do not check in the used libraries but download them again in every validate phase.
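The following is a hedged sketch of how such a setup could be wired, not a copy of the actual LEMMA configuration. The execution id, the use of the dir format, and the commons-math3 include are assumptions for illustration; only the validate phase, the lib folder, and the assembly/assembly.xml path come from the description above. First, a possible plugin binding in the pom file:

```xml
<!-- Fragment for the <build><plugins> section of the plug-in's pom.xml -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <executions>
    <execution>
      <!-- Hypothetical id, chosen only for this sketch -->
      <id>fetch-libs</id>
      <phase>validate</phase>
      <goals>
        <goal>single</goal>
      </goals>
      <configuration>
        <descriptors>
          <descriptor>assembly/assembly.xml</descriptor>
        </descriptors>
        <!-- With the dir format, the result lands in ${project.basedir}/lib -->
        <outputDirectory>${project.basedir}</outputDirectory>
        <finalName>lib</finalName>
        <appendAssemblyId>false</appendAssemblyId>
        <!-- The copied jars are a build input, not an artifact to deploy -->
        <attach>false</attach>
      </configuration>
    </execution>
  </executions>
</plugin>
```

A matching descriptor could then name the external libraries explicitly, so that internal LEMMA dependencies from the same pom file are left out:

```xml
<!-- assembly/assembly.xml -->
<assembly xmlns="http://maven.apache.org/ASSEMBLY/2.1.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/ASSEMBLY/2.1.0 http://maven.apache.org/xsd/assembly-2.1.0.xsd">
  <id>lib</id>
  <formats>
    <!-- Write an exploded directory instead of an archive -->
    <format>dir</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <!-- Do not copy the plug-in's own artifact -->
      <useProjectArtifact>false</useProjectArtifact>
      <!-- Apply the include patterns along the transitive path as well -->
      <useTransitiveFiltering>true</useTransitiveFiltering>
      <includes>
        <!-- Stand-in example; list each external library as groupId:artifactId -->
        <include>org.apache.commons:commons-math3</include>
      </includes>
    </dependencySet>
  </dependencySets>
</assembly>
```

The explicit include list is why each library ends up being named twice, once in the pom file and once in the descriptor.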

It works and does not blow up the size of the git-managed codebase; however, it is anything but pretty. According to my research, there might be a better solution using Tycho 2.7.0. As of 2.7.0, Tycho officially supports mixed builds, and they tuned the pom-dependencies-consider option. However, I currently do not have the time to investigate this further, and we would need to lift every LEMMA Eclipse plug-in from the currently used Tycho 2.3.0 to 2.7.0.
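For anyone who wants to pick this up, the option referred to is set on Tycho's target-platform-configuration plugin. This is an untested sketch that assumes a tycho.version property is defined in the parent pom; whether it covers plain (non-OSGi) jars in LEMMA's setup would need to be checked against the Tycho release notes:

```xml
<!-- Fragment for the <build><plugins> section of the parent pom.xml -->
<plugin>
  <groupId>org.eclipse.tycho</groupId>
  <artifactId>target-platform-configuration</artifactId>
  <!-- Assumed property, e.g. set to 2.7.0 -->
  <version>${tycho.version}</version>
  <configuration>
    <!-- Let Tycho take dependencies declared in the pom file into account -->
    <pomDependencies>consider</pomDependencies>
  </configuration>
</plugin>
```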