asciidoctor / asciidoctor-diagram

:left_right_arrow: Asciidoctor diagram extension, with support for AsciiToSVG, BlockDiag (BlockDiag, SeqDiag, ActDiag, NwDiag), Ditaa, Erd, GraphViz, Mermaid, Msc, PlantUML, Shaape, SvgBob, Syntrax, UMLet, Vega, Vega-Lite and WaveDrom.
http://asciidoctor.org
MIT License
431 stars 106 forks source link

Unbundle dependencies #67

Open pepijnve opened 9 years ago

pepijnve commented 9 years ago

Currently plantuml and ditaa are packaged inside the gem, but shaape, blockdiag, etc are not. I think we should consider taking plantuml and ditaa out of the gem and treat them like the other external tool dependencies. As a convenience a small bit of code could be added to asciidoctor-diagram to download the jars from a maven repo. Additionally new document attributes could be introduced that allow the default version numbers to be overridden which allows users to use more recent versions of plantuml. There are two benefits to this change:

pepijnve commented 9 years ago

@mojavelinux would love to hear your opinion on this...

mojavelinux commented 9 years ago

I like the idea of unbundling. I also like the idea of an install subroutine inside the processor.

We could also consider the possibility of creating a wrapper RubyGem that handles the installation, much like the kindlegen RubyGem works (it uses the extconf.rb script to download). See https://github.com/tdtds/kindlegen/blob/master/ext/kindlegen/extconf.rb This would make it useful for anyone using plantuml from Ruby, not just the Asciidoctor Diagram project.

pepijnve commented 9 years ago

The direction I was thinking about was to use each dependencies native package manager to download. So for plantuml and ditaa that would be maven and we download to ~/.m2, mermaid (when support for that lands) would be installed via npm, blockdiag and co and shaape via pip (or easy_install). That way we can pick up those dependencies no matter how they're installed.

The question that remains is if we want to let asciidoctor-diagram do this for users, or require that the dependencies are installed externally. Not sure how ruby users feel towards being forced to install maven just to do download a jar file. For maven specifically, it's also a bit cumbersome to get it to download an artifact so that could be simplified a bit.

mojavelinux commented 9 years ago

Sounds reasonable. For getting Java dependencies, I recommend using Gradle over Maven. Gradle can be embedded using the Gradle Wrapper and can be used as an external tool in order to download a dependency much easier.

I don't think users care so much as long as the instructions are clear. People are willing to go to great length with clear instructions ;) It's the unknown that is scary.

mojavelinux commented 7 years ago

I wonder if on the Ruby side we could redistribute plantuml and ditaa using gems. That way, it's still possible to gem install all the necessary components. Those gems could use the extconf.rb approach I described above so that the binaries are downloaded at the time of gem installation. The kindlegen gem does exactly that.

hsanson commented 7 years ago

I think this gem should only provide mechanisms to automatically detect the availability of the external tools it uses and enable/disable the generation of the different diagram types based on such availability. There are various reasons for this:

  1. The external tools already have their own mechanisms for installation and configuration. We only need to provide additional documentation with details on how to install the tools on popular distributions (e.g. Ubuntu, Debian, Fedora. CentOS, etc) and operating systems (MacOS, Windows).

  2. Normally users of this gem only want support for one or two of the diagram types. Currently this gem forces plantuml and ditaa on them even if they do not use such diagrams.

  3. Other projects are not willing to accept a gem that installs a 5MB jar file as dependency making it difficult to get pull requests that contain this gem accepted (See gitlab for an example).

  4. Providing special gems to install plantuml and ditaa it is possible that users will demand/request that gems for all the other tools be created too. Why only plantuml and ditaa should get special treatment?

mojavelinux commented 7 years ago

Yes, I'm essentially suggesting that such installation gems would be provided for any open source diagramming tool we integrate with. (I'd rather not support proprietary tools, though it should be possible to extend Asciidoctor Diagram to add that support).

You're really going to have two camps of people. One camp doesn't want bundled dependencies and instead want to (or must) install the libraries on their own. The other camp won't use the tool unless the dependencies are handled automatically. Neither camp is more right than the other. It's just a different perspective / different needs. I think it's reasonable to accommodate both camps.

(Whatever we do, we need to make sure it plays well with AsciidoctorJ).

I think this gem should only provide mechanisms to automatically detect the availability of the external tools it uses and enable/disable the generation of the different diagram types based on such availability.

That certainly seems reasonable to me.

pepijnve commented 7 years ago

Looks like a good start. There are a couple of things to be careful about though.

hsanson commented 7 years ago

Is there any reason why these two tools, ditaa and plantuml, have to be called directly from Java? Ditaa has a command line tool and can be invoked the same way the other tools are and plantuml can use an external service to generate the diagrams.

Actually the idea of including plantuml jar into gollum was rejected by the gollum authors but they had no issues with an external service (See this PR).

Modifying this gem so ditaa uses external command line tool and plantuml uses external web service will address most, if not all, the points raised above:

mojavelinux commented 7 years ago

Using a commandline tool in a library is generally considered a bad practice. Using a formal process-to-process communication is safer, more extensible, and more professional.

mojavelinux commented 7 years ago

I'm really concerned that you are only considering the *nix use case. We have to think about the Java consumer (AsciidoctorJ) and the Windows user. Using commandline tools is very unfriendly for those environments. We must remain as agnostic as possible, esp since a large segment of our users are using Java on Windows.

mojavelinux commented 7 years ago

Having said that, if you want to create an Asciidoctor Diagram extension (separate gem) that uses only commandline tools, I certainly encourage you to create that extension. I have no doubt that there is value to that approach for an audience that uses a managed *nix environment.

pepijnve commented 7 years ago

@hsanson there's no technical reason why you couldn't make things work the way you're describing, and it's definitely a useful alternative. It's just not the use case I designed the extension for at the time. I needed something that could run standalone on a CI server without having to talk to external services.

The reason for the special construct for Java based tools is that the JVM takes a prohibitively long time to start up. Spawning a new JVM instance per diagram was not practically usable for documents with many diagrams. Initially I was using RJB to interact with a JVM instance, but that proved to be very difficult to use on Windows and macOS. In particular macOS had issues with the JVM launching API where it would always complain about Java 6 not being installed. To simplify all of this I made an extremely minimal HTTP-based diagram conversion daemon that gets launched using the JRE's CLI.

For this issue specifically (and the corresponding PR) I would suggest that we stick to figuring out how to find (and maybe install) the JAR files externally and keep everything else identical. Using Ditaa's CLI interface or an external PlantUML service is something that we can tackle in a separate PR.

jackorp commented 5 years ago

Ping.. Are we any closer to resolving this? I am trying to package the asciidoctor-diagram into Fedora but the bundled jars are making things more difficult.

mojavelinux commented 5 years ago

One possible solution to this is to continue bundling the jars in the gem, but allow the code to work if they are provided externally. That way, the rpm package can ship without the jars and the code will gracefully fallback to looking on the system for them somewhere. Perhaps an environment variable could be involved.

jackorp commented 5 years ago

I like that solution, it will at least be a step in the right direction IMHO.. Environment variable like ENV['RPM'] could be involved What I thought about doing is link required packages for graph creation (ditaa, ...) into the lib. More trouble will be with packages that were created for asciidoctor-diagram e.g. server.jar which do not have rpm counterpart AFAIK.

pepijnve commented 5 years ago

More trouble will be with packages that were created for asciidoctor-diagram e.g. server.jar which do not have rpm counterpart AFAIK.

Those are an intrinsic part of asciidoctor-diagram. Why would you want to split those off?

mojavelinux commented 5 years ago

I think we would need an rpm package like asciidoctor-diagram-java or perhaps asciidoctor-diagram-server. Here's the repo: https://github.com/asciidoctor/asciidoctor-diagram-java

pepijnve commented 5 years ago

@mojavelinux Not sure what the point of that would be since they always evolve in lockstep. It would shrink the gem a teeny tiny bit, but you're trading a marginal size reduction for an increase in complexity. Doesn't feel like a worthwhile tradeoff to me. The server really isn't designed to be a standalone component at the moment; it's a tiny daemon that exists as a workaround for JVM startup performance. That being said, if there are strict Fedora/RedHat guidelines on RPM design that we need to stick to I'm happy to reconsider.

Environment variable like ENV['RPM'] could be involved

@jackorp I'm a bit reluctant to add RPM specific bits to the code; what about Windows, Debian, Arch, Gentoo, ...? I would prefer something platform agnostic like ENV['PLANTUML_HOME']. Would that work for you?

What I thought about doing is link required packages for graph creation (ditaa, ...) into the lib.

Ditaa's a tricky one. The bundled version of Ditaa is from https://github.com/pepijnve/ditaa. This started as a minimalist version of Ditaa, but at this point is probably better considered as a fork. Upstream Ditaa looks like it's being rewritten in Clojure, but after a short burst of activity seems to have stalled again. I might be able to make the Java glue code compatible with upstream Ditaa, but I'm not going to spend the effort to try and get my bugfixes upstreamed.

pepijnve commented 5 years ago

@mojavelinux thinking thinking thinking 😄 I'm a bit worried about the impact of these changes on existing deployments. We could deal with breaking changes through SemVer, but personally I don't think that's very kind to existing users.

What we could do is the following gem split/reorganisation:

The only thing I'm not entirely sure about yet is how we deal with CLI/server vs bundled for PlantUML and Ditaa. Separate diagram type? Or make one override the other? I.e., do conditional loading where we try to load the 'bundled' implementation and if that fails fall back to the CLI/server one?

jackorp commented 5 years ago

The packaging guidelines strictly forbid pre-built deps. For reference, see java packaging guidelines, there must be none pre-built *.class and *.jar included in package from not only legal reasons.

The package split looks interesting, if the core would be composed of ruby only and then use user provided extensions, that would be neat solution IMHO.

@pepijnve abouth the env, it was just an example I saw, what you're proposing sounds also good.

mojavelinux commented 5 years ago

My comment is going to take two paths.

First path: It's important to understand that the RPM isn't built from the published gem. Instead, it's built from source. Think of the RPM as a separate deployment of the gem using the same source. That's why I pointed to https://github.com/asciidoctor/asciidoctor-diagram-java. It's not that you'd use that independently. But often times the packaging group will create an auxiliary RPM just for the purpose of unbundling. Then, the RPM wires everything together under the covers using patches to create the master package. We don't have to worry too much about those patches. While there are things we can do to make the packager's job easier, they can often work with the source the way it is.

Second path: The idea of separating the gem into individual pieces is certainly intriguing. I like how you're thinking about a gem that operates with no deps (though I'd propose a name like aciidoctor-diagram-lite for this instead of asciidoctor-diagram-core). This seems like a pretty natural evolution of Asciidoctor Diagram to avoid becoming too bulky. It also makes it even easier for the community to create gems to support other diagramming tools outside the official project.

For the plantuml and ditaa packages, I definitely think bundling is the right approach for the users of the gem (from rubygems.org). But I think the code should also allow this location to be overridden, either using an environment variable (that can be passed in), a constant (than can be patched), or a resolver (that can be implemented). That way, packagers have an exit hatch to sidestep using the bundled version (perhaps routing it to a location on the system instead).

jackorp commented 5 years ago

I personally like the second path described.

I would like to correct one thing though:

My comment is going to take two paths.

First path: It's important to understand that the RPM isn't built from the published gem. Instead, it's built from source. Think of the RPM as a separate deployment of the gem using the same source.

Rubygem RPMs are indeed built from gems published at rubygems.org (we even have a tool generating most of spec file from gem: gem2rpm).There may be adjustments to conform to the packaging guidelines, but the deviation from upstream should be is little as possible.

mojavelinux commented 5 years ago

Rubygem RPMs are indeed built from gems published at rubygems.org (we even have a tool generating most of spec file from gem: gem2rpm).There may be adjustments to conform to the packaging guidelines, but the deviation from upstream should be is little as possible.

This is a change from when I first made the RPM package for Asciidoctor. Though I still think the spirit of my point is accurate. While gem2rpm may grab the gem from rubygems.org, it's only doing so to get at the source code, which it then repackages. In other words, the RPM doesn't bundle the gem as is. It extracts it and takes files out of it. And that is where patches can (and this case, must be) applied.

Regardless of where the ditaa integration gem ends up, that gem will contain the ditaa jar. But the RPM package will need to ignore that jar and persuade Asciidoctor Diagram (however we decide) to use a jar file from the system (installed by another package). The RPM should not be obligated to use the jar file from the gem (which is intended for a different audience).

jackorp commented 5 years ago

Yes, the gem gets pulled in as both archive that has the code and also, the gemspec has most of information necessary for stuff like "Authors" or "License".

As I mentioned, current state is not ideal but also not impossible to package. It would be easier with suggestions you made e.g. a constant.

Z13NDELS commented 2 years ago

Is there any news on that point ? Can I help in any way ? I'm currently having troubles because the embedded version of PlantUML is aging (see my question on StackOverflow). It would be cool to be able to feed asciidoctor-diagram with a PlantUML JAR using something other than environment, which don't work in asciidoctor-maven-plugin (but I understand it's a side-issue to this one)