vliejo / gitbook-plugin-local-plantuml

Gitbook plugin that renders plantuml images locally
Apache License 2.0

PlantUML is HTML escaped/encoded, so many UML diagrams fail to render #4

Open dannwebster opened 8 years ago

dannwebster commented 8 years ago

Issue Description

The text in a PlantUML "block" seems to be HTML escaped/encoded, so when the plugin extracts the text from the block, what it passes on to PlantUML differs from what was originally written.

It looks like the Gitbook framework automatically encodes the "block" text. This may not have happened in an earlier version of gitbook.

Steps to Reproduce

$ gitbook version
CLI version: 2.3.0
GitBook version: 3.2.0

If I write this in my .md document:

{% plantuml %} 
Class MyStage
Class Timeout {
    +constructor:function(cfg)
        +timeout:function(ctx)
        +overdue:function(ctx)
        +stage: Sage 
        +other_thing: Other Thing
}
Stage <|-- Timespent
{% endplantuml %}

This plugin tries to render this text:

Class MyStage
Class Timeout {
    +constructor:function\(cfg\)
        +timeout:function\(ctx\)
        +overdue:function\(ctx\)
        +stage: Sage 
        +other\_thing: Other Thing
}
Stage &lt;\|-- Timespent

That then generates this file: [image: bad-puml-render]

The cause of the issue is that, for this text:

    +constructor:function(cfg)

the block object encodes the text (escaping the parentheses, for example) and gives plantuml.jar this:

    +constructor:function\(cfg\)

Similarly, in Stage <|-- Timespent the < character is HTML-escaped to &lt; and the | is backslash-escaped, so it becomes Stage &lt;\|-- Timespent

Details

Here is the problematic section of code (from gitbook-plugin-local-plantuml/index.js, lines 17-25):

module.exports = {
  blocks: {
    plantuml: {
      process: function (block) {

        var imageName = hashedImageName(block.body) + ".png";
        this.log.debug("using tempDir ", os.tmpdir());
        var imagePath = path.join(os.tmpdir(), imageName);
        var umlText = block.body;

The line

var umlText = block.body;

gets text that is already escaped. As far as I can tell, the block object does not contain any reference to the un-encoded text. It also does not contain any line numbers, or any other way to determine the text that generated the initial block.

The alternate Gitbook PlantUML plugin uses regexes to extract the text, which seems cumbersome, but might be the only way to get ahold of the unchanged text.
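Another option might be to reverse the escaping before handing the text to PlantUML. The following is only a sketch under stated assumptions: unescapeBlockBody is a hypothetical helper, not part of the plugin or of Gitbook, and it assumes the only changes made to block.body are a backslash in front of Markdown punctuation and a handful of HTML entities, which is what the example above shows. It is not confirmed that this covers everything Gitbook does.

```javascript
// Hypothetical helper, NOT part of gitbook-plugin-local-plantuml or Gitbook.
// Assumption: block.body differs from the source text only by
//   1. a backslash inserted before Markdown punctuation, e.g. "\(" or "\_"
//   2. the HTML entities listed below
var HTML_ENTITIES = {
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": "\"",
  "&#39;": "'",
  "&amp;": "&"
};

function unescapeBlockBody(text) {
  return text
    // Drop the backslash placed before punctuation: "\(" -> "(", "\|" -> "|".
    .replace(/\\([\\`*_{}\[\]()#+\-.!|<>])/g, "$1")
    // Decode the known entities in one pass, so "&amp;lt;" yields "&lt;" rather than "<".
    .replace(/&(?:lt|gt|quot|#39|amp);/g, function (entity) {
      return HTML_ENTITIES[entity];
    });
}

// The escaped lines from this issue, as they would arrive in block.body.
var escaped = [
  "    +constructor:function\\(cfg\\)",
  "        +other\\_thing: Other Thing",
  "Stage &lt;\\|-- Timespent"
].join("\n");

console.log(unescapeBlockBody(escaped));
```

If the assumptions hold, the process function could call var umlText = unescapeBlockBody(block.body); instead of using block.body directly. One caveat: the backslash step is lossy if the original diagram legitimately contained a backslash in front of one of those punctuation characters, so this would need checking against real diagrams.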

dannwebster commented 8 years ago

I was working on this issue in my fork, but I have major problems getting the tests to run. I had to set this.timeout(20000) to get them to run, and even then they take forever. Suggestions?

bespechnost commented 8 years ago

I fixed this problem on my project https://github.com/vliejo/gitbook-plugin-local-plantuml/pull/5