asciidoctor / asciidoctor-diagram

Asciidoctor diagram extension, with support for AsciiToSVG, BlockDiag (BlockDiag, SeqDiag, ActDiag, NwDiag), Ditaa, Erd, GraphViz, Mermaid, Msc, PlantUML, Shaape, SvgBob, Syntrax, UMLet, Vega, Vega-Lite and WaveDrom.
http://asciidoctor.org
MIT License

Cannot import puml files into adoc files when puml files include multi byte characters #428

Closed avenue68 closed 5 months ago

avenue68 commented 1 year ago

I'm using the Asciidoctor Gradle plugin and trying to import .puml files into a .adoc file. As the title says, the .puml files are not imported correctly when they contain multi-byte characters. PlantUML diagrams that contain the same multi-byte characters render properly when they are written directly in the .adoc file.

// this doesn't have any problems.
[plantuml]
....
:あ: --> (い)
....


- The exposed error message

unable to render AsciiDoc document

org.asciidoctor.jruby.internal.AsciidoctorCoreException: org.jruby.exceptions.EncodingError$InvalidByteSequenceError: (InvalidByteSequenceError) asciidoctor: FAILED: C:\path\to.adoc: Failed to load AsciiDoc document - "\x82" followed by ":" on Windows-31J


- build.gradle
```groovy
plugins {
  id 'org.asciidoctor.jvm.convert' version '4.0.0-alpha.1'
}

repositories {
  mavenCentral()
}

asciidoctor {
  baseDir 'src/docs/asciidoc'
  sources 'use_cases.adoc'
}

asciidoctorj {
  modules {
    diagram.use()
    diagram.version "2.2.10"
  }
}
```

Any ideas?

pepijnve commented 1 year ago

When the diagram code is embedded in the adoc file, it's read by Asciidoctor itself, which always assumes UTF-8 encoding if I recall correctly. When using the block macro syntax, the diagram code is read by calling File.readlines. MRI has been UTF-8 by default since 2.0. JRuby seems to be inheriting the default behaviour from the JDK, which was platform dependent before Java 18. Either that, or Gradle is setting it. The external file seems to be incorrectly read using the Windows-31J encoding, and that's causing the error.

I'll change the extension to explicitly use UTF-8 when reading external files so that you get the same behaviour in both cases. As a workaround, you could try adding -Dfile.encoding=UTF-8 to your gradle invocation and see if that resolves the issue.
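For readers who want to see the difference between the two reads, here is a minimal Ruby sketch. It is not the extension's actual code: `read_diagram_lines` and `all_valid?` are hypothetical helpers, and passing `encoding: 'Windows-31J'` only simulates a platform whose default charset is Windows-31J.

```ruby
require 'tmpdir'

# Hypothetical helper, not the extension's real code: read an external
# diagram file with an explicit encoding instead of the platform default.
def read_diagram_lines(path, encoding: 'UTF-8')
  File.readlines(path, encoding: encoding)
end

# True when every line is a valid byte sequence in the encoding it is tagged with.
def all_valid?(lines)
  lines.all?(&:valid_encoding?)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'use_case.puml')
  # The diagram line from the report, written as raw UTF-8 bytes
  # (\u3042 and \u3044 are the two hiragana characters).
  File.binwrite(path, ":\u3042: --> (\u3044)\n")

  # Assumption: this stands in for a JVM whose default charset is Windows-31J.
  misread = read_diagram_lines(path, encoding: 'Windows-31J')
  puts "Windows-31J valid? #{all_valid?(misread)}"

  # An explicit UTF-8 read behaves the same on every platform.
  correct = read_diagram_lines(path)
  puts "UTF-8 valid?       #{all_valid?(correct)}"
end
```

On a stock MRI install this should report the Windows-31J read as invalid and the UTF-8 read as valid. The UTF-8 bytes for the first hiragana character end in 0x82, which Windows-31J treats as a lead byte, and the ":" that follows is not a legal trail byte. That matches the "\x82" followed by ":" in the reported error.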

avenue68 commented 1 year ago

Thanks for your explanation! Adding -Dfile.encoding=UTF-8 resolved the issue!