docToolchain / docToolchain

An AsciiDoc Toolchain for technical Software Documentation, focused on Software Architecture Documentation
https://doctoolchain.github.io/docToolchain/
MIT License

generateHTML does not copy images directory #506

Closed softmetz closed 1 year ago

softmetz commented 3 years ago

The generateHTML task does not copy the contents of the taskInputsDirs, so images are missing in the output.

Adding :data-uri: in config.adoc has no effect either.

In contrast, PDF files created with generatePDF contain all images. The same is true for generateHTML inside a fresh checkout of docToolchain, where the images are copied below build.

To Reproduce

Steps to reproduce the behavior:

  1. Add docToolchain as a submodule, as described in https://docs-as-co.de/getstarted/tutorial2
  2. Extract the ZIP of arc42 EN into src/arc42
  3. Adapt Config.groovy accordingly
  4. Run ./gradlew generateHTML
  5. Look for the images folder in build/docs/html5 -> it is missing

Expected behavior

The folder images exists below build/docs/html5

Screenshots

build
└── docs
    ├── html5
    │   └── arc42
    │       └── omkring-architecture-description.html
    └── pdf
        └── arc42
            └── omkring-architecture-description.pdf
src/
├── arc42
│   ├── images
│   │   ├── arc42-logo.png
│   │   ├── arc42-logo.png.license
│   │   ├── omkring-logo.svg
│   │   └── omkring-logo.svg.license
│   ├── omkring-architecture-description.adoc
│   └── src
│       ├── 01_introduction_and_goals.adoc
│       ├── 02_architecture_constraints.adoc
│       ├── 03_system_scope_and_context.adoc
│       ├── 04_solution_strategy.adoc
│       ├── 05_building_block_view.adoc
│       ├── 06_runtime_view.adoc
│       ├── 07_deployment_view.adoc
│       ├── 08_concepts.adoc
│       ├── 09_design_decisions.adoc
│       ├── 10_quality_scenarios.adoc
│       ├── 11_technical_risks.adoc
│       ├── 12_glossary.adoc
│       ├── about-arc42.adoc
│       └── config.adoc
└── pdfTheme
    └── custom-theme.yml

Config.groovy:

outputPath = 'build/docs'

// Path where the docToolchain will search for the input files.
// This path is appended to the docDir property specified in gradle.properties
// or in the command line, and therefore must be relative to it.
inputPath = 'src'

inputFiles = [
              [file: 'arc42/omkring-architecture-description.adoc',            formats: ['html','pdf']],
             ]

taskInputsDirs = ["${inputPath}/arc42/images"]

taskInputsFiles = []

build.gradle:

//configure docToolchain to use the main project's config
project('docToolchain') {                                   
    if (project.hasProperty('docDir')) {                    
        docDir = '../.'                                     
    } else {
        println "="*80                                      
        println "  please initialize the docToolchain submodule"
        println "  by executing git submodule update -i"
        println "="*80
    }
}

softmetz commented 3 years ago

I found out that the problem lies in the entry file: 'arc42/omkring-architecture-description.adoc' in Config.groovy.

When I change the file to

outputPath = 'build/docs'

// Path where the docToolchain will search for the input files.
// This path is appended to the docDir property specified in gradle.properties
// or in the command line, and therefore must be relative to it.
inputPath = 'src/arc42'

inputFiles = [
              [file: 'omkring-architecture-description.adoc',            formats: ['html','pdf']],
             ]

taskInputsDirs = ["${inputPath}/images/"]

taskInputsFiles = []

it works as expected. So it fails if the file entry in inputFiles contains a directory.

rdmueller commented 3 years ago

That is a good issue, which I have to investigate further in the next few days. In my own projects, I often use a copyImages task as a workaround, but we should try to come up with a real solution.

If you only have one file to convert, your solution might work, but I guess it fails as soon as you have several docs in different folders and thus different :imagesdir: values.
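The copyImages workaround mentioned above is not shown in this thread. A minimal sketch of what such a task could look like in the main project's build.gradle, using Gradle's standard Copy task type, is given below. The task name copyArc42Images and both paths are assumptions taken from the reporter's directory layout, not docToolchain's own task; the target has to match whatever :imagesdir: resolves to in the generated HTML.

```groovy
// build.gradle of the main project (sketch, not docToolchain's own copyImages task)
// 'copyArc42Images' is a hypothetical task name; both paths are assumptions
tasks.register('copyArc42Images', Copy) {
    // source: the image folder next to the arc42 document
    from 'src/arc42/images'
    // target: the folder the reporter expected to find below the HTML output
    into 'build/docs/html5/images'
}
```

Run it after generateHTML so that the copied folder ends up next to the freshly generated output.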

softmetz commented 3 years ago

Thanks for looking into this. copyImages is a nice hint, I must have overlooked it. Indeed I plan to have multiple documents, so eventually this will become a show stopper.

If I can provide you with any support, please let me know.

Hellmy commented 3 years ago

Couldn't you just add

    resources {
        from(sourceDir) {
            include '**/images/**'
        }
    }

to the generateHTML task?

It would also be interesting to make this include configurable, as attachments could also live in a subdirectory.

Hellmy commented 2 years ago

I just looked at the current ng implementation. The image copy task has changed there, as we now have imageDirs.

Wouldn't it be possible to have a resourceDirs config where I can fully define my source and target directories? Config:

[..]
resourceDirs = [
    [source: 'some/images', target: 'other/images']
    /** resourceDirs **/
]
[..]

adapted generateHTML Task:

[..]
    resources {
        config.imageDirs.each { imageDir ->
            from(new File(file(srcDir),imageDir))
            logger.info ('imageDir: '+imageDir)
            into './images'
        }
        config.resourceDirs.each { resource ->
            from(new File(file(srcDir),resource.source))
            logger.info ('resource: '+resource.source)
            into resource.target
        }
    }
[..]

With this I can fully define the input/output structure myself and can now copy attachments as well.

I could prepare the pull request if this solution is acceptable.

rdmueller commented 2 years ago

First, I have to say that there is a bug in copyImages for generateSite:

https://github.com/docToolchain/docToolchain/blob/6b8cfaa24e8f0c5690596342eceee481c58bf164/scripts/generateSite.gradle#L331

It should copy to build/microsite/*output*/images

This will be fixed in the next release.

Regarding your suggested change: looks good to me. Would you be so kind and also update the docs?

arcusbude commented 2 years ago

Hi everyone, after git bisecting for a while I found that commit 33a73747eadbbf05f042835b07c29c35178e49bd breaks the copyImages task on my side. I don't know whether this issue is the reason. For now I have to go back to the commit before it to have a running system.

It would be great if you could tell me whether my Groovy file has an error (I don't have an imageDirs property, for example).

The Groovy file I use:

// Path where the docToolchain will produce the output files.
// This path is appended to the docDir property specified in gradle.properties
// or in the command line, and therefore must be relative to it.
outputPath = 'build/all'

inputPath = 'src/main/asciidoc'

inputFiles = [
        [file: "it-all.adoc", formats: ['html','pdf','docbook','docx']],
             ]

imageDirs = [
        "${inputPath}/images/"
]
taskInputsDirs = ["${inputPath}/images/"]

taskInputsFiles = []

//Configuration for exportChangelog
exportChangelog = [:]
changelog.with {

    // Directory of which the exportChangelog task will export the changelog.
    // It should be relative to the docDir directory provided in the
    // gradle.properties file.
    dir = '.'

    // Command used to fetch the list of changes.
    // It should be a single command taking a directory as a parameter.
    // You cannot use multiple commands with pipe between.
    // This command will be executed in the directory specified by changelogDir
    // in the environment inherited from the parent process.
    // This command should produce asciidoc text directly. The exportChangelog
    // task does not do any post-processing
    // of the output of that command.
    //
    // See also https://git-scm.com/docs/pretty-formats
    cmd = 'git log --pretty=format:%x7c%x20%ad%x20%n%x7c%x20%an%x20%n%x7c%x20%s%x20%n --date=short'
}

//tag::htmlSanityCheckConfig[]
htmlSanityCheck.with {
    //sourceDir = "build/html5"
    //checkingResultsDir =
    checkerClasses = ["DuplicateIdChecker", "MissingImageFilesChecker"]
}
//end::htmlSanityCheckConfig[]

Hellmy commented 2 years ago

Did you try changing imageDirs to imageDirs = [ "images/" ]?

The current code already prepends the path out of the box.
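Applied to the config posted above, the suggestion would look roughly like the fragment below. This is a sketch that assumes imageDirs entries are resolved relative to inputPath, as the from(new File(file(srcDir), imageDir)) line in the adapted task earlier in this thread suggests. Only imageDirs is changed; taskInputsDirs is left as in the original config.

```groovy
// docToolchain config (sketch): same inputPath as in the config above
inputPath = 'src/main/asciidoc'

// assumption: entries are resolved relative to inputPath,
// so the "${inputPath}/" prefix is dropped here
imageDirs = [
        'images/'
]

// unchanged from the original config
taskInputsDirs = ["${inputPath}/images/"]
```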

rdmueller commented 2 years ago

@arcusbude - sorry, we are still improving things and sometimes we break stuff by doing so. These days you should use the docToolchain wrapper, which lets you pin the version you work with. This should keep things stable until you upgrade.

I guess @Hellmy is right that the ${inputPath} is not needed here. Append --info to your docToolchain call to see the resulting config parameters that are used. This often helps to find problems.

rdmueller commented 2 years ago

@softmetz the initial problem should be solved by now (wow, took quite some time )-: Can you confirm that we can close this issue now?

arcusbude commented 2 years ago

@rdmueller thanks for your answer. inputPath is still needed in my setup:

> Task :generateHTML FAILED
:generateHTML (Thread[Execution worker for ':' Thread 6,5,main]) completed. Took 0.05 secs.

FAILURE: Build failed with an exception.

* What went wrong:
A problem was found with the configuration of task ':generateHTML' (type 'AsciidoctorTask').
> Directory '/project/images' specified for property '$1' does not exist.

What is the wrapper? Is there some documentation about it?

I am using a Docker container to run docToolchain with all its dependencies as a normal user. The Dockerfile is generated from a template file every time I run it, so everyone in our company can run it.

FROM openjdk:14-jdk-alpine

ARG GRADLE_OPTS

ENV SDKMAN_DIR=/root/.sdkman

# see https://github.com/docker-library/openjdk/issues/73
ENV LC_CTYPE en_US.UTF-8

USER root
RUN echo "xxx:x:1000:1000::/home/xxx:/bin/ash">>/etc/passwd
RUN mkdir /home/xxx
RUN chown xxx /home/xxx
RUN chgrp abuild /home/xxx
USER root
RUN     echo "add needed tools" && \
    apk add --no-cache curl wget zip unzip git bash \
    git \
    graphviz \
    python \
    ruby \
    py-pygments \
    libc6-compat \
    ttf-dejavu && \
    gem install rdoc --no-document && \
    gem install pygments.rb

# Add pandoc
# https://github.com/advancedtelematic/dockerfiles/blob/master/doctools/Dockerfile
#RUN apk add --no-cache cmark --repository http://nl.alpinelinux.org/alpine/edge/testing && \
#    apk add --no-cache --allow-untrusted pandoc --repository https://conoria.gitlab.io/alpine-pandoc/

SHELL ["/bin/bash", "-c"]

# Run Gradle as the mapped user.
# This makes the cached files in /home/jenkinsuser/.gradle accessible when the image is started with -u 300:300

# Enter the company proxy here if needed
ENV GRADLE_OPTS=$GRADLE_OPTS

RUN echo "Install sdkman" &&\
    curl -s "https://get.sdkman.io" | bash && \
    chmod +x $HOME/.sdkman/bin/sdkman-init.sh && \
    $HOME/.sdkman/bin/sdkman-init.sh

RUN     echo "Install java, groovy" && \
    source $HOME/.sdkman/bin/sdkman-init.sh
#    sdk install groovy 2.5.5

USER xxx

RUN     cd /home/xxx && \
        pwd && \
        env | sort && \
        git clone https://github.com/docToolchain/docToolchain.git && \
        cd docToolchain && \
        pwd && \
# works git checkout -b stand_2fe3c6fb2d3230385070b830a5f91cc292436096 2fe3c6fb2d3230385070b830a5f91cc292436096 && \
# does not work    git checkout -b ng_e2c1d8f73561a1988495d5db16110f0fbf5fb4ee e2c1d8f73561a1988495d5db16110f0fbf5fb4ee && \
# does not work    git checkout -b stand_6b8cfaa24e8f0c5690596342eceee481c58bf164 6b8cfaa24e8f0c5690596342eceee481c58bf164 && \
# does not work    git checkout -b stand ac460da && \
# does not work    git checkout -b stand 70c07bd && \
# does not work    git checkout -b stand 842250b && \
# does not work    git checkout -b stand e92bb73 && \
# does not work    git checkout -b stand 33a7374 && \
# works !!!!     git checkout -b stand 75bf155 && \
# works !!!!     git checkout -b stand c7e3939 && \
# works !!!!             git checkout -b stand f876576 && \
#               git checkout -b letzter_funktionierener_stand 75bf155 && \
git log --oneline -n 20 && \
        git submodule update -i && \
        # remove .git folders
        rm -rf `find -type d -name .git` && \
                ./gradlew tasks && \
        PATH="/home/xxx/docToolchain/bin:${PATH}"

ENV PATH="/home/xxx/docToolchain/bin:${PATH}"

USER root

# Reinstall any system packages required for runtime of pandoc.
RUN apk --no-cache add \
gmp \
libffi \
lua5.3 \
lua5.3-lpeg

COPY --from=pandoc/core:2.9 \
/usr/bin/pandoc \
/usr/bin/pandoc-citeproc \
/usr/bin/

RUN mkdir /project

WORKDIR /project

VOLUME /project

ENTRYPOINT /bin/bash

As you can see, I am using the ng branch (default) inside the container and tried multiple commits until I found the one that breaks copying images for me ;)

With the current commit 31637a1b13eaab284a41b0c31ed06c842aaa1238 the bug is still there: images are not copied.

An extra question: is there a way to copy some binary files to the HTML build folder so that they can be referenced from the resulting HTML page? We use docToolchain to document our IT infrastructure, and I want to include a download link to the current self-signed CA cert in the documentation.

Thanks for your work; we are well on our way to documenting everything thanks to docToolchain. (First contact was 3 or 4 years ago at the Java User Group in Dresden ;))
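For the extra question about shipping binary files, the resourceDirs setting proposed earlier in this thread looks like a candidate. The fragment below is only a sketch: it assumes the docToolchain version in use already supports resourceDirs, and the folder name certs is hypothetical.

```groovy
// docToolchain config (sketch)
// 'certs' is a hypothetical folder below the input path that holds the CA cert
resourceDirs = [
        // source: folder to copy from, target: folder below the HTML output
        [source: 'certs', target: 'certs']
]
```

The document could then link to the copied file with a path relative to the generated HTML page.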

Hellmy commented 2 years ago

Did you really define '/project/images'? Shouldn't that be 'project/images', as you are working from the current directory?

outsideMyBox commented 1 year ago

Hi, I ran into a similar issue to @softmetz's when generating docbook and docx with a directory given in the inputFiles attribute. The images weren't copied to the build directory I wanted, and the docx reference file wasn't found. I solved the first issue by using resourceDirs instead of imageDirs:

inputFiles = [
        [file: "myRelativePath${fs}myFile.adoc", formats: ['html','pdf', 'docbook', 'docx']],
]
resourceDirs = [
        // double quotes are needed here: Groovy only interpolates ${fs} in double-quoted strings
        [source: "myRelativePath${fs}images", target: "myRelativePath${fs}images"]
]

I also patched pandoc.gradle, as the problem seems to occur when a relative path is used instead of a plain filename:

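    // requires java.nio.file.Paths to be imported at the top of the script
    def fs = File.separator
    sourceFilesDocx.each {
        // getParent() returns null for a plain file name; fall back to the current directory
        def parentPath = Paths.get(it.file).getParent()
        def sourceFileRelativePath = parentPath == null ? '.' : parentPath
        def adocSourceFileName = new File(it.file).getName()
        // escape the dot so that only a literal ".adoc" suffix is replaced
        def docBookSourceFileName = adocSourceFileName.replaceAll('\\.adoc$', '.xml')
        def docxTargetFileName = adocSourceFileName.replaceAll('\\.adoc$', '.docx')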
        def docxFilePath = "${targetDir}${fs}docx${fs}${sourceFileRelativePath}${fs}${docxTargetFileName}"
        def refDocFilePath = "${docDir}${fs}${referenceDocFile}"
        logger.info "docDir: ${docDir}"
        logger.info "targetDir: ${targetDir}"
        logger.info "sourceFileRelativePath: ${sourceFileRelativePath}"
        logger.info "docBookSourceFileName: ${docBookSourceFileName}"
        logger.info "docxFilePath: ${docxFilePath}"
        logger.info "refDocFilePath: ${refDocFilePath}"

        workingDir "$targetDir${fs}docbook${fs}${sourceFileRelativePath}"
        executable = "pandoc"
        if(referenceDocFile?.trim()) {
            args = ["-r", "docbook",
                    "-t", "docx",
                    "-o", docxFilePath,
                    "--reference-doc=${refDocFilePath}",
                    docBookSourceFileName]
        } else {
            args = ["-r", "docbook",
                    "-t", "docx",
                    "-o", docxFilePath,
                    docBookSourceFileName]
        }
    }

Note: I use dtcw.ps1 and at first was not sure whether using '/' as the file separator was the problem, hence the ${fs} :)

rdmueller commented 1 year ago

Thanks for this feedback. Using File.separator is definitely the better way!

Would you mind creating a pull request for your patch?

mh182 commented 1 year ago

Is this bug still present? Could we reproduce it somehow?

Otherwise we should close it.

mh182 commented 1 year ago

The original bug report is tested and fixed.

➜ DTC_VERSION=latestdev ./dtcw downloadTemplate
dtcw 0.50 - ##DTCW_GIT_HASH##
docToolchain latestdev - 04225a57
OS/arch: Linux/x86_64
Available docToolchain environments: local sdk docker
Environments with docToolchain [latestdev]: local docker
Using environment: local
Using Java 17.0.6 [/home/max/.local/share/sdkman/candidates/java/current/bin/java]
To honour the JVM settings for this build a single-use Daemon process will be forked. See https://docs.gradle.org/7.5.1/userguide/gradle_daemon.html#sec:disabling_the_daemon.
Daemon will be stopped at the end of the build 

> Configure project :

Config file '/home/max/work/dtc/test-dtcw-3/docToolchainConfig.groovy' does not exist' 
[ant:input] 
[ant:input] do you want me to create a default one for you? (y, n)
<<-------------> 0% CONFIGURING [7s]

> Task :downloadTemplate
Install arc42 documentation template.
For more information about arc42 see https://arc42.org
[ant:input] Which language do you want to install? (CZ, EN, DE, ES, IT, NL, UA)
<-------------> 0% EXECUTING [14s]
[ant:input] Do you want the template with or without help? (withhelp, plain)
<--<-<-------------> 0% EXECUTING [19s]
Download https://github.com/arc42/arc42-template/raw/master/dist/arc42-template-EN-withhelp-asciidoc.zip
arc42 template unpacked into /home/max/work/dtc/test-dtcw-3/src/docs/arc42
added template to docToolchainConfig.groovy
use 'generateHTML', 'generatePDF' or  'generateSite' to convert the template

BUILD SUCCESSFUL in 23s
1 actionable task: 1 executed

work/dtc/test-dtcw-3 took 23s 
➜ DTC_VERSION=latestdev ./dtcw generateHTML 
dtcw 0.50 - ##DTCW_GIT_HASH##
docToolchain latestdev - 04225a57
OS/arch: Linux/x86_64
Available docToolchain environments: local sdk docker
Environments with docToolchain [latestdev]: local docker
Using environment: local
Using Java 17.0.6 [/home/max/.local/share/sdkman/candidates/java/current/bin/java]
To honour the JVM settings for this build a single-use Daemon process will be forked. See https://docs.gradle.org/7.5.1/userguide/gradle_daemon.html#sec:disabling_the_daemon.
Daemon will be stopped at the end of the build 

> Task :generateHTML
Converting /home/max/work/dtc/test-dtcw-3/src/docs/arc42/arc42.adoc

BUILD SUCCESSFUL in 15s
1 actionable task: 1 executed

work/dtc/test-dtcw-3 took 15s 
➜ tree build/
build/
└── html5
    ├── arc42
    │   └── arc42.html
    └── images
        ├── 01_2_iso-25010-topics-EN.drawio.png
        ├── 05_building_blocks-EN.png
        ├── 08-Crosscutting-Concepts-Structure-EN.png
        └── arc42-logo.png