aldaris / docbook-linktester

Tests external links and olinks in DocBook documentation for validity
3 stars 1 forks source link

Add <directory> to limit what files are tested #6

Closed markcraig closed 9 years ago

markcraig commented 10 years ago

This patch adds an optional base <directory> setting, useful when you want to run docbook-linktester only on pre-processed files with concrete links (not links that include Maven properties for example), and only on DocBook XML sources (not site files for example).

At present it's possible to end up with many "false positives" that make it harder to see what's really important in the output. For an example, see https://builds.forgerock.org/view/Docs/job/OpenAM%20Java%20EE%20Agent%20Docs/ws/target/docbkx/linktester.err/*view*/.

aldaris commented 10 years ago

There is one thing I'm not too certain about, how does this really handle generated files now? To be more precise, should we have the directory setting multivalued (using directory=src/main/docbkx kind of prevents finding files under target/generated-(re)sources)? Is this instead more of an issue with wrong default values for includes and excludes?

markcraig commented 10 years ago

Is there a way to do this with <includes> and <excludes> settings. What is the right way to express, "All the index.xml files under the ${project.build.directory}/docbkx-generated/ directory?"

I guess I expect <includes> to be a union, rather than an intersection, so:

<includes>
    <include>${project.build.directory}/docbkx-generated</include>
    <include>**/index.xml</include> 
</includes>

Would mean "All the index.xml files, plus all files under ${project.build.directory}/docbkx-generated."

My expectation is that, after preprocessing, there will be multiple copies of the files that in the project directory that could match includes/excludes. At the very least, there will be the un-preprocessed sources and the preprocessed sources. But there will be only one set of documents that should be validated/link-checked/etc., and all those documents will be in some well-known directory, the one holding the documents that have been fully pre-processed. With <directory> and <includes> we can set both the location and the filter.

The way is maybe not the best way to do this. I just haven't thought of a better way, yet. If you can see a better way, then by all means, let's implement that instead.

aldaris commented 10 years ago

In which case I would suggest to separate the preprocessed docs files from the ones that doesn't require preprocessing. That way you could have something like:

<includes>
  <include>**/*.index.xml</include>
</includes>
<excludes>
  <exclude>src/main/docbkx-filterable</exclude>
</excludes>

My main concern with this solution is that it may require multiple plugin to get through the whole documentation, and then the olink matching may not work if for example preprocessed files are referenced from unprocessed files or vice versa.

markcraig commented 10 years ago

Okay, I'll try setting up the plugin to exclude all directories we know shouldn't be checked (src/main/docbkx, src/site, and whatever directories under the build.directory have not been pre-processed, if any).

markcraig commented 10 years ago

It looks like I can indeed use <excludes> for this, and have a simple patch in preparation for forgerock-doc-maven-plugin to handle the exclusions.

aldaris commented 10 years ago

Based on your comments I'm closing this PR, the solution for this issue is to ensure that filterable documentation files are excluded, and the filtered files should be included normally.

aldaris commented 10 years ago

I must agree with http://sources.forgerock.org/cru/CR-5248#c49909 . I believe the directory setting is the right way of dealing with absolute paths coming from Maven project properties (like ${project.build.directory}), and it certainly shouldn't be the plugin invoker's job to get rid of the absolute path in order to be able to use includes and excludes correctly. I'm a bit unsure yet though if this should be modified so more than one directory can be provided (similarly to ). What do you think, would that be useful for the docs?

markcraig commented 10 years ago

Looks like perhaps your message got truncated. What is the use case you had in mind? Were you going to write "similarly to <includes>?"

markcraig commented 10 years ago

Thinking about this, I wonder if we might have builds with multiple documentation sets at some point -- for example a project that integrates two products -- or a project with both Asciidoc and also DocBook sources that are kept separate somehow during the build process.

aldaris commented 10 years ago

So I've been thinking about something like:

<docSources>
 <docSource>
  <directory>${project.build.directory}</directory>
  <includes>
   <include>**/index.xml</include>
  </includes>
 </docSource>
 <docSource>
  ...
 </docSource>
</docSources>
markcraig commented 10 years ago

That looks good to me.