Introduce DITA validator step

IanMayo commented 1 year ago

Note: Today I ran this DITA validator plugin: https://jason-fox.github.io/dita-ot-plugins/validate.svrl/index.html

We played with it back in July. Here are the instructions to install/run it: https://github.com/DeepBlueCLtd/LegacyMan/issues/271#issuecomment-1655606339

But I didn't progress it any further, since the DITA-OT publish process seemed to identify any broken references. I now realise the DITA Validator does more checking. In fact, in its default mode it produces warnings for things that are acceptable in DITA but just aren't best practice.

Today the validator highlighted an instance of missing content - which could prove useful on our target dataset that is too large to inspect by eye.

We should probably run this command before we publish, and we should probably fail on any errors:

dita --format svrl --input target/dita/index.ditamap --args.validate.ignore.rules=href-not-lower-case,running-text-lorem-ipsum,id-not-lower-case,section-id-missing,fig-title-missing
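
As a rough sketch of how this could be wired into the build, the snippet below assembles the same command from a list of ignored rules and runs it before publishing. It assumes the pipeline is driven from Python; the names `IGNORED_RULES`, `build_validate_command` and `run_step` are made up for illustration, and whether `dita` returns a non-zero exit code when the validator reports errors is an assumption that would need checking.

```python
import subprocess
import sys

# Rules we currently choose to ignore (copied from the command above).
IGNORED_RULES = [
    "href-not-lower-case",
    "running-text-lorem-ipsum",
    "id-not-lower-case",
    "section-id-missing",
    "fig-title-missing",
]


def build_validate_command(ditamap, ignored_rules):
    """Return the validator command as an argument list (hypothetical helper)."""
    return [
        "dita",
        "--format", "svrl",
        "--input", ditamap,
        "--args.validate.ignore.rules=" + ",".join(ignored_rules),
    ]


def run_step(command):
    """Run a command, echo its output, and return its exit code."""
    result = subprocess.run(command, capture_output=True, text=True)
    sys.stdout.write(result.stdout)
    sys.stderr.write(result.stderr)
    return result.returncode
```

A caller could then do `code = run_step(build_validate_command("target/dita/index.ditamap", IGNORED_RULES))` and stop before publishing when `code` is non-zero, provided the exit-code assumption holds.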

robintw commented 1 year ago

Fixed in #450. Output now looks like this:

INFO:  Logging level set to INFO
INFO:  LegacyMan parser running, with these arguments: data ./target/html
INFO:  Done run 1
INFO:  Run 1 took 2.0 seconds
INFO:  Done run 2
INFO:  Run 2 took 1.6 seconds
INFO:  Running dita validation command - output below is errors/warnings directly from the dita command

[WARN]  []
  Line 5: topic[id="links_1"] - [topic-file-mismatch]
The value specified in id="links_1" does not match the file name: regions.dita. Make sure the ID value and the file name are the same.
[WARN]  []
  Line 4: topic[id="links_1"] - [topic-file-mismatch]
The value specified in id="links_1" does not match the file name: welcome.dita. Make sure the ID value and the file name are the same.

No Errors Found 2 Warnings
INFO:  Running dita publish command - output below is errors/warnings directly from the dita command
INFO:  Running DITA to HTML conversion took 2.2e+01 seconds
INFO:  Timings:
INFO:  Run 1: 2.0 seconds
INFO:  Run 2: 1.6 seconds
INFO:  DITA conversion: 2.2e+01 seconds
INFO:  Total: 2.5e+01 seconds
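
For a dataset too large to inspect by eye, the validator output above could be turned into structured records and grouped by rule. The following is a minimal sketch that assumes every finding follows the three-line shape seen in this log (a level line, a `Line N: element - [rule]` line, then the message). `Finding` and `parse_findings` are hypothetical names, and how error-level findings are labelled is not shown in this thread, so the level is matched generically.

```python
import re
from typing import NamedTuple


class Finding(NamedTuple):
    level: str    # e.g. "WARN"
    line: int     # line number reported by the validator
    element: str  # e.g. 'topic[id="links_1"]'
    rule: str     # e.g. "topic-file-mismatch"
    message: str


# Level line such as "[WARN]  []"; any bracketed word is accepted.
_HEADER = re.compile(r"^\[(\w+)\]")
# Location line such as '  Line 5: topic[id="links_1"] - [topic-file-mismatch]'.
_LOCATION = re.compile(r"^\s*Line (\d+): (.+) - \[([\w-]+)\]\s*$")


def parse_findings(output):
    """Extract findings from validator output laid out as in the log above."""
    lines = output.splitlines()
    findings = []
    for i in range(len(lines) - 2):
        header = _HEADER.match(lines[i])
        location = _LOCATION.match(lines[i + 1])
        if header and location:
            findings.append(Finding(
                level=header.group(1),
                line=int(location.group(1)),
                element=location.group(2),
                rule=location.group(3),
                message=lines[i + 2].strip(),
            ))
    return findings
```

Counting the `rule` field across the returned findings would show which checks fire most often and so which ones are worth fixing or adding to the ignore list.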

IanMayo commented 1 year ago

Thanks @robintw - do we fail if the validator throws an error?

robintw commented 1 year ago

Ah sorry, I meant to mention that. We don't currently. In my opinion it's useful to automatically get the errors from both the validator and the publish command. If we fail hard on a validator error, there's no easy way to run the process to completion without fixing every error, which we may not want (or be able) to do in development.

Does that make sense?
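
One way to keep both behaviours available is an opt-in strict mode: run every step by default, and stop at the first failure only when asked to. This is a sketch of that idea, not what #450 implements. `run_pipeline` and its `strict` flag are hypothetical, and it assumes each step signals failure through a non-zero exit code.

```python
import subprocess


def run_pipeline(steps, strict=False):
    """Run (name, command) steps in order and return {name: exit_code}.

    With strict=False every step runs, so validator and publish output
    are both produced. With strict=True the first failing step stops the run.
    """
    results = {}
    for name, command in steps:
        code = subprocess.run(command).returncode
        results[name] = code
        if strict and code != 0:
            break
    return results
```

In development the default keeps the run going to completion, while a release build could pass `strict=True` to refuse to publish when validation fails.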

IanMayo commented 1 year ago

That's completely fine :-D

DeepBlueCLtd / LegacyMan

Introduce DITA validator step #454