dotnet / docfx

Static site generator for .NET API documentation.
https://dotnet.github.io/docfx/
MIT License

[Feature Request] Generated HTML should be valid XML (XHTML) #10191

Closed: icnocop closed this issue 2 months ago

icnocop commented 2 months ago

When generating HTML docs, I expected the output to be valid XML (XHTML) so that I could, for example, select elements with XPath queries for further processing. However, even when the templates contain valid XHTML, the generated HTML is not valid XHTML.

For example, see line 10 in .\templates\default\toc.html.primary.tmpl: <input type="text" id="toc_filter_input" placeholder="{{__global.tocFilter}}" onkeypress="if(event.keyCode==13) {return false;}">

In all of the generated HTML files, <input> elements are not closed. The same is true of <br> elements. I expected these elements to be self-closing or explicitly closed, i.e. <input /> or <input></input>.
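
To illustrate why an unclosed <input> matters for XPath tooling, here is a minimal sketch using only the .NET base class library. The class and method names (`XhtmlCheck`, `IsWellFormedXml`, `CountMatches`) are hypothetical and are not part of docfx. It shows that an XML parser rejects the unclosed form and accepts the self-closing one, which can then be queried with XPath.

```csharp
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

// Hypothetical helper for illustration only; not part of docfx.
public static class XhtmlCheck
{
    // Returns true when the markup parses as well-formed XML.
    public static bool IsWellFormedXml(string markup)
    {
        try
        {
            XDocument.Parse(markup);
            return true;
        }
        catch (XmlException)
        {
            // Raised, for example, when </div> is reached while <input> is still open.
            return false;
        }
    }

    // Counts the elements selected by an XPath expression.
    // Throws XmlException if the markup is not well-formed.
    public static int CountMatches(string markup, string xpath)
    {
        return XDocument.Parse(markup).XPathSelectElements(xpath).Count();
    }
}
```

With this helper, `XhtmlCheck.IsWellFormedXml("<div><input type=\"text\"></div>")` is false, while the same markup with `<input type=\"text\" />` parses and can be passed to `CountMatches` with an expression such as `//input[@id='toc_filter_input']`.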

This could be an issue in HtmlAgilityPack. See https://github.com/zzzprojects/html-agility-pack/issues/330

Even after updating line 10 in .\templates\default\toc.html.primary.tmpl to

    <input type="text" id="toc_filter_input" placeholder="{{__global.tocFilter}}" onkeypress="if(event.keyCode==13) {return false;}" />

or

    <input type="text" id="toc_filter_input" placeholder="{{__global.tocFilter}}" onkeypress="if(event.keyCode==13) {return false;}" ></input>

the input tag in the generated toc.html is still not closed.

Maybe these issues are related: https://github.com/dotnet/docfx/issues/263 https://github.com/dotnet/docfx/issues/7672

Thank you.

icnocop commented 2 months ago

Thanks to @filzrev I was able to add a post processor to do this.

docfx.json:

        "postProcessors": [ "XHtmlPostProcessor" ],

XHtmlPostProcessor.cs:

    using System;
    using System.Collections.Immutable;
    using System.Composition;
    using System.Linq;
    using System.Text;
    using Docfx.Common;
    using Docfx.Plugins;
    using HtmlAgilityPack;

    [Export(nameof(XHtmlPostProcessor), typeof(IPostProcessor))]
    public class XHtmlPostProcessor : IPostProcessor
    {
        /// <inheritdoc/>
        public ImmutableDictionary<string, object> PrepareMetadata(ImmutableDictionary<string, object> metadata)
        {
            return metadata;
        }

        /// <inheritdoc/>
        public Manifest Process(Manifest manifest, string outputFolder)
        {
            ArgumentNullException.ThrowIfNull(manifest);
            ArgumentNullException.ThrowIfNull(outputFolder);

            // Visit every ".html" output listed in the manifest.
            foreach (var tuple in from item in manifest.Files ?? Enumerable.Empty<ManifestItem>()
                                  from output in item.Output
                                  where output.Key.Equals(".html", StringComparison.OrdinalIgnoreCase)
                                  select new
                                  {
                                      OutputFile = output.Value.RelativePath,
                                  })
            {
                if (!EnvironmentContext.FileAbstractLayer.Exists(tuple.OutputFile))
                {
                    continue;
                }

                var document = new HtmlDocument
                {
                    // Write empty elements such as <input> and <br> in self-closing form (<input />).
                    OptionWriteEmptyNodes = true,
                };
                try
                {
                    using var stream = EnvironmentContext.FileAbstractLayer.OpenRead(tuple.OutputFile);
                    document.Load(stream, Encoding.UTF8);
                }
                catch (Exception ex)
                {
                    Logger.LogWarning($"Can't load content from {tuple.OutputFile}: {ex.Message}");
                    continue;
                }

                using (var stream = EnvironmentContext.FileAbstractLayer.Create(tuple.OutputFile))
                {
                    document.Save(stream, Encoding.UTF8);
                }
            }

            return manifest;
        }
    }
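
The effect of `OptionWriteEmptyNodes` can be tried in isolation on a small fragment. The following is a sketch that assumes the HtmlAgilityPack package is referenced. The class name `EmptyNodeDemo` is hypothetical. It only shows how empty elements are serialized, and does not guarantee that a whole page becomes well-formed XML (for instance, named entities or valueless attributes may still need attention).

```csharp
using HtmlAgilityPack;

// Hypothetical helper for illustration only; not part of docfx.
public static class EmptyNodeDemo
{
    // Parses an HTML fragment and serializes it again with
    // OptionWriteEmptyNodes enabled, so that empty elements such as
    // <br> and <input> are written in self-closing form.
    public static string Rewrite(string html)
    {
        var document = new HtmlDocument
        {
            OptionWriteEmptyNodes = true,
        };
        document.LoadHtml(html);
        return document.DocumentNode.OuterHtml;
    }
}
```

Calling `EmptyNodeDemo.Rewrite("<div>a<br>b<input type=\"text\"></div>")` should return the fragment with `<br>` and `<input>` closed, which makes it a quick way to check the behavior before running the post processor over a whole site.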