Closed groogiam closed 1 year ago
Setup api docs metadata as defined in...
You are referencing Walkthrough Part II: Adding API Documentation to the Website; the PDF tutorial is in Walkthrough Part III: Generate PDF Documentation.
If you still have any problem, post your docfx.json file.
My docfx.json file is below.
This looks like an environmental issue where the metadata target cannot read the project file on my local machine. I get a bunch of warnings just like this.
with message: Method not found: 'System.ReadOnlySpan`1<Char> Microsoft.IO.Path.GetFileName(System.ReadOnlySpan`1<Char>)'.
Is there a dependency I am missing on my local machine for extracting metadata? The machine has the .NET 6 SDK and Visual Studio 2022.
On my build server the metadata is generated but the pdf generation fails with
[22-08-27 08:35:07.095]Error:[PdfCommand.PDF]Error happen when converting pdf/toc.json to Pdf. Details: System.AggregateException: One or more errors occurred. ---> iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEachWorker[TSource,TLocal](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEach[TSource](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.GetPartialPdfModels(IList`1 htmlFilePaths)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.ConvertOutlines()
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.GetOutlines()
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.SaveCore(Stream stream)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Save(String outputFileName)
at Microsoft.DocAsCode.HtmlToPdf.ConvertWrapper.<>c__DisplayClass7_0.<ConvertCore>b__1(ManifestItem tocFile)
---> (Inner Exception #0) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
{
  "metadata": [
    {
      "src": [
        {
          "src": "../src",
          "files": [
            "MyProject.Api/**.csproj",
            "MyProject.Web.Ui/**.csproj",
            "MyProject.Server/**.csproj"
          ]
        }
      ],
      "dest": "api",
      "disableGitFeatures": false,
      "filter": "apiFilterConfig.yml"
    }
  ],
  "build": {
    "content": [
      {
        "files": [
          "api/**.yml",
          "api/index.md"
        ]
      },
      {
        "files": [
          "user_guide/**",
          "user_guide/installation/**",
          "toc.yml",
          "*.md",
          "rest_api/**"
        ]
      }
    ],
    "resource": [
      {
        "files": [
          "images/**",
          "**/media/**"
        ]
      }
    ],
    "overwrite": [
      {
        "files": [
          "apidoc/**.md"
        ],
        "exclude": [
          "obj/**",
          "_site/**"
        ]
      }
    ],
    "globalMetadata": {
      "_appLogoPath": "images/logo.png",
      "_appFaviconPath": "images/favicon.ico",
      "_enableSearch": true,
      "_enableNewTab": true,
      "_disableContribution": true
    },
    "dest": "_site",
    "globalMetadataFiles": [],
    "fileMetadataFiles": [],
    "template": [
      "default",
      "template"
    ],
    "postProcessors": [],
    "markdownEngineName": "markdig",
    "noLangKeyword": false,
    "keepFileLink": false,
    "cleanupCacheHistory": false,
    "disableGitFeatures": false
  },
  "pdf": {
    "content": [
      {
        "files": [
          "api/**.yml",
          "api/index.md"
        ],
        "exclude": [
          "**/toc.yml",
          "**/toc.md"
        ]
      },
      {
        "files": [
          "user_guide/**",
          "user_guide/installation/**",
          "toc.yml",
          "*.md",
          "rest_api/**",
          "pdf/*"
        ],
        "exclude": [
          "**/bin/**",
          "**/obj/**",
          "_site_pdf/**",
          "**/toc.yml",
          "**/toc.md"
        ]
      },
      {
        "files": "pdf/toc.yml"
      }
    ],
    "resource": [
      {
        "files": [
          "images/**",
          "**/media/**"
        ],
        "exclude": [
          "**/bin/**",
          "**/obj/**",
          "_site_pdf/**"
        ]
      }
    ],
    "overwrite": [
      {
        "files": [
          "apidoc/**.md"
        ],
        "exclude": [
          "**/bin/**",
          "**/obj/**",
          "_site_pdf/**"
        ]
      }
    ],
    "wkhtmltopdf": {
      "additionalArguments": "--enable-local-file-access"
    },
    "dest": "_site_pdf",
    "template": [
      "pdf.default",
      "template"
    ]
  }
}
with message: Method not found: 'System.ReadOnlySpan`1<Char> Microsoft.IO.Path.GetFileName(System.ReadOnlySpan`1<Char>)'.
This is a reported issue and not yet resolved. There are, however, some workaround tips; see if any of these help with your setup.
@paulushub Thanks. I got some time to do some more research and it seems like these are the relevant issues.
Error on build server:
PDF header signature not found (Error happen when conversion toc.json to Pdf) · Issue #4999 · dotnet/docfx · GitHub
PDF Build fails in Azure DevOps · Issue #4488 · dotnet/docfx · GitHub
Error locally:
https://github.com/dotnet/docfx/issues/8143
https://github.com/dotnet/docfx/issues/8102
Workaround locally:
https://github.com/dotnet/docfx/issues/8136#issuecomment-1219512721
@groogiam With the issue fixed, how about the PDF output?
@paulushub I can generate the PDF output from the command line on my local machine, but it still seems to fail on an Azure DevOps agent, even with "noStdin": true.
The metadata generation looks like it is working again though and generating pdf output.
but it still seems to fail on an azure devops agent. Even with "noStdin": true
Any error messages?
@paulushub
This error happens both locally and in devops when running with "noStdin": true
workflow.html" - has exception, the details: The filename or extension is too long
[22-09-15 09:40:22.575]Error:[PDF]Error happen when converting pdf/toc.json to Pdf. Details: iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.SaveCore(Stream stream)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Save(String outputFileName)
at Microsoft.DocAsCode.HtmlToPdf.ConvertWrapper.<>c__DisplayClass7_0.<ConvertCore>b__1(ManifestItem tocFile)
If I run without "noStdin": true, then it works locally, but I get the following error in my DevOps pipeline.
[22-09-15 09:52:38.607]Error:[PDF]Error happen when converting pdf/toc.json to Pdf. Details: System.AggregateException: One or more errors occurred. ---> iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEachWorker[TSource,TLocal](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEach[TSource](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.GetPartialPdfModels(IList`1 htmlFilePaths)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.ConvertOutlines()
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.GetOutlines()
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.SaveCore(Stream stream)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Save(String outputFileName)
at Microsoft.DocAsCode.HtmlToPdf.ConvertWrapper.<>c__DisplayClass7_0.<ConvertCore>b__1(ManifestItem tocFile)
---> (Inner Exception #0) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)<---
---> (Inner Exception #1) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)<---
---> (Inner Exception #2) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
---> (Inner Exception #3) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)<---
---> (Inner Exception #4) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)<---
---> (Inner Exception #5) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)<---
---> (Inner Exception #6) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>)<---
---> (Inner Exception #7) iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Convert[T](String arguments, Func`2 readerFunc)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.<>c__DisplayClass7_0.<GetPartialPdfModels>b__1(String htmlFilePath)
at System.Threading.Tasks.Parallel.<>c__DisplayClass17_0`1.<ForWorker>b__1()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass176_0.<ExecuteSelfReplicating>b__0(Object <p0>
This error happens pretty early in the processing, whereas the error with noStdin happens very late.
Thanks for your help.
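For anyone trying to narrow this down: the exception text says the bytes handed to the PDF reader did not contain a PDF header, so one thing worth checking is whether any PDF files left on disk are actually PDFs. Below is a minimal Python sketch for that check. It is not part of docfx; the function names are hypothetical, and the 1024-byte search window is my own assumption about how lenient a reader might be, not something taken from the iTextSharp source.

```python
import sys
from pathlib import Path

# Every PDF file is expected to carry this marker at (or very near) its start.
PDF_MAGIC = b"%PDF-"


def has_pdf_header(data: bytes, window: int = 1024) -> bool:
    """Return True if the PDF marker appears within the first `window` bytes."""
    return PDF_MAGIC in data[:window]


def describe(path: Path) -> str:
    """Read the start of a file and report whether it looks like a PDF."""
    with path.open("rb") as handle:
        head = handle.read(1024)
    if has_pdf_header(head):
        return f"{path}: PDF header found"
    # Showing the first bytes often reveals an empty file or an error message.
    return f"{path}: no PDF header, first bytes {head[:40]!r}"


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_pdf.py file1.pdf file2.pdf
    for arg in sys.argv[1:]:
        print(describe(Path(arg)))
```

If a file turns out to be empty or to start with plain text, that would suggest the converter step produced no valid output for that page, which narrows the search to the wkhtmltopdf invocation rather than the merge step.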
@groogiam Thanks for the updates.
iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
A search indicates two sources of the error:
HtmlToPdfConverter.cs
as pdfStream.Position = 0;
I had a similar issue with docfx 2.59.4.0 when running docfx from GitHub Actions.
I've added the following to docfx.json and the problem is gone:
the -q flag in the wkhtmltopdf arguments:
"wkhtmltopdf": {
  "additionalArguments": "-q --enable-local-file-access"
},
and the "noStdin": true option.
@nprorekhin Thanks for the additional information. I'm in the process of testing on Azure DevOps, but this configuration does not work when building on my local machine. It results in the "The filename or extension is too long" error noted previously.
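Since the two settings quoted above behave differently on different machines in this thread, it may be convenient to apply them only where needed (for example in a CI step) instead of committing them to docfx.json. Here is a minimal Python sketch of that idea. The helper name apply_pdf_workaround is hypothetical, and placing "noStdin" directly under "pdf", next to "wkhtmltopdf", is an assumption based on how the snippet above is written.

```python
import copy
import json


def apply_pdf_workaround(config: dict) -> dict:
    """Return a copy of a docfx.json dict with the -q flag and noStdin set."""
    patched = copy.deepcopy(config)  # leave the caller's dict untouched
    pdf = patched.setdefault("pdf", {})
    wkhtmltopdf = pdf.setdefault("wkhtmltopdf", {})

    # Keep any arguments already configured and add -q only once.
    args = wkhtmltopdf.get("additionalArguments", "").split()
    if "-q" not in args:
        args.insert(0, "-q")
    wkhtmltopdf["additionalArguments"] = " ".join(args)

    # Assumption: noStdin is a sibling of "wkhtmltopdf" inside "pdf".
    pdf["noStdin"] = True
    return patched


if __name__ == "__main__":
    sample = {
        "pdf": {
            "wkhtmltopdf": {"additionalArguments": "--enable-local-file-access"},
            "dest": "_site_pdf",
        }
    }
    print(json.dumps(apply_pdf_workaround(sample), indent=2))
```

The function is idempotent, so running it twice over the same config does not duplicate the -q flag. Whether the resulting settings help is environment-dependent, as the rest of this thread shows.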
I just finished testing on my Azure DevOps Windows agent, and adding the configuration
"wkhtmltopdf": { "additionalArguments": "-q --enable-local-file-access" }, "noStdin": true
results in this error
[22-12-16 01:12:36.781]Error:[PDF]Error happen when converting pdf/toc.json to Pdf. Details: iTextSharp.text.exceptions.InvalidPdfException: PDF header signature not found.
at iTextSharp.text.pdf.PdfReader..ctor(ReaderProperties properties, IRandomAccessSource byteSource)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.SaveCore(Stream stream)
at Microsoft.DocAsCode.HtmlToPdf.HtmlToPdfConverter.Save(String outputFileName)
at Microsoft.DocAsCode.HtmlToPdf.ConvertWrapper.<>c__DisplayClass7_0.<ConvertCore>b__1(ManifestItem tocFile)
Addressed in v2.73.0 with a new PDF engine.
@yufeih There still seems to be an issue with the new engine when running on CI. See below. Thanks.
api\toc.pdf: 98%
TimeoutException: Timeout 30000ms exceeded.
=========================== logs ===========================
navigating to
"http://127.0.0.1:55059/api/IconNames.html", waiting
until "domcontentloaded"
============================================================
at async Task<T> InnerSendMessageToServerAsync<T>(string guid, string method,
Dictionary<string, object> dictionary, bool keepNulls) in Connection.cs:214
at async Task<T> WrapApiCallAsync<T>(Func<Task<T>> action, bool isInternal) in
Connection.cs:521
at async Task<IResponse> GotoAsync(string url, FrameGotoOptions options) in
Frame.cs:617
at void MoveNext() in PdfBuilder.cs:150
at void MoveNext() in PdfBuilder.cs:178
at void MoveNext()
at async Task CreatePdf(Func<Uri, Task<byte[]>> printPdf, ProgressTask task,
Uri outlineUrl, Outline outline, string outputPath,
Action<Dictionary<Outline, int>> updatePageNumbers) in PdfBuilder.cs:169
at void MoveNext() in PdfBuilder.cs:89
at void MoveNext()
at void MoveNext() in PdfBuilder.cs:82
at void MoveNext() in Progress.cs:98
at void MoveNext() in Progress.cs:133
at async Task<T> RunAsync<T>(Func<Task<T>> func) in DefaultExclusivityMode.cs:
40
at async Task<T> StartAsync<T>(Func<ProgressContext, Task<T>> action) in
Progress.cs:116
at async Task StartAsync(Func<ProgressContext, Task> action) in Progress.cs:96
at async Task CreatePdf(string outputFolder) in PdfBuilder.cs:80
at async Task CreatePdf(string outputFolder) in PdfBuilder.cs:80
at async Task CreatePdf(string outputFolder) in PdfBuilder.cs:80
at void <Execute>b__0() in PdfCommand.cs:19
at int Run(LogOptions options, Action run) in CommandHelper.cs:43
at int Execute(CommandContext context, PdfCommandOptions options) in
PdfCommand.cs:14
at Task<int> Execute(CommandContext context, CommandSettings settings) in
CommandOfT.cs:40
at Task<int> Execute(CommandTree leaf, CommandTree tree, CommandContext
context, ITypeResolver resolver, IConfiguration configuration) in
CommandExecutor.cs:144
at async Task<int> Execute(IConfiguration configuration, IEnumerable<string>
args) in CommandExecutor.cs:85
at async Task<int> RunAsync(IEnumerable<string> args) in CommandApp.cs:84
@yufeih
The file that is failing is a generated API file with a very large number of members. The generated HTML for the file is 32k lines. I can reproduce the failure both in the Azure pipeline and by running manually on my CI server. It appears that the default timeout is not long enough to handle large files on older hardware. Is there a way to change this timeout? If not, there should be, or at least the default timeout should be increased to support older hardware.
A timeout sounds reasonable.
@yufeih Thanks for the quick turnaround. Any idea when the .NET tool with this change will be released? Thanks.
Operating System: (Windows or Linux or MacOS) Windows 10
DocFX Version Used: 2.59.3.0
Template used: (default or statictoc or contain custom template) Default
Steps to Reproduce:
docfx pdf docfx.json
Expected Behavior:
Pdf should generate without error.
Actual Behavior:
I get an error saying "api\toc.yml does not exist" and the PDF contains no API documentation. Running "docfx build docfx.json" generates the API docs.