ogix closed this issue 1 month ago
As a workaround, download the stream into a MemoryStream and use that with PdfReader.
Adapting PDFsharp to RetriableStream would probably require several changes, so if this is addressed in PDFsharp at all, PDFsharp would probably resolve the issue by copying the stream to a MemoryStream internally anyway.
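A minimal sketch of that workaround, using only the .NET base class library. `SeekableStreams` and `EnsureSeekableAsync` are hypothetical names chosen for illustration; they are not part of PDFsharp or the Azure SDK.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class SeekableStreams
{
    // Hypothetical helper: returns the input unchanged if it is already
    // seekable; otherwise copies it into a MemoryStream rewound to the start.
    // The caller still owns (and should dispose) the original stream.
    public static async Task<Stream> EnsureSeekableAsync(Stream source)
    {
        if (source.CanSeek)
            return source;

        var buffer = new MemoryStream();
        await source.CopyToAsync(buffer); // reads the source to its end, chunk by chunk
        buffer.Position = 0;              // rewind so the reader starts at the first byte
        return buffer;
    }
}
```

The returned stream supports Length and Seek, so it should be usable with PdfReader.Open(stream, PdfDocumentOpenMode.Import). With the Azure SDK, the stream to pass in would be the Content of the value returned by BlobClient.DownloadStreamingAsync.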
Ok, thanks. Thought that maybe it can take advantage of real Streaming.
What is "real Streaming" and what could the advantages be?
I mean avoiding loading the whole document(s) into memory. In my case I have multiple PDF documents that I merge into one, and this operation is common in my web app. So the only option now is to load all documents into memory, which leads to high memory usage.
The Azure Storage SDK's BlobClient.DownloadStreamingAsync returns a Stream that downloads the document in chunks rather than the whole thing at once.
So I am wondering whether it's possible to read such a stream in PDFsharp.
PDFsharp reads the complete PDF into memory, that's how it works. Reading data from BlobClient into a MemoryStream first increases the memory usage, but that should not be an issue with PDF files downloaded from Azure. There is no need to have more than one source PDF open at any time.
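A rough sketch of merging with only one source PDF buffered at a time, assuming the Azure.Storage.Blobs and PDFsharp packages are referenced. `PdfMerger`, `MergeAsync`, and their parameters are hypothetical names for illustration, not an API of either library.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

public static class PdfMerger
{
    // Hypothetical helper: merges the named blobs into one PDF written to `output`.
    public static async Task MergeAsync(
        BlobContainerClient container, IEnumerable<string> blobNames, Stream output)
    {
        var merged = new PdfDocument();

        foreach (string name in blobNames)
        {
            // Buffer a single source; this MemoryStream is disposed at the end
            // of the iteration, before the next blob is downloaded.
            using var buffer = new MemoryStream();
            await container.GetBlobClient(name).DownloadToAsync(buffer);
            buffer.Position = 0;

            // Import mode is required for documents whose pages are copied elsewhere.
            PdfDocument source = PdfReader.Open(buffer, PdfDocumentOpenMode.Import);
            for (int i = 0; i < source.PageCount; i++)
                merged.AddPage(source.Pages[i]);
        }

        // Leave `output` open so the caller decides when to close it.
        merged.Save(output, false);
    }
}
```

The merged document itself still grows in memory as pages are added, so this only bounds the number of source buffers held at once; it does not make the merge streaming.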
Thanks for explaining. Just wanted to know if it's possible. Closing.
Or at least maybe we should keep it open to add support for non-seekable Streams.
I am trying to pass the Stream that I get from BlobClient.DownloadStreamingAsync into the PdfReader.Open method, and it throws when trying to get the Length property. It looks like it is using RetriableStream under the hood, which is not seekable.
https://github.com/empira/PDFsharp/blob/5fbf6ed14740bc4e16786816882d32e43af3ff5d/src/foundation/src/PDFsharp/src/PdfSharp/Pdf.IO/PdfReader.cs#L285