ststeiger / PdfSharpCore

Port of the PdfSharp library to .NET Core - largely removed GDI+ (only missing GetFontData - which can be replaced with freetype2)

[Question] How to import data? #302

Open levicki opened 2 years ago

levicki commented 2 years ago

MigraDoc documentation states that you can:

Import data from various sources via XML files or direct interfaces (any data source that can be used with .NET)

But there is no documentation (that I could find) on how to accomplish this at all.

I am trying to write a Markdown-to-PDF converter using Markdig (which gives me HTML output). Other than loading the HTML with, say, HtmlAgilityPack, walking the DOM tree recursively, and adding nodes manually to a MigraDoc Document(), I see no other way.

Please advise.

packdat commented 2 years ago

I think the sentence from the documentation could be written as: "You can use any data source, but it's up to you to utilize MigraDoc and/or PdfSharp to generate a PDF from it." The steps you described are basically the way to go, IMHO.
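
The recursive walk described in the question could look roughly like the following. This is only a sketch under stated assumptions: it assumes the HTML is well-formed XHTML so that System.Xml.Linq can parse it (real-world HTML would need a tolerant parser such as HtmlAgilityPack), and `Block` and `HtmlWalker` are hypothetical names standing in for whatever MigraDoc calls you would make at each node.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical stand-in for a MigraDoc paragraph: a style name plus its text.
public sealed record Block(string Style, string Text);

public static class HtmlWalker
{
    // Walks the element tree and returns one Block per heading,
    // paragraph, or list item, in document order.
    public static List<Block> Walk(XElement root)
    {
        var blocks = new List<Block>();
        Visit(root, blocks);
        return blocks;
    }

    private static void Visit(XElement element, List<Block> blocks)
    {
        string name = element.Name.LocalName.ToLowerInvariant();
        switch (name)
        {
            case "h1":
            case "h2":
            case "h3":
                // "h2" becomes "Heading2", and so on.
                blocks.Add(new Block("Heading" + name[1], element.Value.Trim()));
                return;
            case "p":
                blocks.Add(new Block("Normal", element.Value.Trim()));
                return;
            case "li":
                blocks.Add(new Block("ListItem", element.Value.Trim()));
                return;
        }

        // Container elements (body, div, ul, ...) are only traversed.
        foreach (XElement child in element.Elements())
            Visit(child, blocks);
    }
}
```

Each resulting block would then be turned into a MigraDoc paragraph with a matching style. Inline formatting (bold, links) is flattened to plain text here to keep the sketch short.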

Years ago, working with the original version of PdfSharp, I had some success generating PDF from HTML with this library (using its PdfSharp backend). I am not aware of any port of this library that uses PdfSharpCore, but you could create one...

levicki commented 2 years ago

@packdat

I already made something using wkhtmltopdf instead of PdfSharp and MigraDoc (I found an abandoned C# wrapper for it and used that because I needed a quick solution).

The output is passable, but I will look into using Chrome's PDF renderer when I have some free time. Thanks for the suggestion anyway.

As for the documentation change proposal, I disagree -- the word "data-source" in C# context to me implies something compatible with DataSet or perhaps with some other sort of interface capable of handling highly structured data.

It certainly doesn't imply a literal firehose of raw data, which should be manually processed recursively, hence my request for clarification.

EDIT: I guess what I am trying to say is that the wording "data-source" (or even "data source") is too strongly associated with ADO.NET and databases, so it doesn't really make much sense to me in this context.

ststeiger commented 1 year ago

@levicki: Sadly, wkhtmltopdf uses QtWebKit, an ancient WebKit-based engine that lags far behind current Chrome/Chromium. You're better off using the Chrome DevTools Protocol, with headless Chrome as the server.

There are several implementations for C#, see NuGet. The oldest is MasterDevs/ChromeDevTools.

Some links:
https://github.com/MasterDevs/ChromeDevTools
https://github.com/quamotion/ChromeDevTools
https://github.com/ToCSharp/AsyncChromeDriver
https://github.com/BaristaLabs/chrome-dev-tools-generator
https://github.com/BaristaLabs/chrome-dev-tools-runtime
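
For orientation, all of these libraries ultimately send JSON messages with an id, a method, and params over a WebSocket to the browser. A minimal sketch of what a Page.printToPDF request might look like is shown below; `CdpMessages` and `PrintToPdf` are hypothetical names, not part of any of the libraries above, and only a few of the protocol's parameters are included.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public static class CdpMessages
{
    // Builds the JSON text of a Page.printToPDF request.
    // The protocol expects paper sizes in inches.
    public static string PrintToPdf(int id, double paperWidthInches, double paperHeightInches)
    {
        var message = new Dictionary<string, object>
        {
            ["id"] = id,
            ["method"] = "Page.printToPDF",
            ["params"] = new Dictionary<string, object>
            {
                ["paperWidth"] = paperWidthInches,
                ["paperHeight"] = paperHeightInches,
                ["printBackground"] = true,
                ["landscape"] = false
            }
        };
        return JsonSerializer.Serialize(message);
    }
}
```

The browser answers with a message carrying the same id and the PDF as a Base64 string in its result, which is why the code further down calls Convert.FromBase64String on the response data.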

Something along the lines of:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

using Portal_Convert.CdpConverter;

using MasterDevs.ChromeDevTools;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Browser;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Page;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Target;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Runtime;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Network;

namespace GoogleRemoteControl {

public class ChromiumBasedConverter
{

    public static void KillHeadlessChromes(System.IO.TextWriter writer)
    {
        System.Diagnostics.Process[] allProcesses = System.Diagnostics.Process.GetProcesses();

        string exeName = @"\chrome.exe";

        if (System.Environment.OSVersion.Platform == System.PlatformID.Unix)
        {
            exeName = "/chrome";
        } // End if (System.Environment.OSVersion.Platform == System.PlatformID.Unix)

        for (int i = 0; i < allProcesses.Length; ++i)
        {
            System.Diagnostics.Process proc = allProcesses[i];
            string commandLine = ProcessUtils.GetCommandLine(proc); // GetCommandLineOfProcess(proc);

            if (string.IsNullOrEmpty(commandLine))
                continue;

            commandLine = commandLine.ToLowerInvariant();

            if (commandLine.IndexOf(exeName, System.StringComparison.InvariantCultureIgnoreCase) == -1)
                continue;

            if (commandLine.IndexOf(@"--headless", System.StringComparison.InvariantCultureIgnoreCase) != -1)
            {
                writer.WriteLine($"Killing process {proc.Id} with command line \"{commandLine}\"");
                ProcessUtils.KillProcessAndChildren(proc.Id);
            } // End if (commandLine.IndexOf(@"--headless") != -1)

        } // Next i 

        writer.WriteLine($"Finished killing headless chromes");
    } // End Sub KillHeadless 

    public static void KillHeadlessChromes()
    {
        KillHeadlessChromes(System.Console.Out);
    }

    public static System.Collections.Generic.List<string> KillHeadlessChromesWeb()
    {
        System.Collections.Generic.List<string> ls = new System.Collections.Generic.List<string>();
        System.Text.StringBuilder sb = new System.Text.StringBuilder();

        using (System.IO.StringWriter sw = new System.IO.StringWriter(sb))
        {
            KillHeadlessChromes(sw);
        } // End Using sw 

        // "abc".Replace("\r\n", "\n").Replace("\r", "\n");
        // "abc".Replace("" & vbCrLf, "" & vbLf).Replace("" & vbCr, "" & vbLf)
        // "abc".Split(vbLf);
        using (System.IO.TextReader tr = new System.IO.StringReader(sb.ToString()))
        {
            string thisLine = null;
            while ((thisLine = tr.ReadLine()) != null)
            {
                ls.Add(thisLine);
            } // Whend 
        } // End Using tr 

        sb.Length = 0;
        sb = null;

        return ls;
    } // End Function KillHeadlessChromesWeb 

    private static async System.Threading.Tasks.Task InternalConnect(ConnectionInfo ci, string remoteDebuggingUri)
    {
        ci.ChromeProcess = new RemoteChromeProcess(remoteDebuggingUri);
        ci.SessionInfo = await ci.ChromeProcess.StartNewSession();
    } // End Function InternalConnect 

    private static async System.Threading.Tasks.Task<ConnectionInfo> ConnectToChrome(string chromePath, string remoteDebuggingUri)
    {
        ConnectionInfo ci = new ConnectionInfo();

        try
        {
            await InternalConnect(ci, remoteDebuggingUri);
        }
        catch (System.Exception ex)
        {
            if (ex.InnerException != null && object.ReferenceEquals(ex.InnerException.GetType(), typeof(System.Net.WebException)))
            {

                if (((System.Net.WebException)ex.InnerException).Status == System.Net.WebExceptionStatus.ConnectFailure)
                {
                    MasterDevs.ChromeDevTools.IChromeProcessFactory chromeProcessFactory =
                            new MasterDevs.ChromeDevTools.ChromeProcessFactory(new FastStubbornDirectoryCleaner(), chromePath);

                    // Create a durable process
                    MasterDevs.ChromeDevTools.IChromeProcess persistentChromeProcess = chromeProcessFactory.Create(9222, false);
                    // MasterDevs.ChromeDevTools.IChromeProcess persistentChromeProcess = chromeProcessFactory.Create(9222, true);

                    await InternalConnect(ci, remoteDebuggingUri);
                    return ci;
                } // End if (((System.Net.WebException)ex.InnerException).Status == System.Net.WebExceptionStatus.ConnectFailure)

            } // End if (ex.InnerException != null && object.ReferenceEquals(ex.InnerException.GetType(), typeof(System.Net.WebException)))

            System.Console.WriteLine(chromePath);
            System.Console.WriteLine(ex.Message);
            System.Console.WriteLine(ex.StackTrace);

            if (ex.InnerException != null)
            {
                System.Console.WriteLine(ex.InnerException.Message);
                System.Console.WriteLine(ex.InnerException.StackTrace);
            } // End if (ex.InnerException != null)

            System.Console.WriteLine(ex.GetType().FullName);

            throw;
        } // End Catch 

        return ci;
    } // End Function ConnectToChrome 

    public static async System.Threading.Tasks.Task ConvertDataAsync(ConversionData conversionData)
    {
        // AnySqlWebAdmin.SqlFactory SQL = new AnySqlWebAdmin.SqlFactory();

        MasterDevs.ChromeDevTools.IChromeSessionFactory chromeSessionFactory = new MasterDevs.ChromeDevTools.ChromeSessionFactory();
        // System.Random r = new System.Random();

        using (ConnectionInfo connectionInfo = await ConnectToChrome(conversionData.ChromePath, conversionData.RemoteDebuggingUri))
        {
            MasterDevs.ChromeDevTools.IChromeSession chromeSession = chromeSessionFactory.Create(connectionInfo.SessionInfo.WebSocketDebuggerUrl);

            // STEP 3 - Send a command
            //
            // Here we are sending a commands to tell chrome to set the viewport size 
            // and navigate to the specified URL
            await chromeSession.SendAsync(new SetDeviceMetricsOverrideCommand
            {
                Width = conversionData.ViewPortWidth,
                Height = conversionData.ViewPortHeight,
                Scale = 1
            });

            System.Threading.ManualResetEventSlim waitForBuildingsEncoded = new System.Threading.ManualResetEventSlim();

            // avoids the possible race condition between the page load initiated by startingUrl and Page.loadEventFired.
            // https://github.com/cyrus-and/chrome-remote-interface/issues/176
            // need to add runtime and dom, too, not just page  
            ICommandResponse pageEnableResult = await chromeSession.SendAsync<MasterDevs.ChromeDevTools.Protocol.Chrome.Page.EnableCommand>();
            System.Console.WriteLine("PageEnable: " + pageEnableResult.Id);

            ICommandResponse runtimeEnableResult = await chromeSession.SendAsync<MasterDevs.ChromeDevTools.Protocol.Chrome.Runtime.EnableCommand>();
            System.Console.WriteLine("RuntimeEnable: " + runtimeEnableResult.Id);

            ICommandResponse domEnableResult = await chromeSession.SendAsync<MasterDevs.ChromeDevTools.Protocol.Chrome.DOM.EnableCommand>();
            System.Console.WriteLine("DomEnable: " + domEnableResult.Id);

            MasterDevs.ChromeDevTools.CommandResponse<NavigateCommandResponse> navigateResponse =
                await chromeSession.SendAsync(new NavigateCommand
                {
                    // Url = "http://www.google.com"
                    Url = conversionData.Url
                });
            System.Console.WriteLine("NavigateResponse: " + navigateResponse.Id);

            chromeSession.Subscribe<LoadEventFiredEvent>(loadEventFired =>
            {
                System.Threading.Tasks.Task.Run(async () =>
                {
                    System.Console.WriteLine("Loaded");

                    // int waitInterval = r.Next(14000, 20000); //for ints
                    // System.Threading.Thread.Sleep(waitInterval);
                    // await System.Threading.Tasks.Task.Delay(4000);

                    await GetJson(conversionData, chromeSession);

                    System.Console.WriteLine("Task done");
                    waitForBuildingsEncoded.Set();
                });

            });

            chromeSession.Subscribe<LoadingFailedEvent>(loadEventFired =>
            {
                System.Console.WriteLine("failed");
            });

            waitForBuildingsEncoded.Wait(5000);
        } // End Using connectionInfo 

    } // End Sub ConvertDataAsync  

    public static async System.Threading.Tasks.Task GetJson(ConversionData conversionData, MasterDevs.ChromeDevTools.IChromeSession chromeSession)
    {
        // D:\username\Documents\Visual Studio 2017\Projects\TestPWA\TestPWA\wwwroot\Checklist2\ts\require\xpath.js
        string javaScriptToExecute = @"

""use strict"";

function nsResolver(prefix) {
    var ns = {
        ""xhtml"": ""http://www.w3.org/1999/xhtml"",
        ""mathml"": ""http://www.w3.org/1998/Math/MathML"",
        ""svg"": ""http://www.w3.org/2000/svg""
    };
    return ns[prefix] || null;
}

function xpathSelector(path) {
    return document.evaluate(path, document, nsResolver, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
}

function xpathSelectorAll(xpathToExecute) {
    var result = [];
    var nodesSnapshot = document.evaluate(xpathToExecute, document, nsResolver, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
    for (var i = 0; i < nodesSnapshot.snapshotLength; i++) {
        result.push(nodesSnapshot.snapshotItem(i));
    }
    return result;
}

function getZone(zn_uid) {
    return document.querySelector('path[data-uid=""' + zn_uid.toLowerCase() + '""]');
}

function getCommunication(km_uid) {
    return document.querySelector('g[data-uid=""' + km_uid.toLowerCase() + '""]');
}

function checkKeypair(km_uid, zn_uid) {
    var phone = getCommunication(km_uid);
    var zone = getZone(zn_uid);
    console.log(""pz"", phone, zone);
}

function getArray() {
    // Remove every layer except KM and ZN.
    var allLayersExceptKMandZN = xpathSelectorAll(""/svg:svg/svg:g[not(@id='KM' or @id='ZN')]"");
    for (var i = 0; i < allLayersExceptKMandZN.length; ++i) {
        allLayersExceptKMandZN[i].parentNode.removeChild(allLayersExceptKMandZN[i]);
    }

    // Make the KM objects transparent to hit-testing so that
    // elementFromPoint returns the zone underneath them.
    var kmObjects = Array.prototype.slice.call(document.querySelectorAll(""g#KM > g""));
    for (var i = 0; i < kmObjects.length; ++i) {
        var style = kmObjects[i].getAttribute(""style"");
        if (style == null) style = """";
        style += ""; pointer-events: none;"";
        kmObjects[i].setAttribute(""style"", style);
    }

    var objs = [];
    for (var i = 0; i < kmObjects.length; ++i) {
        var km_uid = kmObjects[i].getAttribute(""data-uid"");
        var rect = kmObjects[i].getBoundingClientRect();
        // Center of the bounding box: x uses the width, y uses the height.
        var pointToCheck = { x: rect.x + rect.width / 2, y: rect.y + rect.height / 2 };
        var zone = document.elementFromPoint(pointToCheck.x, pointToCheck.y);
        // elementFromPoint can return null, so check zone before dereferencing it.
        var layer = (zone != null && zone.parentElement) ? zone.parentElement.id : null;
        if (layer == ""ZN"") {
            var zn_uid = zone.getAttribute(""data-uid"");
            objs.push({ ""km_uid"": km_uid, ""zn_uid"": zn_uid });
        } else console.log(""error"", i, kmObjects[i]);
    }
    return objs;
}

// console.log(JSON.stringify(getArray(), null, "" ""));

function foo() { return JSON.stringify(getArray(), null, "" ""); }

foo();

";

        CommandResponse<EvaluateCommandResponse> evr = await chromeSession.SendAsync(new EvaluateCommand
        {
            Expression = javaScriptToExecute,
            AwaitPromise = true // No effect 
            // ContextId = 123
        });

        if (evr.Result.ExceptionDetails != null)
            System.Console.WriteLine(evr.Result.ExceptionDetails);
        else
        {
            System.Console.WriteLine(evr.Result.Result.Value);
            System.Console.WriteLine("Success !");
            string fn = System.IO.Path.GetFileNameWithoutExtension(conversionData.Url);
            fn = System.Uri.UnescapeDataString(fn);

            string value = System.Convert.ToString(evr.Result.Result.Value, System.Globalization.CultureInfo.InvariantCulture);
            string bp = @"D:\SonovaZeich";
            fn = System.IO.Path.Combine(bp, fn + ".json");
            System.IO.File.WriteAllText(fn, value, System.Text.Encoding.UTF8);

        }

        // document.getElementById("resultat").innerHTML

        System.Console.WriteLine(evr.Result.Result.Value);

    }

    private static async System.Threading.Tasks.Task ClosePage(MasterDevs.ChromeDevTools.IChromeSession chromeSession, string frameId, bool headLess)
    {
        System.Threading.Tasks.Task<MasterDevs.ChromeDevTools.CommandResponse<CloseTargetCommandResponse>> closeTargetTask = chromeSession.SendAsync(
            new CloseTargetCommand()
            {
                TargetId = frameId
            }
        );

        // await will block forever if headless    
        if (!headLess)
        {
            MasterDevs.ChromeDevTools.CommandResponse<CloseTargetCommandResponse> closeTargetResponse = await closeTargetTask;
            System.Console.WriteLine(closeTargetResponse);
        }
        else
        {
            System.Console.WriteLine(closeTargetTask);
        }

    } // End Task ClosePage 

    public static void ConvertData(ConversionData conversionData)
    {
        ConvertDataAsync(conversionData).Wait();
    } // End Sub ConvertData 

} // End Class ChromiumBasedConverter 

}
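
One design note on the code above: ConvertDataAsync blocks a thread in waitForBuildingsEncoded.Wait(5000) even though it is an async method. A possible alternative, sketched here under the assumption that the event handler can complete a TaskCompletionSource instead of setting a ManualResetEventSlim, is to await the signal with a timeout. `AsyncWait` and `WaitAsync` are hypothetical helper names.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncWait
{
    // Returns true if 'signal' completes before 'timeout' elapses,
    // false if the timeout wins. The calling thread is never blocked.
    public static async Task<bool> WaitAsync(Task signal, TimeSpan timeout)
    {
        Task finished = await Task.WhenAny(signal, Task.Delay(timeout));
        return ReferenceEquals(finished, signal);
    }
}
```

In the handler you would call TrySetResult(true) on a TaskCompletionSource<bool> where the original calls waitForBuildingsEncoded.Set(), and then await AsyncWait.WaitAsync(source.Task, TimeSpan.FromSeconds(5)) in place of the blocking Wait(5000).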

ststeiger commented 1 year ago

For printing to PDF (VB.NET):


Imports MasterDevs.ChromeDevTools
Imports MasterDevs.ChromeDevTools.Protocol.Chrome.Browser
Imports MasterDevs.ChromeDevTools.Protocol.Chrome.Page
Imports MasterDevs.ChromeDevTools.Protocol.Chrome.Target

Namespace Portal_Convert.CdpConverter

    Public Class ChromiumBasedConverter

        Private Delegate Function UnitConversion_t(ByVal value As Double) As Double

        Public Shared Sub KillHeadlessChromes(ByVal writer As System.IO.TextWriter)
            Dim allProcesses As System.Diagnostics.Process() = System.Diagnostics.Process.GetProcesses()
            Dim exeName As String = "\chrome.exe"

            If System.Environment.OSVersion.Platform = System.PlatformID.Unix Then
                exeName = "/chrome"
            End If

            For i As Integer = 0 To allProcesses.Length - 1
                Dim proc As System.Diagnostics.Process = allProcesses(i)
                Dim commandLine As String = ProcessUtils.GetCommandLine(proc)
                If String.IsNullOrEmpty(commandLine) Then Continue For
                commandLine = commandLine.ToLowerInvariant()
                If commandLine.IndexOf(exeName, System.StringComparison.InvariantCultureIgnoreCase) = -1 Then Continue For

                If commandLine.IndexOf("--headless", System.StringComparison.InvariantCultureIgnoreCase) <> -1 Then
                    writer.WriteLine($"Killing process {proc.Id} with command line ""{commandLine}""")
                    ProcessUtils.KillProcessAndChildren(proc.Id)
                End If
            Next

            writer.WriteLine($"Finished killing headless chromes")
        End Sub

        Public Shared Sub KillHeadlessChromes()
            KillHeadlessChromes(System.Console.Out)
        End Sub

        Private Shared Function __Assign(Of T)(ByRef target As T, value As T) As T
            target = value
            Return value
        End Function

        Public Shared Function KillHeadlessChromesWeb() As System.Collections.Generic.List(Of String)
            Dim ls As System.Collections.Generic.List(Of String) = New System.Collections.Generic.List(Of String)()
            Dim sb As System.Text.StringBuilder = New System.Text.StringBuilder()

            Using sw As System.IO.StringWriter = New System.IO.StringWriter(sb)
                KillHeadlessChromes(sw)
            End Using

            Using tr As System.IO.TextReader = New System.IO.StringReader(sb.ToString())
                Dim thisLine As String = Nothing

                While (__Assign(thisLine, tr.ReadLine())) IsNot Nothing
                    ls.Add(thisLine)
                End While
            End Using

            sb.Length = 0
            sb = Nothing
            Return ls
        End Function

        Private Shared Async Function InternalConnect(ByVal ci As ConnectionInfo, ByVal remoteDebuggingUri As String) As System.Threading.Tasks.Task
            ci.ChromeProcess = New RemoteChromeProcess(remoteDebuggingUri)
            ci.SessionInfo = Await ci.ChromeProcess.StartNewSession()
        End Function

        Private Shared Async Function ConnectToChrome(ByVal chromePath As String, ByVal remoteDebuggingUri As String) As System.Threading.Tasks.Task(Of ConnectionInfo)
            Dim ci As ConnectionInfo = New ConnectionInfo()

            Try
                Await InternalConnect(ci, remoteDebuggingUri)
            Catch ex As System.Exception

                If ex.InnerException IsNot Nothing AndAlso Object.ReferenceEquals(ex.InnerException.[GetType](), GetType(System.Net.WebException)) Then

                    If (CType(ex.InnerException, System.Net.WebException)).Status = System.Net.WebExceptionStatus.ConnectFailure Then
                        Dim chromeProcessFactory As MasterDevs.ChromeDevTools.IChromeProcessFactory = New MasterDevs.ChromeDevTools.ChromeProcessFactory(New FastStubbornDirectoryCleaner(), chromePath)
                        Dim persistentChromeProcess As MasterDevs.ChromeDevTools.IChromeProcess = chromeProcessFactory.Create(9222, True)

                        ' await cannot be used inside catch ...
                        ' Await InternalConnect(ci, remoteDebuggingUri)
                        InternalConnect(ci, remoteDebuggingUri).Wait()
                        Return ci
                    End If
                End If

                System.Console.WriteLine(chromePath)
                System.Console.WriteLine(ex.Message)
                System.Console.WriteLine(ex.StackTrace)

                If ex.InnerException IsNot Nothing Then
                    System.Console.WriteLine(ex.InnerException.Message)
                    System.Console.WriteLine(ex.InnerException.StackTrace)
                End If

                System.Console.WriteLine(ex.[GetType]().FullName)
                Throw
            End Try

            Return ci
        End Function

        Private Shared Async Function ClosePage(ByVal chromeSession As MasterDevs.ChromeDevTools.IChromeSession, ByVal frameId As String, ByVal headLess As Boolean) As System.Threading.Tasks.Task
            Dim closeTargetTask As System.Threading.Tasks.Task(Of MasterDevs.ChromeDevTools.CommandResponse(Of CloseTargetCommandResponse)) = chromeSession.SendAsync(New CloseTargetCommand() With {
                .TargetId = frameId
            })

            ' await will block forever if headless    
            If Not headLess Then
                Dim closeTargetResponse As MasterDevs.ChromeDevTools.CommandResponse(Of CloseTargetCommandResponse) = Await closeTargetTask
                System.Console.WriteLine(closeTargetResponse)
            Else
                System.Console.WriteLine(closeTargetTask)
            End If
        End Function

        Public Shared Async Function ConvertDataAsync(ByVal conversionData As ConversionData) As System.Threading.Tasks.Task
            Dim chromeSessionFactory As MasterDevs.ChromeDevTools.IChromeSessionFactory = New MasterDevs.ChromeDevTools.ChromeSessionFactory()

            Using connectionInfo As ConnectionInfo = Await ConnectToChrome(conversionData.ChromePath, conversionData.RemoteDebuggingUri)
                Dim chromeSession As MasterDevs.ChromeDevTools.IChromeSession = chromeSessionFactory.Create(connectionInfo.SessionInfo.WebSocketDebuggerUrl)

                Await chromeSession.SendAsync(New SetDeviceMetricsOverrideCommand With {
                    .Width = conversionData.ViewPortWidth,
                    .Height = conversionData.ViewPortHeight,
                    .Scale = 1
                })

                Dim navigateResponse As MasterDevs.ChromeDevTools.CommandResponse(Of NavigateCommandResponse) = Await chromeSession.SendAsync(New NavigateCommand With {
                    .Url = "about:blank"
                })

                System.Console.WriteLine("NavigateResponse: " & navigateResponse.Id)
                Dim setContentResponse As MasterDevs.ChromeDevTools.CommandResponse(Of SetDocumentContentCommandResponse) = Await chromeSession.SendAsync(New SetDocumentContentCommand() With {
                    .FrameId = navigateResponse.Result.FrameId,
                    .Html = conversionData.Html
                })

                Dim cm2inch As UnitConversion_t = Function(ByVal centimeters As Double) centimeters * 0.393701
                Dim mm2inch As UnitConversion_t = Function(ByVal millimeters As Double) millimeters * 0.0393701

                Dim printCommand2 As PrintToPDFCommand = New PrintToPDFCommand() With {
                    .Scale = 1,
                    .MarginTop = 0,
                    .MarginLeft = 0,
                    .MarginRight = 0,
                    .MarginBottom = 0,
                    .PrintBackground = True,
                    .Landscape = False,
                    .PaperWidth = mm2inch(conversionData.PageWidth),
                    .PaperHeight = mm2inch(conversionData.PageHeight) ' 
                }

                '.PaperWidth = cm2inch(conversionData.PageWidth),
                '.PaperHeight = cm2inch(conversionData.PageHeight)

                If conversionData.ChromiumActions.HasFlag(ChromiumActions_t.GetVersion) Then

                    Try
                        System.Diagnostics.Debug.WriteLine("Getting browser-version")
                        Dim version As MasterDevs.ChromeDevTools.CommandResponse(Of GetVersionCommandResponse) = Await chromeSession.SendAsync(New GetVersionCommand())
                        System.Diagnostics.Debug.WriteLine("Got browser-version")
                        conversionData.Version = version.Result
                    Catch ex As System.Exception
                        conversionData.Exception = ex
                        System.Diagnostics.Debug.WriteLine(ex.Message)
                    End Try
                End If

                If conversionData.ChromiumActions.HasFlag(ChromiumActions_t.ConvertToImage) Then

                    Try
                        System.Diagnostics.Debug.WriteLine("Taking screenshot")
                        Dim screenshot As MasterDevs.ChromeDevTools.CommandResponse(Of CaptureScreenshotCommandResponse) = Await chromeSession.SendAsync(New CaptureScreenshotCommand With {
                            .Format = "png"
                        })
                        System.Diagnostics.Debug.WriteLine("Screenshot taken.")
                        conversionData.PngData = System.Convert.FromBase64String(screenshot.Result.Data)
                    Catch ex As System.Exception
                        conversionData.Exception = ex
                        System.Diagnostics.Debug.WriteLine(ex.Message)
                    End Try
                End If

                If conversionData.ChromiumActions.HasFlag(ChromiumActions_t.ConvertToPdf) Then

                    Try
                        System.Diagnostics.Debug.WriteLine("Printing PDF")
                        Dim pdf As MasterDevs.ChromeDevTools.CommandResponse(Of PrintToPDFCommandResponse) = Await chromeSession.SendAsync(printCommand2)
                        System.Diagnostics.Debug.WriteLine("PDF printed.")
                        conversionData.PdfData = System.Convert.FromBase64String(pdf.Result.Data)
                    Catch ex As System.Exception
                        conversionData.Exception = ex
                        System.Diagnostics.Debug.WriteLine(ex.Message)
                    End Try
                End If

                System.Console.WriteLine("Closing page")
                Await ClosePage(chromeSession, navigateResponse.Result.FrameId, True)
                System.Console.WriteLine("Page closed")

            End Using ' connectionInfo

        End Function ' ConvertDataAsync

        Public Shared Sub ConvertData(ByVal conversionData As ConversionData)
            ConvertDataAsync(conversionData).Wait()
        End Sub

    End Class

End Namespace
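
A note on the mm2inch conversion used for PaperWidth and PaperHeight: Page.printToPDF takes paper sizes in inches, so a page size given in millimeters has to be divided by 25.4 (the factor 0.0393701 used above is the rounded reciprocal). A small worked sketch, with `PaperSize` as a hypothetical helper name:

```csharp
using System;

public static class PaperSize
{
    // One inch is defined as exactly 25.4 millimeters.
    private const double MillimetersPerInch = 25.4;

    public static double MillimetersToInches(double millimeters)
    {
        return millimeters / MillimetersPerInch;
    }
}
```

For an A4 page (210 mm x 297 mm) this gives roughly 8.27 x 11.69 inches, which are the values PaperWidth and PaperHeight would receive.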

ststeiger commented 1 year ago

Equivalent C# code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Security;
using System.Text;
using System.Threading.Tasks;
using Microsoft.VisualBasic;
using MasterDevs.ChromeDevTools;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Browser;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Page;
using MasterDevs.ChromeDevTools.Protocol.Chrome.Target;

namespace Portal_Convert.CdpConverter
{
    public class ChromiumBasedConverter
    {
        private delegate double UnitConversion_t(double value);

        public static void KillHeadlessChromes(System.IO.TextWriter writer)
        {
            System.Diagnostics.Process[] allProcesses = System.Diagnostics.Process.GetProcesses();
            string exeName = @"\chrome.exe";

            if (System.Environment.OSVersion.Platform == System.PlatformID.Unix)
                exeName = "/chrome";

            for (int i = 0; i <= allProcesses.Length - 1; i++)
            {
                System.Diagnostics.Process proc = allProcesses[i];
                string commandLine = ProcessUtils.GetCommandLine(proc);
                if (string.IsNullOrEmpty(commandLine))
                    continue;
                commandLine = commandLine.ToLowerInvariant();
                if (commandLine.IndexOf(exeName, System.StringComparison.InvariantCultureIgnoreCase) == -1)
                    continue;

                if (commandLine.IndexOf("--headless", System.StringComparison.InvariantCultureIgnoreCase) != -1)
                {
                    writer.WriteLine($"Killing process {proc.Id} with command line \"{commandLine}\"");
                    ProcessUtils.KillProcessAndChildren(proc.Id);
                }
            }

            writer.WriteLine($"Finished killing headless chromes");
        }

        public static void KillHeadlessChromes()
        {
            KillHeadlessChromes(System.Console.Out);
        }

        private static T __Assign<T>(ref T target, T value)
        {
            target = value;
            return value;
        }

        public static System.Collections.Generic.List<string> KillHeadlessChromesWeb()
        {
            System.Collections.Generic.List<string> ls = new System.Collections.Generic.List<string>();
            System.Text.StringBuilder sb = new System.Text.StringBuilder();

            using (System.IO.StringWriter sw = new System.IO.StringWriter(sb))
            {
                KillHeadlessChromes(sw);
            }

            using (System.IO.TextReader tr = new System.IO.StringReader(sb.ToString()))
            {
                string thisLine = null;

                while ((thisLine = tr.ReadLine()) != null)
                    ls.Add(thisLine);
            }

            sb.Length = 0;
            sb = null;
            return ls;
        }

        private static async System.Threading.Tasks.Task InternalConnect(ConnectionInfo ci, string remoteDebuggingUri)
        {
            ci.ChromeProcess = new RemoteChromeProcess(remoteDebuggingUri);
            ci.SessionInfo = await ci.ChromeProcess.StartNewSession();
        }

        private static async System.Threading.Tasks.Task<ConnectionInfo> ConnectToChrome(string chromePath, string remoteDebuggingUri)
        {
            ConnectionInfo ci = new ConnectionInfo();

            try
            {
                await InternalConnect(ci, remoteDebuggingUri);
            }
            catch (Exception ex)
            {
                if (ex.InnerException != null && object.ReferenceEquals(ex.InnerException.GetType(), typeof(System.Net.WebException)))
                {
                    if (((System.Net.WebException)ex.InnerException).Status == System.Net.WebExceptionStatus.ConnectFailure)
                    {
                        MasterDevs.ChromeDevTools.IChromeProcessFactory chromeProcessFactory = new MasterDevs.ChromeDevTools.ChromeProcessFactory(new FastStubbornDirectoryCleaner(), chromePath);
                        MasterDevs.ChromeDevTools.IChromeProcess persistentChromeProcess = chromeProcessFactory.Create(9222, true);

                        // C# 6 and later allow await inside a catch block (VB.NET does not),
                        // so there is no need to block the thread with .Wait().
                        await InternalConnect(ci, remoteDebuggingUri);
                        return ci;
                    }
                }

                System.Console.WriteLine(chromePath);
                System.Console.WriteLine(ex.Message);
                System.Console.WriteLine(ex.StackTrace);

                if (ex.InnerException != null)
                {
                    System.Console.WriteLine(ex.InnerException.Message);
                    System.Console.WriteLine(ex.InnerException.StackTrace);
                }

                System.Console.WriteLine(ex.GetType().FullName);
                throw;
            }

            return ci;
        }

        private static async System.Threading.Tasks.Task ClosePage(MasterDevs.ChromeDevTools.IChromeSession chromeSession, string frameId, bool headLess)
        {
            System.Threading.Tasks.Task<MasterDevs.ChromeDevTools.CommandResponse<CloseTargetCommandResponse>> closeTargetTask = chromeSession.SendAsync(new CloseTargetCommand()
            {
                TargetId = frameId
            });

            // Awaiting the close response blocks forever when Chrome runs headless,
            // so only await it in the non-headless case.
            if (!headLess)
            {
                MasterDevs.ChromeDevTools.CommandResponse<CloseTargetCommandResponse> closeTargetResponse = await closeTargetTask;
                System.Console.WriteLine(closeTargetResponse);
            }
            else
                System.Console.WriteLine(closeTargetTask);
        }

        public static async System.Threading.Tasks.Task ConvertDataAsync(ConversionData conversionData)
        {
            MasterDevs.ChromeDevTools.IChromeSessionFactory chromeSessionFactory = new MasterDevs.ChromeDevTools.ChromeSessionFactory();

            using (ConnectionInfo connectionInfo = await ConnectToChrome(conversionData.ChromePath, conversionData.RemoteDebuggingUri))
            {
                MasterDevs.ChromeDevTools.IChromeSession chromeSession = chromeSessionFactory.Create(connectionInfo.SessionInfo.WebSocketDebuggerUrl);

                await chromeSession.SendAsync(new SetDeviceMetricsOverrideCommand()
                {
                    Width = conversionData.ViewPortWidth,
                    Height = conversionData.ViewPortHeight,
                    Scale = 1
                });

                MasterDevs.ChromeDevTools.CommandResponse<NavigateCommandResponse> navigateResponse = await chromeSession.SendAsync(new NavigateCommand()
                {
                    Url = "about:blank"
                });

                System.Console.WriteLine("NavigateResponse: " + navigateResponse.Id);
                MasterDevs.ChromeDevTools.CommandResponse<SetDocumentContentCommandResponse> setContentResponse = await chromeSession.SendAsync(new SetDocumentContentCommand()
                {
                    FrameId = navigateResponse.Result.FrameId,
                    Html = conversionData.Html
                });

                // Page.printToPDF expects paper dimensions in inches
                UnitConversion_t cm2inch = (double centimeters) => centimeters * 0.393701;
                UnitConversion_t mm2inch = (double millimeters) => millimeters * 0.0393701;

                PrintToPDFCommand printCommand2 = new PrintToPDFCommand()
                {
                    Scale = 1,
                    MarginTop = 0,
                    MarginLeft = 0,
                    MarginRight = 0,
                    MarginBottom = 0,
                    PrintBackground = true,
                    Landscape = false,
                    PaperWidth = mm2inch(conversionData.PageWidth),
                    PaperHeight = mm2inch(conversionData.PageHeight)
                };

                // Alternative if the page size is given in centimeters:
                // PaperWidth = cm2inch(conversionData.PageWidth),
                // PaperHeight = cm2inch(conversionData.PageHeight)

                if (conversionData.ChromiumActions.HasFlag(ChromiumActions_t.GetVersion))
                {
                    try
                    {
                        System.Diagnostics.Debug.WriteLine("Getting browser-version");
                        MasterDevs.ChromeDevTools.CommandResponse<GetVersionCommandResponse> version = await chromeSession.SendAsync(new GetVersionCommand());
                        System.Diagnostics.Debug.WriteLine("Got browser-version");
                        conversionData.Version = version.Result;
                    }
                    catch (Exception ex)
                    {
                        conversionData.Exception = ex;
                        System.Diagnostics.Debug.WriteLine(ex.Message);
                    }
                }

                if (conversionData.ChromiumActions.HasFlag(ChromiumActions_t.ConvertToImage))
                {
                    try
                    {
                        System.Diagnostics.Debug.WriteLine("Taking screenshot");
                        MasterDevs.ChromeDevTools.CommandResponse<CaptureScreenshotCommandResponse> screenshot = await chromeSession.SendAsync(new CaptureScreenshotCommand()
                        {
                            Format = "png"
                        });
                        System.Diagnostics.Debug.WriteLine("Screenshot taken.");
                        conversionData.PngData = System.Convert.FromBase64String(screenshot.Result.Data);
                    }
                    catch (Exception ex)
                    {
                        conversionData.Exception = ex;
                        System.Diagnostics.Debug.WriteLine(ex.Message);
                    }
                }

                if (conversionData.ChromiumActions.HasFlag(ChromiumActions_t.ConvertToPdf))
                {
                    try
                    {
                        System.Diagnostics.Debug.WriteLine("Printing PDF");
                        MasterDevs.ChromeDevTools.CommandResponse<PrintToPDFCommandResponse> pdf = await chromeSession.SendAsync(printCommand2);
                        System.Diagnostics.Debug.WriteLine("PDF printed.");
                        conversionData.PdfData = System.Convert.FromBase64String(pdf.Result.Data);
                    }
                    catch (Exception ex)
                    {
                        conversionData.Exception = ex;
                        System.Diagnostics.Debug.WriteLine(ex.Message);
                    }
                }

                System.Console.WriteLine("Closing page");
                await ClosePage(chromeSession, navigateResponse.Result.FrameId, true);
                System.Console.WriteLine("Page closed");
            } // connectionInfo
        } // ConvertDataAsync

        public static void ConvertData(ConversionData conversionData)
        {
            ConvertDataAsync(conversionData).Wait();
        }
    }
}
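The listing above converts the page size from millimeters to inches before handing it to `PrintToPDFCommand`, because the DevTools protocol takes `PaperWidth` and `PaperHeight` in inches. A minimal standalone sketch of that conversion follows; the class and method names (`PaperSizeSketch`, `MillimetersToInches`) are made up for illustration and are not part of the code above. It divides by 25.4 (exact by definition) rather than multiplying by the rounded constant 0.0393701.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the listing above.
public static class PaperSizeSketch
{
    // Exact by definition: 1 inch = 25.4 mm.
    private const double MillimetersPerInch = 25.4;

    public static double MillimetersToInches(double millimeters)
    {
        return millimeters / MillimetersPerInch;
    }

    public static void Main()
    {
        // A4 paper is 210 mm x 297 mm.
        double widthInches = MillimetersToInches(210.0);
        double heightInches = MillimetersToInches(297.0);

        // Invariant culture keeps the decimal separator a dot on every machine.
        // Prints "8.27 x 11.69 in"
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "{0:F2} x {1:F2} in", widthInches, heightInches));
    }
}
```

For typical page sizes the difference from the rounded constant is far below anything visible in print, so this is a matter of tidiness rather than a fix.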

levicki commented 1 year ago

@ststeiger Thank you very much for the examples. I am not really a big fan of using Chrome in that manner, but I will look into it because, as you said, Qt WebKit is ancient and does not render everything perfectly. Maybe I should also investigate the Microsoft Edge WebView2 control and see whether something similar is possible with it.

levicki commented 1 year ago

@ststeiger I also implemented a version with WebView2. However, that version produces the PDF by printing and, unlike WebKit, does not preserve document bookmarks.