HakanL / WkHtmlToPdf-DotNet

C# .NET Core wrapper for wkhtmltopdf library that uses Webkit engine to convert HTML pages to PDF.
GNU Lesser General Public License v3.0

Incorrect character encoding #65

Closed SerekWiejski closed 2 years ago

SerekWiejski commented 2 years ago

Hello. I am trying to convert HTML to PDF. The HTML has its charset set to iso-8859-2. The converted PDF should look like "Zażółć gęślą jaźń", but it looks like this instead: "Za���� g��lďż˝ ja��".

Steps to reproduce

Use the following code to generate the PDF:

private void Convert()
{
    // The HTML declares iso-8859-2 and contains Polish diacritics.
    var html = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html;charset=iso-8859-2\"></head><body>Zażółć gęślą jaźń</body></html>";
    using (var tools = new PdfTools())
    using (var converter = new SynchronizedConverter(tools))
    {
        var doc = new HtmlToPdfDocument()
        {
            GlobalSettings = {
                ColorMode = ColorMode.Color,
                Orientation = Orientation.Portrait,
                PaperSize = PaperKind.A4,
                DPI = 300
            },
            Objects = {
                new ObjectSettings() {
                    // Inline HTML passed as a string rather than loaded from a page.
                    HtmlContent = html,
                    Encoding = Encoding.GetEncoding(28592),  // iso-8859-2
                    WebSettings = { DefaultEncoding = Encoding.GetEncoding(28592).WebName }
                }
            }
        };

        // Render the document and write the resulting PDF to disk.
        var bytes = converter.Convert(doc);
        File.WriteAllBytes("document.pdf", bytes);
    }
}

HakanL commented 2 years ago

I don't think this wrapper package does anything with the encoding. Can you test this with the native library and see if that works?

SerekWiejski commented 2 years ago

It works when I set the Page property in ObjectSettings instead of HtmlContent. The HTML file it points to has its encoding set to iso-8859-2.
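
For readers who want to try the workaround described above, here is a minimal sketch. It saves the HTML to a file encoded as iso-8859-2 and points Page at that file instead of using HtmlContent. Several things here are assumptions and not confirmed in this thread: the `WkHtmlToPdfDotNet` namespace, the temp file name `input-iso-8859-2.html`, the class name `Iso88592PdfExample`, and the `Encoding.RegisterProvider` call, which should only be needed on .NET Core / .NET 5+ where code page 28592 is not available by default.

```csharp
using System.IO;
using System.Text;
using WkHtmlToPdfDotNet; // assumed namespace of the wrapper package

public static class Iso88592PdfExample // hypothetical class name
{
    public static void Convert()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page 28592
        // is only available after registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        var latin2 = Encoding.GetEncoding(28592); // iso-8859-2

        var html = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html;charset=iso-8859-2\"></head><body>Zażółć gęślą jaźń</body></html>";

        // Hypothetical temp file; the bytes on disk are iso-8859-2,
        // matching the charset declared in the meta tag.
        var htmlPath = Path.Combine(Path.GetTempPath(), "input-iso-8859-2.html");
        File.WriteAllText(htmlPath, html, latin2);

        using (var tools = new PdfTools())
        using (var converter = new SynchronizedConverter(tools))
        {
            var doc = new HtmlToPdfDocument()
            {
                GlobalSettings = {
                    ColorMode = ColorMode.Color,
                    Orientation = Orientation.Portrait,
                    PaperSize = PaperKind.A4,
                    DPI = 300
                },
                Objects = {
                    new ObjectSettings() {
                        // Load the content from the file instead of HtmlContent.
                        Page = htmlPath,
                        WebSettings = { DefaultEncoding = latin2.WebName }
                    }
                }
            };

            // Render the document and write the resulting PDF to disk.
            var bytes = converter.Convert(doc);
            File.WriteAllBytes("document.pdf", bytes);
        }
    }
}
```

The only difference from the original reproduction is that the content is loaded through Page. As reported in this thread, that is the path on which the declared encoding takes effect.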

HakanL commented 2 years ago

Try it in the native library to see whether the issue is in the library itself or in the wrapper. If you find a fix, we're happy to accept a PR.

SerekWiejski commented 2 years ago

I tried it with the native library. It's not a bug. DefaultEncoding only applies to content loaded through the Page property, not to content passed in HtmlContent.