PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

[CTS 14] Handling SVG? #89

Closed PhilterPaper closed 1 month ago

PhilterPaper commented 6 years ago

PDF::Builder currently handles several image formats (GIF, JPEG, PNG, TIFF, etc.), any of which can be dynamically produced (on the fly) for a web page. Another increasingly popular format is SVG (Scalable Vector Graphics), which can be easily produced on the fly, and displayed by a browser. Would it be worthwhile to handle SVG graphics in the same manner as other graphics? The basic SVG commands are fairly simple to parse into PDF drawing commands, but SVG can also embed images within its code, adding a layer of complication. The alternative could be to use an external utility to convert SVG content into an already-supported image format, but this would require that the user install some external program(s). Another choice could be to add hooks to permit invoking a user-supplied command line utility to convert SVG (or an arbitrary image format) into a supported image format.

An existing file with SVG or image content (with the SVG possibly embedding yet another image file) could either be read in or processed externally, while in-line SVG command text might be parsed directly by image_svg(), which could also pull in image files. There are many combinations and possibilities for SVG support: direct support (conversion to PDF primitives) plus embedded images, or external conversion of the SVG to a supported image format. A program wishing to output PDF might also dynamically generate SVG for document graphics, in lieu of using PDF primitives directly, for example to generate content suitable for either browser (HTML) or PDF output.

PhilterPaper commented 5 years ago

Just a note, prompted by a wkHTMLtoPDF problem with SVG being rasterized: SVG input should preferably not be rasterized, but should be transformed into PDF graphics primitives. The idea is that such graphics should scale cleanly to all sizes.

I need to investigate whether PDF graphics are a subset or superset of SVG graphics, and whether there is anything in SVG that would be a problem to translate to PDF graphics.

PhilterPaper commented 5 years ago

I just spent a number of hours digging through SVG, and it's huge — almost as big as PDF itself! Unless I can find a prewritten Perl library to parse arbitrary SVG code, it may just be too large a task to try to support all of SVG. SVG is supposed to be formatted as XML, so an XML-processing library may at least be able to do the task of parsing (and syntax checking) the SVG, leaving me to traverse the DOM and implement (at least) selected parts of it (outputting PDF graphics, text, and image primitives). The worst case would be to hand-code a parser for a small subset of SVG, to take care of the most common things. The challenge here is to later extend it (without breaking something) when someone says they desperately need the transparent bleems function… could I please tack that on?

There's stuff that PDF may not implement (many of the filters), and all the interactive stuff (e.g., <a> links) and animation- and sound-related things can probably be safely omitted. I think it would be sufficient to end up with a static "image" just like GIF, JPEG, PNG, TIFF, etc. The W3Schools SVG tutorial covers enough of SVG that supporting that much could be enough to get a useful product. On the other hand, the official W3C definition is a staggering amount of work.

PhilterPaper commented 3 years ago

See CTS 10 (#99) for thoughts on using SVG instead of extending "regular" calls with relative coordinate versions.

PhilterPaper commented 3 years ago

Pages 510-511 of the PDF 1.7 spec show a SlideShow with an embedded SVG file. This seems to imply that a PDF Reader may be able to handle SVG natively. Be sure to look at that before putting any more effort into parsing SVG and reproducing it using PDF graphics and text primitives.

Add: possible false alarm. This PDF "slideshow" is the only place SVG is mentioned, and it is starting to sound like a "slideshow" is using external utilities to display various formats, including SVG (possibly using a browser, as JavaScript is mentioned). It looks like the SVG file is merely carried along (embedded) to hand over to the browser, but I will check to make sure.

PhilterPaper commented 3 years ago

@sciurius has requested SVG vector support on PDF::API2 (RT 134780). I'm not going to get SVG out in PDF::Builder 3.022 (out probably within a week or two), but since there's interest, maybe that will get me moving faster on SVG support (already partly implemented). I need to get a bit further along before I put it on GitHub, but if anyone has interest in joining in, I can get it out a bit earlier.

PhilterPaper commented 3 years ago

Depending on how big the SVG code ends up being, I'm considering packaging it as its own CPAN module. It would in turn call three other packages (overall XML parser SVG::Parser, "path" decode Image::SVG::Path, and "transform" decode Image::SVG::Transform) and return an array of standardized, low level graphics and text calls, including all the applicable attributes at each call. The PDF SVG support would then output these primitives as PDF graphics and text calls. The new package would not be exclusively for PDF, but could be used in a wide range of applications.

I'm still fairly early in the design and writing process, so this architecture could change considerably. I'd like to get feedback on what others think of the idea. I don't know if I can support 100% of the SVG spec, but I will try to get in all of the SVG into the returned array, and on the PDF side handle most of the non-interactive static visual content. PDF may have to output a single object rather than outputting to graphics and text objects, so that items are rendered in the desired order. The input to the SVG package would be either a file path or a string of SVG code (so you could create an image on the fly).
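To make the "array of standardized, low level graphics calls" idea concrete, here is a minimal sketch for one small piece of it, the path `d` attribute. The function name `path_to_primitives` and the record layout (`cmd`, `args`) are hypothetical and are not the API of SVG::Parser or Image::SVG::Path. The sketch also assumes the arc flags are separated from the numbers that follow them, which compact SVG does not guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper (not the API of SVG::Parser or Image::SVG::Path):
# split an SVG path "d" string into a flat list of primitive records.
sub path_to_primitives {
    my ($d) = @_;
    # Number of numeric arguments each path command consumes.
    my %argc = (M => 2, L => 2, H => 1, V => 1, C => 6, S => 4,
                Q => 4, T => 2, A => 7, Z => 0);
    my @prims;
    # Each match is one command letter followed by its raw argument text.
    while ($d =~ /([MLHVCSQTAZ])([^MLHVCSQTAZ]*)/gi) {
        my ($cmd, $rest) = ($1, $2);
        my @nums = $rest =~ /[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?/g;
        my $n = $argc{uc $cmd};
        if ($n == 0) {                       # closepath takes no arguments
            push @prims, { cmd => $cmd, args => [] };
            next;
        }
        die "bad argument count for '$cmd'\n" if !@nums || @nums % $n;
        # Extra argument groups are implicit repeats of the command;
        # after M/m the implicit repeats are treated as L/l.
        my $first = 1;
        while (my @group = splice(@nums, 0, $n)) {
            my $c = $cmd;
            $c = ($cmd eq 'M' ? 'L' : 'l') if !$first && uc($cmd) eq 'M';
            push @prims, { cmd => $c, args => [@group] };
            $first = 0;
        }
    }
    return \@prims;
}
```

For example, `path_to_primitives("M10 20 30 40 l-5,5z")` yields four records (`M`, an implicit `L`, `l`, `z`), which a PDF back end could then walk in order and emit as move, line, and close operations.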

PhilterPaper commented 3 years ago

My current prototype/experiment uses SVG::Parser, but that's not cast in stone. I've seen a number of SVG-parsing packages floating around that use all sorts of XML parsers, and I'm open at this point to changing over if you can give some good arguments to do so. I do want to avoid prereq'ing a huge number of other packages, as some of these modules do. I also want to have something very portable that doesn't require a (for example) Python or R installation, things which don't come with a Windows box. Keep it fairly lightweight.

Whether or not I offer a Perl-based SVG parser as a separate module (described previously), internally it will use that structure to break down the SVG into a list of primitives and associated data and settings/CSS. That will be fed to something that generates the matching PDF primitives for output. If anyone has seen Perl-based SVG parsers that largely or entirely handle this already, I'd appreciate hearing about them!

PhilterPaper commented 3 years ago

Mon May 17 16:38:22 2021 PMPERRY@cpan.org - Correspondence added (to RT 134780)

On Sun Mar 21 17:12:23 2021, JV wrote:

See PhilterPaper/Perl-PDF-Builder/issues/89 for my work on this subject.

Great to hear there's already work being done on this.

I expect to fairly soon (mid June?) be able to devote some serious time to this. Please read and start giving your thoughts on the GitHub issue (89) listed above. It would be nice to have input from people who have dived into SVG processing and are willing to share their experiences, while I'm still early in the architecting of this thing. Thanks!

sciurius commented 3 years ago

The closest I can come is an SVG renderer written in Python, which renders the SVG on the canvas. It was developed to render the output of abc2svg only and may (will) fail with arbitrary SVG. Since rendering abc2svg output is exactly what I want to achieve, I have been considering (and still am) transcribing this Python renderer into a Perl module. I'm not sure to what extent this will be helpful for rendering arbitrary SVG.

PhilterPaper commented 3 years ago

Well, if there are good algorithms for transcribing SVG into PDF primitives, I'd be happy to hear about them. The first stage is to parse an arbitrary SVG file into some level of primitives, and the second stage is to build PDF from that list. I'm currently using SVG::Parser, but someone (it might have been you) complained that it was a barely warmed-over XML Parser. Nevertheless, I'd like to avoid re-inventing the wheel, and use as much existing library code as possible to parse the SVG. If there's some library more up-to-date and better maintained, I would consider using it. I hope by mid to late June to have a first cut of the SVG parser on GitHub for you and others to play with. I'm going to wait until that's done to decide whether to make it a separate project and CPAN release.

PhilterPaper commented 3 years ago

I finally found my preliminary development work on SVG conversion (I thought I had lost it when a PDF::Builder install overwrote the files, but I did turn out to have it on backups!). Release 0.001 is on GitHub under PhilterPaper/Perl-SVG-Reader. There will not be a CPAN release until 1.000 is ready (along with PDF::Builder SVG image support, using SVG::Reader).

At this point, I am assuming that I will keep this code in that repository, and release a CPAN package at release 1.000. However, if it turns out that the code is lightweight and trivial enough, and no one is trying to use it for their own projects, I may choose to fold it into PDF::Builder. I don't know at this point -- I want to wait to find out how big a project this is.

This first commit is very basic, and only starts the process. I still have a lot more work to do on it, but want people to see its progress. If I can spare time from other projects, I will work on it at a fairly steady pace. I hope by the end of July (or mid August) to have it largely "there". At some point, I need to write the code in PDF::Builder that actually makes use of this SVG::Reader. This package will not be released until it appears to be usable for PDF::Builder's SVG image support.

Issues involving PDF::Builder's SVG support should be opened under this repository, but issues involving the new library should be under Perl-SVG-Reader. I look forward to comments and suggestions (and even code) from others!

sciurius commented 3 years ago

Good job!

PhilterPaper commented 2 years ago

I ran into a mess with SVG::Parser, in which it seems to randomly pick which library (Expat or SAX) it uses, based on exactly how it's invoked, resulting in somewhat differently structured hashes. I think I can work through finding which one is produced, and properly digesting its data, but it's still a nuisance. My query to SVG::Parser's ticket system (https://rt.cpan.org/Public/Bug/Display.html?id=138495) is still unanswered, so I'm becoming concerned that this product is unsupported. Does anyone have a suggestion on a better parser to use for SVG, possibly a more generic XML parser? It should produce something similar to SVG::Parser, and be supported!

PhilterPaper commented 1 year ago

Not SVG itself, but related: consider support for HP-GL pen plotter (vector) "image" input. Presumably pen plotters are still in use for large drawings, so some sort of viewport and/or scaling will likely be necessary. Rendering as vector graphics would permit unlimited zooming in to look at details. No one in their right mind is going to hand-write HP-GL diagrams, charts, etc., but there still may be plenty of programs that produce it as output, and it would be nice to be able to publish as PDF, even if you don't have a physical pen plotter.

sciurius commented 1 year ago

HP-GL is dead simple to implement, it is just a series of move and draw instructions. Whether it would be useful? I can't say...

PhilterPaper commented 1 year ago

Looking at my HP-GL/2 Reference Guide, I see that the plotter language is very complex, approaching that of SVG and PDF. There are lots of commands for text (with downloadable fonts) and absolute/relative coordinate versions of almost everything. I don't think a full implementation of HP-GL would be worth the effort, but a healthy subset might be useful for someone who has utilities or programs that create HP-GL output.

I'm thinking of using a common "generic vector graphics" routine to handle the output of an HP-GL reader, a barcode routine (see #48), and possibly even the SVG reader. It would then output PDF primitives. Something to think about, anyway. Initially, it would support library routines for barcodes and maybe a small subset of HP-GL, and would be expanded over time to handle the output of a reader for a reasonable subset of SVG. Something like that. If and when PDF::Builder users ask for more vector graphics capabilities, the appropriate reader(s) and the generic vector graphics routine/library could be enhanced.
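As a rough illustration of how small a useful HP-GL subset could be, the sketch below translates only PU (pen up), PD (pen down), and PA (plot absolute) into PDF content-stream path operators (`m`, `l`, `S`). The function name `hpgl_to_pdf_ops` is hypothetical and is not an existing PDF::Builder routine. IN and SP are silently skipped, anything else (including relative plotting with PR) is rejected, and plotter units are assumed to be the usual 1016 per inch.

```perl
use strict;
use warnings;

# Hypothetical sketch: translate a tiny HP-GL subset (PU, PD, PA) into
# PDF content-stream path operators. Not an existing PDF::Builder routine.
sub hpgl_to_pdf_ops {
    my ($hpgl) = @_;
    my $scale = 72 / 1016;          # plotter units (1016/inch) to points
    my ($cx, $cy) = (0, 0);         # current pen position, plotter units
    my ($pen_down, $have_point, $drew) = (0, 0, 0);
    my @ops;
    for my $instr (split /;/, $hpgl) {
        $instr =~ s/^\s+|\s+$//g;
        next if $instr eq '';
        my ($mn, $params) = $instr =~ /^([A-Za-z]{2})(.*)$/s
            or die "cannot parse '$instr'\n";
        $mn = uc $mn;
        next if $mn eq 'IN' || $mn eq 'SP';     # ignored in this sketch
        die "unsupported instruction '$mn'\n"
            unless $mn =~ /^(?:PU|PD|PA)$/;
        $pen_down = 0 if $mn eq 'PU';
        $pen_down = 1 if $mn eq 'PD';
        my @n = grep { length } split /[\s,]+/, $params;
        die "odd coordinate count in '$instr'\n" if @n % 2;
        while (my ($x, $y) = splice(@n, 0, 2)) {
            if ($pen_down) {
                # Start a subpath at the pen position if none is open.
                push @ops, sprintf('%.2f %.2f m', $cx * $scale, $cy * $scale)
                    unless $have_point;
                push @ops, sprintf('%.2f %.2f l', $x * $scale, $y * $scale);
                ($have_point, $drew) = (1, 1);
            }
            else {
                $have_point = 0;    # pen-up move: next draw needs a new "m"
            }
            ($cx, $cy) = ($x, $y);
        }
    }
    push @ops, 'S' if $drew;        # stroke everything that was drawn
    return join "\n", @ops;
}
```

Given `"IN;SP1;PU0,0;PD1016,0,1016,1016;PU;"`, this produces a move to the origin, two line segments of one inch each, and a stroke. Text labels (LB), fonts, and relative coordinates are exactly the parts that make a fuller implementation much larger.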

sciurius commented 1 year ago

Oh yes, it is a lot of commands, but they all seem rather straightforward. The complicating factor for SVG is its support for CSS3, which is hell to implement. (I tried.)

PhilterPaper commented 1 year ago

Let me think about making HP-GL/2 (or at least, a large subset) the "generic" vector plot language. It ought to do nicely for barcodes, but I'm not sure about SVG. And of course, it would be directly usable for anyone who wants to use HP-GL as some sort of vector plotting language. I don't see trying to support HP's PJL, PCL, or RTL, unless there's a huge demand for it.

I've got a ton of stuff on my plate right now, including SVG support for standalone graphics, MathJax equation support, and Gnuplot graphing support, in addition to extending column() HTML support. If it looks like an HP-GL/2-to-PDF graphics function (Perl image_hpgl()) would be a good base for a large subset of SVG, I would welcome code contributions from the community! Get fame and fortune in the OSS arena (well, local fame anyway) with a well-defined, limited scope project that needs doing.

sciurius commented 1 year ago

How about a subset of SVG that can 'drive' HPGL?

Last year I worked on an SVG module and got reasonably far. Unfortunately the people who write SVG-generating programs like to incorporate more and more CSS3 features, making a complete(r) implementation impossible. It would be like rewriting half of Firefox in Perl.

I'm quite busy now, but I may take up the SVG module later this year.

PhilterPaper commented 1 year ago

If you're asking about a translator from SVG to an intermediate form such as HPGL, and then directly "interpret" the HPGL (into PDF primitives), that's along the lines of what I was thinking of doing. It is probably not feasible to try to support the entire SVG definition, but I think it's possible to come up with a reasonable subset that would prove widely useful. If users ask for unimplemented features (such as embedded raster images or specific CSS), we can consider adding them piece-by-piece, so long as the original architecture was flexible enough to allow that. That's what I've tried to do with the column() HTML/CSS implementation.

The whole point is to have scalable vector graphics, rather than raster images, to embed into a PDF document. I'm not sure if the HPGL interpreter image_hpgl() should produce an object of some sort (like the other image_ routines, to be fed to the image() call) or just directly output to the graphics context object (in which case the name perhaps should not be image_).

A few years ago I started playing with a Perl implementation to translate SVG into some intermediate form, but didn't get far before being distracted by more pressing matters. I would be happy to let someone take the lead with resuming work on SVG::Reader (outputting HPGL code) and even working on an HPGL-to-PDF "interpreter" to complete the job. I can retain ownership of SVG::Reader, or hand it over to someone else who wants it. The HPGL-to-PDF interpreter would be contributed to PDF::Builder.

       SVG graphics file,
       MathJax output,
       Gnuplot output                   direct plotter-style graphing
       ==============                   =============================
 SVG input: string or file              HP-GL/2 input: string or file              bar code request: string
             |                                         |                                      |
             v                                         |                                      v
        SVG::Reader *                                  |                              Graphics::BarCode *
             |                                         |                                      |
             v                                         |                                      v
         HP-GL string                                  |                                HP-GL string
             |                                         |                                      |
             +-------------------------------------+   |   +----------------------------------+
                                                   |   |   |
                                                   v   v   v                          * new package
                                                 imageX_hpgl()
                                                       |
                                                       v
                           PDF vector (and text?) primitives into graphics context

The idea behind having a Graphics::BarCode package would be to output a wide range of bar codes in a neutral format, here eventually rendered as PDF, while other uses might be other graphics formats such as GIF, GD, etc. There are already many bar code packages, but each directly outputs in some specific format not necessarily usable by something such as PDF::Builder.

PhilterPaper commented 1 year ago

@sciurius, if you (or anyone else) want to pick up the SVG processing task and/or the HPGL processing task, please let me know (even if it's just wanting to do design work on it at this point). I want to avoid duplication of effort and one of us being disappointed by having their hard work discarded. Not that I'm near ready to do active work on SVG and HPGL (which would need to be coordinated so that the HPGL processor can handle everything SVG does), but some time this year I hope to get back to it.

By the way, regarding HPGL, there is a lot of stuff in there for handling text (fonts) for output by the LB label command. I may need to trim it down to a reasonable subset. Apparently, an HP plotter can handle two fonts at any given time -- a standard or primary font, and an alternate or secondary font. There are commands to switch between them, and you can use SO and SI to select fonts within a single LaBel string. There seems to be some limited multibyte capability, though I'm not sure if UTF-8 is supported (I think 8 and 16 bit characters are). Anyway, someone with a solid understanding of SVG will need to work with whoever is doing HPGL to coordinate things (and if necessary, do separate PDF output for SVG if HPGL is insufficient).

Needless to say, both SVG and HPGL support are likely to be only subsets of the full languages for quite some time. We should endeavor to make the architectures flexible enough to be able to add additional features in the future, getting closer to full support of the definition.

PhilterPaper commented 1 year ago

Three things:

  1. It would be good for both SVG and HPGL routines to return the dimensions of the produced image, before any ink is put down, so that placement on the page can be adjusted. This is especially important for inline SVG renderings, such as inline MathJax equations.
  2. It would be good to keep the SVG and/or HPGL PDF code as its own object, so that it doesn't bloat the size of the graphics context object, and may be handled as an independent object (see 1., without having to call the renderer twice, first to get the dimensions and second to put down ink). A raster image does this (GIF, JPEG, etc.), but I'm not sure if there's a way for a graphics stream (or an ET/graphics/BT dropout) to "call" another graphics stream object. Any ideas? I'll have to see if that's what raster graphics handling more or less already does (as an XObject).
  3. If it doesn't already support this in some way, SVG may have to be extended in some way (new tag(s)) to define good "break points" for a long equation to better fit on a line. This might be done manually when writing SVG or equation input to MathJax, or it might be something returned by the renderer, such that it says "I think that here are some good places to break up the line, and the length of each piece". That way, an equation might be better fit into a text line. See https://groups.google.com/g/mathjax-users/c/A00O2y4KgyQ/m/2nzj2mXmAAAJ for some thoughts on this.

sciurius commented 1 year ago

There are several tools that claim to turn HPGL into SVG. I don't know any of them, but if they do the job it is better for us to concentrate on SVG.

I will definitely pick up the SVG module I was working on last year. As I said earlier, the graphics handling (drawing instructions) is pretty much functional, but it is CSS3 that makes it hard.

My personal goal is to be able to deal with the output of some SVG generating tools that I need for one of my projects. It may be of (more or less limited) general use, but that will be a fortunate side effect.

Establishing the bounding box without painting is trivial if the SVG has an accurate drawing box. If not, my approach would be to draw the image in an XObject so you have the liberty to move/scale/etc the result depending on its dimensions.
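A minimal sketch of the "draw into an XObject, then move/scale it" idea at the content-stream level, assuming the form's content was drawn in SVG viewBox coordinates (y pointing down) and that the SVG carries an accurate `viewBox`. The helper name `place_svg_form` and the resource name passed to it are hypothetical. It ignores `preserveAspectRatio` and simply stretches the viewBox onto the target rectangle.

```perl
use strict;
use warnings;

# Hypothetical helper: build the PDF content-stream text that paints a
# Form XObject (drawn in SVG viewBox coordinates) into a target rectangle
# whose lower-left corner is ($x, $y) and whose size is $w by $h points.
sub place_svg_form {
    my ($name, $viewbox, $x, $y, $w, $h) = @_;
    $viewbox =~ s/^\s+//;
    my ($vx, $vy, $vw, $vh) = split /[\s,]+/, $viewbox;
    die "viewBox needs four numbers with positive width and height\n"
        unless defined $vh && $vw > 0 && $vh > 0;
    my ($sx, $sy) = ($w / $vw, $h / $vh);
    # SVG's y axis points down and PDF's points up, so y is negated and
    # the viewBox's top-left corner lands on the rectangle's top-left.
    return sprintf('q %.4f 0 0 %.4f %.4f %.4f cm /%s Do Q',
        $sx, -$sy, $x - $sx * $vx, $y + $h + $sy * $vy, $name);
}
```

Because the dimensions come straight from the viewBox, the caller can decide on placement and scaling before any ink is put down, and the form itself is rendered only once.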

As SVG is a graphics language, it is content agnostic. Breaking long (formula) lines is the task of the tool that formats the equation and produces SVG. I don't think this can be done afterwards.

PhilterPaper commented 1 year ago

It would be fine by me to reverse things and translate HPGL to SVG (using some external utility, or a Perl library if there is one). Barcode output would then be directly to SVG rather than to HPGL. MathJax SVG output doesn't use text output, but renders each character as strokes and curves. So, even a reasonably simple SVG-to-PDF renderer should be able to handle MathJax and Barcodes. I don't know yet what Gnuplot needs for SVG, but it can also do raster image output (e.g., PNG), so that's not a show-stopper. Naturally, as the SVG processor is developed, it should be tested against some samples from all three inputs.

My original plans for SVG::Reader were to output some sort of "generic" graphics output, which would then be rendered by PDF::Builder to PDF primitives (producing scalable vector graphics -- that's the critical point). Are you thinking of producing a library module that could be used by either PDF::API2 or PDF::Builder to accept SVG input and create the appropriate PDF output? If so, I can probably scrap (with a sigh of relief) what I've done for SVG::Reader. Don't forget that PDF::Builder's FontManager might prove useful for handling text, relieving you of all the font busy-work that might be needed for text. However, you might have to handle all that yourself, if this library is to also work with PDF::API2.

I have no idea how much CSS support will be needed for my applications (some, I'm sure). I don't think I need any raster graphics support within SVG, but some text will be needed. I can supply for testing a bunch of SVGs for all three sources (four, if I can get an HPGL-to-SVG converter working). I like the idea of producing an XObject which is scaled and positioned from within the graphics context.

sciurius commented 1 year ago

For the purpose of generating barcodes and (as I assume) MathJax and gnuplot SVG I think what I have now is already sufficient. Unfortunately I left it last year in a non-functional state (I screwed up the CSS handling) so I'd have to get it going again.

For the font handling, my intention was to integrate with Text::Layout::FontManager but I left this for a later phase. It would be nice if PDF::Builder::FontManager could be a separate addon, usable with PDF::API2 as well.

sciurius commented 1 year ago

Feel free to shoot me a couple of test SVGs.

PhilterPaper commented 1 year ago

I'll try to come up with a test suite of SVG files in the next week or so: one or two each from MathJax, a dummied-up Barcode, and GNUplot output.

I'm close to releasing PDF::Builder 3.026, but would be willing (after that) to look at pulling out FontManager into a separate package (required for PDF::Builder). I take it that FontManager has all the functionality that you would need for (at least) this work? I seem to recall your asking for such capability in PDF::API2. I think it could work on either system without too much trouble (it of course depends on the usual API2/Builder font routines and objects, etc.). If you want to take a look at it and issue a PR to separate it out (or at least describe to me what you feel needs to be done), be my guest.

A possible alternative would be to simply glom the FontManager code into your own library(ies). It is Open Source, after all, so as long as you give credit in the comments, I can't complain. You might in such a case need to change some routine names so that there's no collision if someone uses it with PDF::Builder. I didn't originally make FontManager a separate package, as it was so small as to not seem worth the effort (and it depends on either PDF::API2 or PDF::Builder, so it's rather special-purpose). Or, you could use Text::Layout for your code, pulled in by the SVG processor. I should test it before you release it, to make sure it's Windows compatible.

Feel free to shoot me...

A dangerous thing to say to a 'murican! :-( It's Independence Day today. :-)

PhilterPaper commented 1 year ago

JV, I just emailed you a package of 10 SVG files you could use to test your SVG-to-PDF code. It includes some hand-built images, MathJax equation output, GnuPlot output, and a dummied-up UPC-A barcode and a QR Code.

If anyone else is interested in helping out on this, I can send the Zips or attach them here.

sciurius commented 1 year ago

Continuing here, since the email traffic seems to have problems.

To handle fonts I want to add a callback function. This will be called from the PDF::SVG converter with three parameters: pdf, gfx and style.

Style is a hash ref with a number of keys, of which the following are most relevant for font handling:

The callback function is responsible for setting the current font to something matching the style. The calling application is free to use whatever mechanism (FontManager, Text::Layout, Font::Config, ...).

PhilterPaper commented 1 year ago

I want to make sure I understand your requirements. Are you looking for PDF::Builder to provide a wrapper (with standard name and parameter list) that calls FontManager() and gfx->text(), or are you looking to have PDF::SVG provide the function (one of at least two, based on whether you're invoking API2 or Builder, or using Text::Layout etc.), that calls FontManager and text in turn? Could you show a code example to illustrate what you want me to provide? I'm not familiar with registering provided code as callbacks, if that's what you're looking to do.

As far as the Style hash goes, FontManager has 'italic' and 'bold' flags (options) 0/1. Potentially there's a problem with different family names (faces) being provided by the SVG via PDF::SVG. For example, FontManager has a default entry (for core fonts) of 'Times' face (not 'Times-Roman'). Depending on the italic and bold flags, Times-Roman, Times-Italic, Times-Bold, or Times-BoldItalic will be used. If someone is going to ask for a face (family) of 'Times-Roman', it won't be found (technically it's wrong anyway -- Roman is simply the non-italic non-bold variant of Times). Either the glue/callback routine you're specifying will need to figure out the correct face name (and possibly the italic and bold flags), or I could add some sort of alias entry for each face, allowing Times-Roman or TimesRoman as an alias for Times face. I really don't want to have to decode full corefont names such as Times-Bold into face=>Times, italic=>0, bold=>1. Let's at least get that settled before going any further on this.
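One way the alias idea might look inside the glue/callback routine is sketched below. Everything here is an assumption for illustration: the alias table, the function name `resolve_face`, and the presence of `font-weight` and `font-style` keys in the style hash (only `font-family` and `font-size` appear in this thread). Treating numeric weights of 600 and above as bold is also just a choice made for the sketch.

```perl
use strict;
use warnings;

# Hypothetical alias table: lower-cased family names as they might arrive
# from SVG, mapped to the face name used in the font registry.
my %FACE_ALIAS = (
    'times'       => 'Times',
    'times-roman' => 'Times',
    'timesroman'  => 'Times',
);

# Hypothetical helper: reduce a CSS-like style hash to face/italic/bold.
sub resolve_face {
    my ($style) = @_;
    # Use the first family of a comma-separated list, without quotes.
    my ($family) = split /,/, ($style->{'font-family'} // '');
    $family //= '';
    $family =~ s/^\s*["']?|["']?\s*$//g;
    my $face = $FACE_ALIAS{lc $family} // $family;

    my $weight = $style->{'font-weight'} // 'normal';
    my $bold   = ($weight =~ /^\d+$/ ? $weight >= 600
                                     : $weight =~ /^bold/i) ? 1 : 0;
    my $slant  = $style->{'font-style'} // 'normal';
    my $italic = $slant =~ /^(?:italic|oblique)$/i ? 1 : 0;

    return (face => $face, italic => $italic, bold => $bold);
}
```

With this kind of table, a request for `'Times-Roman'` with `font-weight: bold` resolves to face `Times` with `bold => 1`, so full core font names such as Times-Bold never need to be decoded.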

PhilterPaper commented 1 year ago

Continuing our offline discussion about translating A/a arc in SVG into something in PDF, I'm not sure what the best path forward is. The description of SVG Arc (path command) sounds a lot like a "bogen", which defines a (circular) arc and you need to specify the larger or smaller of the resulting arcs between two points, given the radius, as well as the direction of travel. A Builder "arc" call expects you to specify the center point, radii, and sweep angles. I spent much of yesterday trying to determine the center point, given the radii and two points, but so far haven't come up with a general case (just constant x and constant y cases). Builder has _arctocurve() to break down an elliptical arc into short chords, from two points and two radii, so much of the work may already be done. An arc() call doesn't ask for the "larger" or "smaller" arc -- I suppose specifying the center point of the ellipse takes care of selecting which arc. Otherwise it would be much like the bogen() call with two radii instead of one radius? I will play with it some more, and try emailing you some code.
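For reference, once the center, radii, axis rotation, start angle, and sweep are known, an elliptical arc can be approximated with cubic Bezier segments of at most 90 degrees each. The sketch below uses the common `4/3 * tan(step/4)` control-point distance. It is not Builder's `_arctocurve()` and the function name is hypothetical. All angles are in radians.

```perl
use strict;
use warnings;

# Hypothetical helper: approximate an elliptical arc by cubic Beziers.
# Returns a list of [x1,y1, x2,y2, x3,y3] (two control points and the end
# point); the start point of each segment is the previous end point.
sub ellipse_arc_to_beziers {
    my ($cx, $cy, $rx, $ry, $phi, $theta1, $dtheta) = @_;
    my $pi   = 4 * atan2(1, 1);
    # Use segments of at most a quarter turn.
    my $nseg = int(abs($dtheta) / ($pi / 2) - 1e-9) + 1;
    my $step = $dtheta / $nseg;
    my $k    = 4 / 3 * sin($step / 4) / cos($step / 4);
    my ($cp, $sp) = (cos($phi), sin($phi));

    # Map a unit-circle point to the rotated, translated ellipse.
    my $map = sub {
        my ($ux, $uy) = @_;
        my ($ex, $ey) = ($rx * $ux, $ry * $uy);
        return ($cx + $cp * $ex - $sp * $ey, $cy + $sp * $ex + $cp * $ey);
    };

    my @curves;
    my $t = $theta1;
    for (1 .. $nseg) {
        my $t2 = $t + $step;
        my ($c1, $s1) = (cos($t),  sin($t));
        my ($c2, $s2) = (cos($t2), sin($t2));
        push @curves, [
            $map->($c1 - $k * $s1, $s1 + $k * $c1),   # first control point
            $map->($c2 + $k * $s2, $s2 - $k * $c2),   # second control point
            $map->($c2, $s2),                         # segment end point
        ];
        $t = $t2;
    }
    return @curves;
}
```

Each returned set could be handed to a cubic Bezier call such as the content object's `curve()` method. A negative `$dtheta` reverses the direction of travel, which is how the SVG sweep flag would be expressed here.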

sciurius commented 1 year ago

For the arcs: Yes, it would be just like bogen but with 2 radii. The new routine would effectively supersede bogen, since you can get the bogen functionality by passing two equal radii.

As for progress: Attached is the PDF for ATS_flow. ATS_flow

sciurius commented 1 year ago

For the callback: This is how the converter is currently used:

    my $p = PDF::SVG->new
      ( ps => { pr => { pdf => $pdf } },
        atts => { },
        fc => \&my_fc_handler );
    $p->process($data);      # or filename or file handle
    my $o = $p->xoforms;
    for ( @$o ) {
        # $_ is a hash containing, among others, `xo`, the XObject with the image.
    }

(Note that the nested ps/pr/pdf hashes are how my current API works. This will probably change.)

    use feature 'signatures';    # the handler uses a subroutine signature

    sub my_fc_handler( $pdf, $gfx, $style ) {

        # Determine font using $style->{'font-family'} etc..
        my $the_font = $pdf->font(...);
        my $the_size = $style->{'font-size'};

        # And set it.
        $gfx->font( $the_font, $the_size );
    }

So PDF::Builder provides the callback function and has it interface with FontManager.

sciurius commented 1 year ago

Progress so far -- Rewrote much of the code from PoC to operational quality. API is stabilizing and I started to document it.

pdfsvg.html.gz

Attached are the results from the unmodified SVG samples that you sent me. Note that output for MJdisplay is missing due to the use of embedded <svg> (that I have not completely implemented).

progress.pdf

I hope you can fix the elliptic arc code.

PhilterPaper commented 1 year ago

Docs and results are looking nice!

As I said before, if necessary I think we can do without the recursive <svg> capability -- it seems to be for a display equation tag (label) over on the right side, in a dynamically variable-width page (with centered equation). If it's not a clean and reasonably simple fix, don't worry about it (I could specify a tag separately and right-justify it vertically centered on the display image, with the current font). I think I sent you a replacement SVG without the tag and thus without a recursive call.

I'm still looking at the elliptical arc bogen derivative. I think I found some misbehavior in the bogen (circular arc) code that will need to be fixed. :-( I also need to look at the font handling (callback, etc.).

sciurius commented 1 year ago

I'm about to give up on the MJdisplay. I've got most of the embedded <svg> implemented and it seems to work fine with most examples that I can find on the web, but the SVG that MathJax produces is insane and I cannot even imagine how it should ever work.[^1]

I'm not alone, none of the SVG tools I have at hand succeed in processing MJdisplay, only the major browsers can.

[^1] For example, what to do with a viewport given by viewBox="21707.5 -1749.5 1 2999". Yes, that's a box 1 pixel wide and 2999 pixels high.

PhilterPaper commented 1 year ago

Back on the 11th I sent you a revision of the Display equation without a tag "(1a)". Did that one work? I see that it still has a bizarre viewBox similar to the one above. Let me know if you need a resend. There may be something weird going on with the viewBox for Display vs Inline equation SVG creation.

By the way, in case you haven't seen it, the SVG definition shows how to handle a number of degenerate cases:

See https://www.w3.org/TR/SVG/paths.html#PathDataEllipticalArcCommands

If I could get the ellipse center (xc,yc) from P1 and P2, rx and ry (well, one of the two ellipses that fit), we could just use the PDF::* arc() method to draw the curve. I have not yet been able to figure it out (I can get Bxc + Dyc = -(A+B) but not a second linear equation to solve for xc and yc). I need to study: https://stackoverflow.com/questions/197649/how-to-calculate-center-of-an-ellipse-by-two-points-and-radius-sizes . There is a nice clean formula for circular arc bogens, but apparently the "sweep angle" for the second point doesn't work for elliptical arcs (i.e., is specific to circular). For a couple of trivial cases (y1==y2 OR x1==x2) it's possible to find xc and yc for an ellipse, but for a general case I'm stuck. It must be possible, for SVG calculates the arc without an explicit ellipse center!

SVG also allows rotated (around xc,yc) axes for the semimajor and semiminor diameters (a "tilted" ellipse), which should be addressed for the most general translation of SVG "A" to PDF.
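For reference, the SVG implementation notes describe a conversion from the endpoint form (P1, P2, rx, ry, rotation, two flags) to the center form. The summary below is a transcription from memory of that text, so it should be checked against the specification before use. Here φ is the axis rotation, f_A and f_S are the large-arc and sweep flags, and the coordinates are SVG's (y grows down), which may differ from PDF conventions.

```latex
\begin{aligned}
\begin{pmatrix} x_1' \\ y_1' \end{pmatrix}
  &= \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}
     \begin{pmatrix} (x_1 - x_2)/2 \\ (y_1 - y_2)/2 \end{pmatrix} \\[4pt]
\begin{pmatrix} x_c' \\ y_c' \end{pmatrix}
  &= \pm \sqrt{\frac{r_x^2 r_y^2 - r_x^2 y_1'^2 - r_y^2 x_1'^2}
                    {r_x^2 y_1'^2 + r_y^2 x_1'^2}}
     \begin{pmatrix} r_x y_1' / r_y \\ -\,r_y x_1' / r_x \end{pmatrix} \\[4pt]
\begin{pmatrix} x_c \\ y_c \end{pmatrix}
  &= \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}
     \begin{pmatrix} x_c' \\ y_c' \end{pmatrix}
   + \begin{pmatrix} (x_1 + x_2)/2 \\ (y_1 + y_2)/2 \end{pmatrix}
\end{aligned}
```

The sign in the second line is + when f_A ≠ f_S and − when f_A = f_S, which is how one of the two candidate ellipses gets selected. The quantity under the square root goes negative when rx and ry are too small to span the two points, so the radii have to be enlarged first in that case.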

One thing to clear up: how do you read the PDF::API2 and PDF::Builder bogen definitions, with regard to the "large" arc switch and "direction"/"flipped"/"reversed" arcs? Two points define two possible circles or ellipses, from which we select one of four possible arcs:

Unfortunately I can't spend much time on this right now; I have a customer store in crisis. :-(

Add: @terefang, on the chance that you wrote and named this function, do you recall which arc the "larger" was intended to be? Namely, is it the complement (remainder) of the short arc, or is it always "flipped" to the other side? The former case has to reverse the direction of the arc, reversing the $spf parameter's meaning, while the latter case "flips" to the other circle, while preserving the direction.

terefang commented 12 months ago

hmm ... 20+ year old code.

i remember that i implemented arc/bogen drawing using clockwise semantics.

me thinks a picture would explain that

image

using "flip" would switch to counter-clockwise drawing instead.

terefang commented 12 months ago

looking at the bogen code.

i definitely did not write the code for the $spf flag handling.

PhilterPaper commented 12 months ago

OK, thanks for checking. I suspected that you might have been involved, as "bogen" is a German word and you're Austrian, and one of the original authors of API2. I don't know if you looked at the API2 code or Builder, but I extensively rewrote much of Content.pm (including bogen()) for Builder.

It appears that both API2 and Builder implement it as you intended (flip to "other" circle for the "larger" arc), preserving the direction of travel, rather than reversing the direction to take the remainder arc. (Yes, the small arcs are the same size in both cases, and the large arcs are the same size -- it's just which circle is selected.)
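The "two circles, four arcs" picture can be made concrete with a little geometry. The sketch below is not taken from either library: `circle_centers` is a hypothetical helper that only computes the centers of the two circles of radius r through P1 and P2. Each center then offers a short and a long arc between the points. How those map onto the bogen() flags is the question discussed above and is not encoded here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::API2 or PDF::Builder):
# centers of the two circles of radius $r passing through P1 and P2.
sub circle_centers {
    my ($x1, $y1, $x2, $y2, $r) = @_;
    my ($dx, $dy) = ($x2 - $x1, $y2 - $y1);
    my $d = sqrt($dx**2 + $dy**2);              # chord length
    die "points coincide\n"  if $d == 0;
    die "radius too small\n" if $r < $d / 2;
    my $h = sqrt($r**2 - ($d / 2)**2);          # chord midpoint to center
    my ($midx, $midy) = (($x1 + $x2) / 2, ($y1 + $y2) / 2);
    my ($nx, $ny) = (-$dy / $d, $dx / $d);      # unit normal to the chord
    return ([ $midx + $h * $nx, $midy + $h * $ny ],
            [ $midx - $h * $nx, $midy - $h * $ny ]);
}

my @centers = circle_centers(0, 0, 6, 0, 5);
printf "(%g, %g)\n", @$_ for @centers;          # prints (3, 4) then (3, -4)
```

When r equals half the chord length the two centers coincide at the midpoint, and there are only two distinct arcs (the two semicircles).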

I will update the Builder bogen() documentation to clarify what's going on.

sciurius commented 12 months ago

> Back on the 11th I sent you a revision of the Display equation without a tag "(1a)".

Yes, that one yields almost ok (there's a small issue with standalone t-paths that I have to look into).

MJdisplayNoTag.pdf

sciurius commented 12 months ago

Progress so far: Added partial support for @font-face and solved the MathJax t-paths.

BTW I ran into the problem that the MathJax formulae turned out to be a bit fat. Investigation showed that MathJax displays glyphs using curves (we already knew that) but the glyphs are filled and stroked. The stroke is very thin which, according to the PDF specs, should result in the smallest stroke possible, i.e. 1 device pixel. On screen 1 device pixel is, however, substantial. The browsers seem to solve this by making the strokes more or less transparent, something we cannot do in PDF. I have added a special tweak in case MathJax glyphs are drawn to disable the stroke, which gives much better results.

mj.pdf

PhilterPaper commented 12 months ago

Looking pretty nice. The "over" division line is a tad thick -- they're filled rectangles with "none" stroke color. Can't you just draw the path and only fill it? Are there any glyphs which are intended to show strokes? Globally, both the stroke color and fill color are the currentColor (black), and stroke-width is 0. PDF insists on a minimum width of a line? If you don't stroke the path (seeing that stroke-width is 0), and just fill it, shouldn't you be OK? This might work only for MathJax, so be sure to check that it doesn't break GNUplot and other SVG sources.

sciurius commented 12 months ago

Stroke width is not 0, it is 3px (see the CSS rule use[data-c]{stroke-width:3px}). Given the scaling (43414.9/98.224ex ≈ 0.014) this would be a stroke width of approx. 0.04px. The rectangles are within the outer g, which has stroke="currentColor", but since stroke-width is zero there should be no strokes at all. I've fixed this.

samples.pdf

PhilterPaper commented 11 months ago

JV, I'm still working on this -- if rx or ry is too small or large, it needs to increase/decrease them proportionately so that the found ellipse fits the two given points. I'm having a lot of trouble trying to come up with a general algorithm for this, and will have to think about it some more. There are also a couple of test cases where angles that aren't multiples of 90 degrees don't quite work yet.
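One possible approach to the "too small" case is the radius correction from the SVG implementation notes: rotate the half-chord into the ellipse's axes, compute Λ = (x1'/rx)² + (y1'/ry)², and multiply both radii by √Λ when Λ > 1. The sketch below assumes that rule as I recall it from the spec. `scale_radii` is a hypothetical name, not the bogen_ellip() code, and it does not shrink radii that are larger than needed (those already fit).

```perl
use strict;
use warnings;
use Math::Trig qw(deg2rad);

# Hypothetical helper (not the PDF::Builder implementation): grow rx and ry by
# a common factor when the ellipse cannot reach both endpoints.
# Assumes rx and ry are nonzero; SVG treats a zero radius as a straight line.
sub scale_radii {
    my ($x1, $y1, $x2, $y2, $rx, $ry, $rot_deg) = @_;
    ($rx, $ry) = (abs $rx, abs $ry);
    my $phi = deg2rad($rot_deg // 0);

    # Half the chord, rotated into the ellipse's own axes.
    my $dx = ($x1 - $x2) / 2;
    my $dy = ($y1 - $y2) / 2;
    my $xp =  cos($phi) * $dx + sin($phi) * $dy;
    my $yp = -sin($phi) * $dx + cos($phi) * $dy;

    # Lambda > 1 means the ellipse is too small; scale both radii equally.
    my $lambda = ($xp / $rx)**2 + ($yp / $ry)**2;
    if ($lambda > 1) {
        my $s = sqrt $lambda;
        ($rx, $ry) = ($rx * $s, $ry * $s);
    }
    return ($rx, $ry);
}

my ($rx, $ry) = scale_radii(0, 0, 10, 0, 2, 1, 0);
printf "rx=%g ry=%g\n", $rx, $ry;               # prints rx=5 ry=2.5
```

Scaling both radii by the same factor keeps the rx:ry ratio, so the enlarged ellipse has the same shape and just barely spans the two points.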

I also want to put in optionally drawing the found ellipse(s) so you can see what it's doing, and enable axis rotation (needed by the SVG Arc command). I think since I've put this much work into it, I will go ahead and put it into PDF::Builder (Content.pm), along with a new examples/bogens.pl. Who knows how long it will take to show up in PDF::API2, so you might as well embed a copy in your code (if I name it bogen_ellip, do you see any danger of a name collision with your copy?). I may even just remove the current bogen() code and use the new method with rx=ry=r (in a bogen() stub).

Do you need a fully working bogen_ellip() right away? If so, I can send you what I've got, but if the angles are not multiples of 90 degrees, it may not work quite right. There's not a great rush to use PDF::SVG (or whatever you call it) -- I need to get PDF::Builder 3.026 out the door first. Then for 3.027, I will add general SVG image support (using PDF::SVG), an interface to GnuPlot to generate on-the-fly graphs, an interface to MathJax for equations, and enhancements to column() to support images and equations. Along with any minor fixes and enhancements, that should be a full plate, and probably will be at least 3 or 4 months (towards the end of the year) before release.

sciurius commented 11 months ago

Maybe I'm totally off, but does this help: https://gitlab.gnome.org/GNOME/librsvg/-/blob/main/rsvg/src/path_builder.rs (line 199 and on)?

Currently it is not important how you name the function, and whether there are two or one. The functions are internal to PDF::SVG so I can change function names if necessary. And I'll include the function in PDF::SVG as long as needed. I do not need the fully working function for my --and your-- purposes but anything that more or less works is better than nothing.

I have successfully integrated PDF::SVG in ChordPro, using the Text::Layout::FontManager. It works really nice.

Other improvements: alternatives for font-family, e.g. "Verdana, Times-Roman, Serif".

PhilterPaper commented 11 months ago

Interesting. The source points to the W3.org Tech Report on SVG Elliptical Arcs, which presents all the steps. It's a somewhat different approach than what I inherited from bogen(), but I'll have to work through it and see if it does a better job. Of special interest is changing the sizes of rx and ry -- does it do better than what I had come up with? I also see that it claims to come up with the proper arc without having to find both ellipses. I need to check if it uses the same coordinate system and direction of rotations as the rest of PDF uses. Well, I'll have to start nibbling at it this week, in between fixing the customer's not-as-broken-now shop.

> Other improvements: alternatives for font-family, e.g. "Verdana, Times-Roman, Serif".

Is this something you already implemented in Text::Layout::FontManager, or is it something you're asking for in my FontManager? Does this change the callback you specified earlier? My understanding is that in HTML/CSS, a font fallback list is examined whenever a particular glyph cannot be found in the current font. I don't think it's intended to pick Times-Roman if Verdana is unavailable, and a generic Serif if Times-Roman isn't found, although that may be a practical effect. What does yours do?

sciurius commented 11 months ago

It just processes the list and chooses the first font that is available. As far as the font manager callback is concerned, nothing changes. In the above example the callback will be invoked for Verdana and if it doesn't resolve, it will be called again for Times-Roman, etc. I'm not going deeper, i.e., the glyph level. This should be dealt with at the appropriate (low) level, maybe with the help of some external library. Doing it all in Perl will slow down PDF generation to a possibly unacceptable performance. HarfBuzz?
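The "first available family wins" behavior described here can be sketched in a few lines. Everything below is illustrative: `first_available_font` and the `%installed` table are hypothetical stand-ins, and the lookup callback merely plays the role of the font manager callback. It is not the PDF::SVG or Text::Layout::FontManager code.

```perl
use strict;
use warnings;

# Hypothetical resolver: walk a CSS-style font-family list and return the
# first family for which the caller's lookup callback yields a font.
sub first_available_font {
    my ($family_list, $lookup) = @_;
    for my $family (split /\s*,\s*/, $family_list) {
        $family =~ s/^\s+|\s+$//g;          # trim surrounding whitespace
        $family =~ s/^(["'])(.*)\1$/$2/;    # strip matching quotes
        next unless length $family;
        my $font = $lookup->($family);      # one callback call per family
        return ($family, $font) if defined $font;
    }
    return;                                 # nothing resolved
}

# Stand-in for a font manager: only this family "exists" here.
my %installed = ('Times-Roman' => 'font-object-for-Times-Roman');

my ($name, $font) = first_available_font(
    'Verdana, Times-Roman, Serif',
    sub { $installed{ $_[0] } },
);
print "chose $name\n";                      # prints: chose Times-Roman
```

This selects a whole font once, up front. Per-glyph fallback, as browsers do it, would need a check at text-shaping time and is deliberately left out.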

At the user level it would be nice if I would be able to write:

    $text->font("Verdana, Times-Roman, Serif",12);
    $text->text("Arbitrary glyphs go here");

PhilterPaper commented 11 months ago

OK, finally I think I have a working elliptical bogen method that you can use for the SVG "arc" command. It was a lot of work, as in some cases the algorithms used a flipped coordinate system (y grows down, positive angles are clockwise). I added the x/y axis rotation that arc needs, and also added a "background" full ellipse to clarify how the arcs are generated. Note that the calls have changed a bit, with move, direction, and large/small flags moved to %opts, along with new rotate and full (background ellipse).

The attached Perl file uses either PDF::Builder or PDF::API2, and produces the same results (except on page 1, where I think it demonstrates some bugs in the API2 bogen() call). Page 2 is the same as page 1, except that bogen_ellip() is used with rx = ry. Page 3 demonstrates elliptical bogens, page 4 rotates their rx/ry axes by +30 or -60 degrees, and page 5 shows the effects of too small rx or ry (grows them enough to get it to work).

bogen_ellip.pl.txt

JV, once you tell me you're happy with bogen_ellip(), I will put it (and the example) into PDF::Builder.

sciurius commented 11 months ago

Great job! I now get the expected results from my SVG arcs!

A couple of minor details.

bogen_ellip now has an %opts. Do you intend to change that for circular bogen as well?

The POD still uses @opts.

Is there an advantage of using bogen over bogen_ellip with r1=r2? If not, we could eliminate bogen (have it call bogen_ellip).

In the SVG package, I've added the bogen and bogen_ellip code in a separate package, PDF::Builder::Bogen. This is to keep the code separate and give credit to PDF::Builder (and you). In some future (when both Builder and API2 support the calls) it may be removed.

Thanks a lot!