jfree / jfreesvg

A fast, lightweight Java library for creating Scalable Vector Graphics (SVG) output.
http://www.jfree.org/jfreesvg
GNU General Public License v3.0
319 stars 58 forks source link

Optimizations for smaller output #29

Open ghost opened 3 years ago

ghost commented 3 years ago

Configurable Features

What do you think SVG of having pluggable features that can be disabled? For my purposes, I don't need linear gradients, radial gradients, embedded images, or clipping. At the moment, instantiating SVGGraphics2D also allocates hash maps and array lists, without necessity (in my case). Additionally, the filePrefix and fileSuffix variables are assigned, even though those values will never be used. Same for the defsKeyPrefix and call to System.nanoTime().

By default all these features can be enabled, yet having a configurable way to disable them (builder pattern, maybe?) would be handy. Something like:

new SVGGraphics2D.Builder()
  .setLinearGradients(false)
  .setRadialGradients(false)
  .setEmbeddedImages(false)
  .setDefinitionKeys(false)
  .setEmbeddedFonts(false)
  .build()

The setEmbeddedFonts(false) would be synonymous with setting the rendering hint key of KEY_DRAW_STRING_TYPE to the value of VALUE_DRAW_STRING_TYPE_VECTOR.

Configurable Element Output

Along the lines of configurable features, having a bit more control over the SVG output would be amazing. For example, I'd like to eliminate the xmlns:jfreesvg namespace attribute declaration along with xmlns:xlink and <defs>. This releates to the configurable features idea. Since I know I don't need gradients or clips, those inner loops will never execute. Having the ability to suppress <defs></defs> altogether would be grand.

Here's an SVG generated with JFreeSVG:

equation-9.txt

Here's the same SVG after minifying using svgcleaner:

equation-9-sm.txt

I'd like to get the output from JFreeSVG closer to the minified version to help improve downstream rendering performance.

Configurable Buffer Size

SVGGraphics2D calls the default StringBuilder constructor, which is sized to about 16 bytes. This will result in memory reallocations and array copies. We know that the minimum size for SVG contents will exceed 16 bytes.

The constructor provides a way to provide a StringBuilder of a preallocated size, but that strikes me as an implementation detail that shouldn't be exposed to calling classes. Rather, how about the following:

public SVGGraphics2D( width, height, units, bufferSize )

This would allow for reusing the buffer (as opposed to reallocating it) by calling setLength() on the buffer. Moreover, since we know the start of the SVG is static, we can do:

StringBuilder buffer = new StringBuilder(bufferSize);
String prefix = "<?xml version='1.0'?><svg ";
buffer.append( prefix );
// ... code that uses the SVGGraphics2D
// ... then, internally, reset back to the starting point, no need to re-append:
buffer.setLength( prefix.length() );

The constructor comment appears to be incorrect:

This constructor is used by the {@link #create()} method, but won't normally be called directly by user code.

Instead, it may be more accurate to note that clients can use this constructor to provide a custom buffer (to avoid micro-reallocations); it did not appear that the create method called this particular constructor. Maybe an outdated comment?

Profiling

Here's a profile dump of the application's hot spot:

--- 2539794705 ns (8.27%), 254 samples
  [ 0] jdk.internal.math.FloatingDecimal$BinaryToASCIIBuffer.dtoa
  [ 1] jdk.internal.math.FloatingDecimal.getBinaryToASCIIConverter
  [ 2] jdk.internal.math.FloatingDecimal.getBinaryToASCIIConverter
  [ 3] java.text.DigitList.set
  [ 4] java.text.DecimalFormat.doubleSubformat
  [ 5] java.text.DecimalFormat.format
  [ 6] java.text.DecimalFormat.format
  [ 7] java.text.NumberFormat.format
  [ 8] org.jfree.svg.SVGGraphics2D.geomDP
  [ 9] org.jfree.svg.SVGGraphics2D.getSVGPathData
  [10] org.jfree.svg.SVGGraphics2D.fill
  [11] org.jfree.svg.SVGGraphics2D.drawGlyphVector
  [12] sun.font.ExtendedTextSourceLabel.handleDraw
  [13] sun.font.Decoration.drawTextAndDecorations
  [14] sun.font.ExtendedTextSourceLabel.draw
  [15] java.awt.font.TextLine.draw
  [16] java.awt.font.TextLayout.draw
  [17] org.jfree.svg.SVGGraphics2D.drawString
  [18] org.jfree.svg.SVGGraphics2D.drawString
  [19] org.jfree.svg.SVGGraphics2D.drawString
  [20] java.awt.Graphics.drawChars
  [21] be.ugent.caagt.jmathtex.CharBox.draw
  [22] be.ugent.caagt.jmathtex.HorizontalBox.draw
  [23] com.whitemagicsoftware.tex.TeXFormulaTest.test_MathML_SimpleFormula_Success
  [24] com.whitemagicsoftware.tex.TeXFormulaTest.main
mhschmieder commented 3 years ago

That's an interesting suggestion, though I'm not quite ready for final thoughts myself as I only recently switched to JFreeSVG from Batik, so I'm still in the "pleased as punch" column regarding its relative ease and performance, but SVG definitely has a wide range of uses and contexts compared to other vector graphics formats and not all of them require configuring every possible feature or subset. So I think it's good that you put forward this proposal.

I think I recall the Builder Pattern being heavily deprecated (and eliminated internally) by Oracle starting mid-way in the Java 8 development cycle, but don't take my word for it. I've seen many third-party libraries go back and forth on this. I try to follow whichever decision is dominant in the library I'm working with.

I personally dislike the Builder Pattern due to stronger likelihood of null references, complexity of debugging, programming errors, etc. But I'm not religious about these things, and anyway I'm just a contributor and not in any leadership role on this project. I'd also want to first see what's coming in Java 15 (and get more up-to-speed with Java 14) to see if there's an ideal implementation pattern that also mitigates your resource consumption concerns (vs. the existing hash map approach).

ghost commented 3 years ago

I personally dislike the Builder Pattern due to stronger likelihood of null references, complexity of

An SVGContext would work as well; any simple API to disable/enable functionality for further optimizations and control over the output would be grand. Avoiding null and favouring immutable states would be ideal.

mhschmieder commented 3 years ago

I agree; thanks for thinking this through in detail. I hope to be able to get back to contributing to these projects fairly soon (the job search is 100 hours a week still), and would look forward to running some experiments if I can find the time. It should be just about time for Java 15's release as well, so I'll look for those Release Notes (maybe today) to see if there are any language-level goodies this time around.

UPDATE: I just looked over Java 15's new features (including ones that have been in incubation since Java 13 or Java 14) and there's some powerful stuff in this upcoming release, but much of it would probably be too invasive initially for an existing API. The Text Blocks feature looks to be "official" as of Java 15 and might help readability and maintainability of some strings required for syntax and formatting of SVG here and there though.

ghost commented 3 years ago

Here are some additional findings:

          ns  percent  samples  top
  ----------  -------  -------  ---
  5849788876   18.68%      585  jdk.internal.math.FloatingDecimal$BinaryToASCIIBuffer.dtoa
  3789943918   12.10%      379  java.text.DecimalFormat.subformatNumber
  2149971923    6.87%      215  java.text.DigitList.round
  1299964785    4.15%      130  org.jfree.svg.SVGGraphics2D.getSVGPathData
   959968827    3.07%       96  java.text.DecimalFormat.append
   909960295    2.91%       91  java.text.DigitList.set
   789971300    2.52%       79  java.text.DecimalFormat.doubleSubformat
   689991284    2.20%       69  java.lang.StringLatin1.newString
   659992380    2.11%       66  jdk.internal.math.FloatingDecimal.getBinaryToASCIIConverter
   629988897    2.01%       63  java.text.DigitList.set
   529989725    1.69%       53  jdk.internal.math.FloatingDecimal$BinaryToASCIIBuffer.getChars

Using DecimalFormat may not be ideal here because formatting for I18N numbers and other scenarios needn't be considered.

An implementation of the Ryū algorithm could prove helpful. See repo and implementation.

ghost commented 3 years ago

PR https://github.com/jfree/jfreesvg/pull/30 integrates the Ryū algorithm.

ghost commented 3 years ago

In addition to the Ryū algorithm, there are plenty more minor optimizations that can be made that add up to fairly significant savings. Here are some suggestions, though not all may be applicable or even appropriate:

The result will be another doubling in efficiency. I've dropped JFreeSVG in favour of my own, stripped down version. Where before, doubleToString was taking up ~40% of the total time, it now takes up ~74%:

          ns  percent  samples  top
  ----------  -------  -------  ---
 22889770543   73.83%     2289  com.whitemagicsoftware.tex.RyuDouble.doubleToString
  2500002585    8.06%      250  com.whitemagicsoftware.tex.SvgGraphics2D.toGeometryPrecision

The total time for 1 million TeX to SVG images dropped from ~7 minutes to ~4:45 minutes.

To be concrete, below is some code that may give you some ideas, feel free to use whatever you like for JFreeSVG. This is a partial implementation tuned to the needs for generating TeX formulae, so there's a lot going on with JFreeSVG that isn't required. YMMV.

public final class SvgGraphics2D extends Graphics2D {
  private static final int DEFAULT_SVG_BUFFER_SIZE = 65536;
  private static final String XML_HEADER = "<?xml version='1.0'?><svg ";

  /**
   * Number of decimal places for geometric shapes.
   */
  private static final int DECIMALS_GEOMETRY = 4;

  /**
   * Number of decimal places for matrix transforms.
   */
  private static final int DECIMALS_TRANSFORM = 6;

  /**
   * Initialized with a capacity of {@link #DEFAULT_SVG_BUFFER_SIZE} to
   * minimize the number of memory reallocations.
   */
  private final StringBuilder mSvg;

  /**
   * Filled when drawing paths, not thread-safe.
   */
  private final float[] mCoords = new float[ 6 ];

  /**
   * Transform attribute value, a matrix function.
   */
  private String mTransform = "";

  private Color mColour = BLACK;
  private Font mFont = new Font( "Default", Font.PLAIN, 12 );
  private AffineTransform mAffineTransform = new AffineTransform();
  private final FontRenderContext mRenderContext =
      new FontRenderContext( null, false, true );

  /**
   * Creates a new instance with a default buffer size. Calling classes must
   * call {@link #setDimensions(int, int)} before using the class to ensure
   * the width and height are added to the document.
   */
  public SvgGraphics2D() {
    this( DEFAULT_SVG_BUFFER_SIZE );
  }

  /**
   * Creates a new instance with a given buffer size. Calling classes must
   * call {@link #setDimensions(int, int)} before using the class to ensure
   * the width and height are added to the document.
   */
  public SvgGraphics2D( final int initialBufferSize ) {
    mSvg = new StringBuilder( initialBufferSize ).append( XML_HEADER );
  }

  @Override
  public void draw( final Shape shape ) {
    mSvg.append( "<g" );

    if( !mAffineTransform.isIdentity() ) {
      mSvg.append( " transform='" )
          .append( mTransform )
          .append( '\'' );
    }

    mSvg.append( '>' )
        .append( "<path " );
    appendPath( (Path2D) shape, mSvg );
    mSvg.append( "/>" )
        .append( "</g>" );
  }

  private void appendPath( final Path2D path, final StringBuilder buffer ) {
    if( path.getWindingRule() == 0 ) {
      buffer.append( "fill-rule='evenodd' " );
    }

    buffer.append( "d='" );
    final var iterator = path.getPathIterator( null );

    while( !iterator.isDone() ) {
      int type = iterator.currentSegment( mCoords );

      switch( type ) {
        case 0 -> buffer.append( 'M' )
                        .append( toGeometryPrecision( mCoords[ 0 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 1 ] ) );
        case 1 -> buffer.append( 'L' )
                        .append( toGeometryPrecision( mCoords[ 0 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 1 ] ) );
        case 2 -> buffer.append( 'Q' )
                        .append( toGeometryPrecision( mCoords[ 0 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 1 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 2 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 3 ] ) );
        case 3 -> buffer.append( 'C' )
                        .append( toGeometryPrecision( mCoords[ 0 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 1 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 2 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 3 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 4 ] ) )
                        .append( ' ' )
                        .append( toGeometryPrecision( mCoords[ 5 ] ) );
        case 4 -> buffer.append( 'Z' );
      }

      iterator.next();
    }

    buffer.append( '\'' );
  }

  @Override
  public void fill( final Shape shape ) {
    if( shape instanceof Rectangle2D ) {
      final var rect = (Rectangle2D) shape;

      mSvg.append( "<rect x='" )
          .append( toGeometryPrecision( rect.getX() ) )
          .append( "' y='" )
          .append( toGeometryPrecision( rect.getY() ) )
          .append( "' width='" )
          .append( toGeometryPrecision( rect.getWidth() ) )
          .append( "' height='" )
          .append( toGeometryPrecision( rect.getHeight() ) );

      if( !mAffineTransform.isIdentity() ) {
        mSvg.append( "' transform='" )
            .append( mTransform );
      }

      // Double-duty: closes either height or transform.
      mSvg.append( "'/>" );
    }
    else {
      draw( shape );
    }
  }

  @Override
  public void drawString( final String glyphs, final float x, final float y ) {
    assert glyphs != null;

    final var font = getFont();
    final var frc = getFontRenderContext();
    final var gv = font.createGlyphVector( frc, glyphs );
    drawGlyphVector( gv, x, y );
  }

  @Override
  public void drawString( final String glyphs, final int x, final int y ) {
    assert glyphs != null;
    drawString( glyphs, (float) x, (float) y );
  }

  @Override
  public void drawGlyphVector(
      final GlyphVector g, final float x, final float y ) {
    fill( g.getOutline( x, y ) );
  }

  @Override
  public void translate( final int x, final int y ) {
    translate( x, (double) y );
  }

  @Override
  public void translate( final double tx, final double ty ) {
    final var at = getTransform();
    at.translate( tx, ty );
    setTransform( at );
  }

  @Override
  public void scale( final double sx, final double sy ) {
    final var at = getTransform();
    at.scale( sx, sy );
    setTransform( at );
  }

  @Override
  public FontRenderContext getFontRenderContext() {
    return mRenderContext;
  }

  @Override
  public Font getFont() {
    return mFont;
  }

  @Override
  public void setFont( final Font font ) {
    assert font != null;
    mFont = font;
  }

  @Override
  public Color getColor() {
    return mColour;
  }

  @Override
  public void setColor( final Color colour ) {
    mColour = colour;
  }

  @Override
  public void setTransform( final AffineTransform at ) {
    assert at != null;
    mAffineTransform = new AffineTransform( at );
    mTransform = toString( at );
  }

  @Override
  public AffineTransform getTransform() {
    return (AffineTransform) mAffineTransform.clone();
  }

  /**
   * Resets the SVG buffer to a new state. This method must be called before
   * calling drawing primitives.
   *
   * @param w The final document width (in pixels).
   * @param h The final document height (in pixels).
   */
  public void setDimensions( final int w, final int h ) {
    mSvg.setLength( XML_HEADER.length() );
    mSvg.append( "width='" )
        .append( w )
        .append( "px' height='" )
        .append( h )
        .append( "px'>" );
  }

  /**
   * Generates a matrix transformation string of the given transform.
   *
   * @param at The transform to convert into a string.
   * @return A matrix transformation string.
   */
  private String toString( final AffineTransform at ) {
    return "matrix(" +
        toTransformPrecision( at.getScaleX() ) + ',' +
        toTransformPrecision( at.getShearY() ) + ',' +
        toTransformPrecision( at.getShearX() ) + ',' +
        toTransformPrecision( at.getScaleY() ) + ',' +
        toTransformPrecision( at.getTranslateX() ) + ',' +
        toTransformPrecision( at.getTranslateY() ) + ')';
  }

  /**
   * Call when no more graphics operations are pending and the content is safe
   * to convert to an SVG representation.
   *
   * @return A complete SVG string that can be rendered to reproduce the TeX
   * primitives.
   */
  @Override
  public String toString() {
    return mSvg.append( "</svg>" ).toString();
  }

  /**
   * Has no effect; call {@link #setDimensions(int, int)} to reset this instance
   * to create another SVG document.
   */
  @Override
  public void dispose() {
  }

  private static String toGeometryPrecision( final double value ) {
    return doubleToString( value, DECIMALS_GEOMETRY );
  }

  private static String toTransformPrecision( final double value ) {
    return doubleToString( value, DECIMALS_TRANSFORM );
  }
mhschmieder commented 3 years ago

This is all really good stuff -- great work! I had meant to review some of these aspects myself, but as SVG is the vector graphics format that I have the least experience with (as compared to PDF, EPS, PS and a few others), I was nervous about making mistakes until studying things further.

But I did make similar changes in a very resource-challenged DXF (AutoCAD) parser and in some of my Apache POI handling, and found especially that the StringBuilder append strategy made a minimum of 10:1 difference in some cases and up to a 40:1 or even 100:1 difference in others (e.g. inside tight loops or functions called from tight loops).

Although I would have to take the time to check one of my own projects in depth to re-find the specifics, I also was surprised at what a difference the AffineTransform and Path/GeneralPath/Shape approaches can make in memory and performance.

Looking at the Java source code makes this pretty obvious right away, but doing actual profiling is always best. I started writing library functions just to codify best practices on this stuff, as it's easy to forget if you leave it behind for a while.

jfree commented 3 years ago

I agree this looks really good, I need some time to test it out.

jfree commented 3 years ago

So far I've incorporated the RyuDouble conversion, removed the spaces from the path elements and switched to chars for StringBuilder method calls, and now only output defs if there is something to write. If I find some time I'll pick through some more items.