plotly / Kaleido

Fast static image export for web-based visualization libraries with zero dependencies

Kaleido is 50% slower than Orca #138

Open natbprice opened 2 years ago

natbprice commented 2 years ago

In my brief testing, Kaleido takes about 50% longer than Orca when saving a series of plots in a loop. Are there any benchmarks on the relative performance?

If this is a known issue, it would be helpful to mention it under "Disadvantages" in the main README.

If this should not be the case, then I would appreciate any suggestions on improving the performance of Kaleido. I am wondering if it could be something as simple as a difference in default inputs/settings between the two implementations or something related to the R/Python interface I am using.

I am exporting images from R on Windows 10.

R configuration: R 4.2, plotly 4.10.0

Python configuration: plotly 5.8.2, python-kaleido 0.2.1

Orca: orca 1.3.1


Edited: Add reproducible example

# Packages
library(plotly)
#> Loading required package: ggplot2
#> 
#> Attaching package: 'plotly'
#> The following object is masked from 'package:ggplot2':
#> 
#>     last_plot
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following object is masked from 'package:graphics':
#> 
#>     layout
library(rbenchmark)

# Sample plot
p <- plot_ly(z = ~volcano) %>% add_surface()

# Initialize
kaleido_scope <- kaleido()
orca_server <- orca_serve()
#> Warning: 'orca_serve' is deprecated.
#> Use 'kaleido' instead.
#> See help("Deprecated")

# Run benchmarks
benchmark(
  "kaleido" = {
    kaleido_scope$transform(p, "kaleido.png")
  },
  "orca" = {
    orca_server$export(p, "orca.png")
  },
  replications = 20,
  columns = c(
    "test",
    "replications",
    "elapsed",
    "relative",
    "user.self",
    "sys.self"
  )
)
#>      test replications elapsed relative user.self sys.self
#> 1 kaleido           20   24.44    1.584      0.44     0.05
#> 2    orca           20   15.43    1.000      0.40     0.03

# Shutdown
kaleido_scope$shutdown()
orca_server$close()
#> [1] TRUE

Created on 2022-06-10 by the reprex package (v2.0.1)
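
To put the table above in per-image terms, here is a small arithmetic sketch in Python. The two elapsed values and the replication count are copied from the benchmark output; the variable names are only illustrative.

```python
# Elapsed seconds for 20 replications, copied from the benchmark table above.
replications = 20
elapsed = {"kaleido": 24.44, "orca": 15.43}

# Average wall-clock time per exported image.
per_image = {name: total / replications for name, total in elapsed.items()}

# Fractional extra time Kaleido needs relative to Orca.
slowdown = elapsed["kaleido"] / elapsed["orca"] - 1

for name, seconds in per_image.items():
    print(f"{name}: {seconds:.2f} s per image")
print(f"kaleido takes {slowdown:.0%} longer than orca")
```

This matches the `relative` column (1.584): in this run Kaleido took roughly 58% longer, or about 1.22 s per image against 0.77 s for Orca.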

SterlingButters commented 1 year ago

The PyPI page boasts: "Kaleido starts up about twice as fast as Orca, and uses about half as much system memory" so I'm interested in this thread.

AbdealiLoKo commented 1 year ago

We are evaluating orca vs kaleido too. We have some prod environments using kaleido and some using orca right now, and we were planning to shift fully to kaleido for some of our older setups. kaleido is definitely a lot easier to install, but any input from the maintainers (@jonmmease ?) about the performance would be great.

b-a0 commented 7 months ago

I see a similar issue where kaleido is noticeably slower than orca in Python (but not 50%).

Here is my "benchmark", which I ran in VS Code as code cells:

import numpy as np
import plotly.express as px

N = 100

%%time
for idx in range(0, 100):
    x = np.random.random(N)
    y = np.random.random(N)
    fig = px.scatter(x=x, y=y)
    fig.write_image(f"../data/temp/image-orca-{idx:03d}.png", format="png", engine="orca")

CPU times: total: 3.62 s
Wall time: 8.82 s

%%time
for idx in range(0, 100):
    x = np.random.random(N)
    y = np.random.random(N)
    fig = px.scatter(x=x, y=y)
    fig.write_image(f"../data/temp/image-kaleido-{idx:03d}.png", format="png", engine="kaleido")

CPU times: total: 4.22 s
Wall time: 10.4 s

I pulled the data creation into the loop to ensure that every plot is unique and no caching is occurring.
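
A loop total like the ones above mixes any one-time startup cost with the steady per-image cost. A minimal sketch of a timer that reports the first call separately is below. It uses only the standard library; `time_calls` and the `export` callable are hypothetical names for illustration, not part of plotly, Kaleido, or Orca.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_calls(export: Callable[[int], None], n: int = 20) -> Dict[str, float]:
    """Time n calls of export(i) and report the first call separately.

    `export` is a hypothetical callable that writes image number i.
    """
    if n < 2:
        raise ValueError("n must be at least 2 to separate first and later calls")
    durations: List[float] = []
    for i in range(n):
        start = time.perf_counter()
        export(i)
        durations.append(time.perf_counter() - start)
    return {
        "first_call_s": durations[0],            # includes any engine startup
        "steady_mean_s": mean(durations[1:]),    # average of the remaining calls
        "total_s": sum(durations),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without plotly installed.
    print(time_calls(lambda i: time.sleep(0.001), n=5))
```

With plotly available, the stand-in could be replaced by something like `lambda i: fig.write_image(f"image-{i:03d}.png", engine="kaleido")` and run once per engine. If most of the gap sits in `first_call_s`, the difference is startup; if it sits in `steady_mean_s`, it is per-image rendering.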

My package versions are below: