Closed timelyportfolio closed 8 years ago
I'm really not surprised that the performance isn't as good.
Is it bad enough to revert back to svglite in inlineSVG? It is far worse than I expected, but it is always nice not to create a file.
Probably it's because insertion into the string stream requires constructing new strings repeatedly?
@yixuan it seems to be copying the string from C++ to R that's slow. If I rewrite flush() to be:
void flush() {
  stream_.flush();
  std::string x = stream_.str();
  // env_["svg_string"] = x;
}
I get
Unit: milliseconds
expr              min   lq    mean   median  uq     max   neval  cld
svg_lite_fun()    9.52  9.88  11.07  10.14   13.19  14.0  10     b
svg_string_fun()  6.71  6.80  8.33   8.08    9.86   10.3  10     a
If I uncomment the last line I get:
Unit: milliseconds
expr              min     lq     mean   median  uq     max  neval  cld
svg_lite_fun()    9.63    11.2   12.4   11.5    11.8   22   10     a
svg_string_fun()  330.97  336.7  371.3  349.1   431.0  453  10     b
If I construct the STRSXP directly by hand with Rf_mkCharLenCE(&x[0], x.length(), CE_UTF8), I get the same performance. It's probably related to R's global string pool - hashing that giant string takes some time.
This is a good catch. Thanks Hadley.
One possible way to solve this is to cache the SVG string in C++ rather than in R, and we only do the copy when the string is requested in R. Some pseudo code may look like
svgstring <- function(width = 10, height = 8, bg = "white",
                      pointsize = 12, standalone = TRUE) {
  env <- new.env(parent = emptyenv())
  svgstring_(env, width = width, height = height, bg = bg,
             pointsize = pointsize, standalone = standalone)
  function() {
    .Call("get_string_from_cpp", env$svg_string_ptr)
  }
}
where env$svg_string_ptr saves the address of the cached C++ string. And in C++
class SvgStreamString : public SvgStream {
  std::stringstream stream_;
  Rcpp::Environment env_;
  std::string cached_string;

public:
  SvgStreamString(Rcpp::Environment env): env_(env) {
    stream_ << std::fixed << std::setprecision(2);
    env_["svg_string_ptr"] = somehow_return_the_pointer(&cached_string);
  }

  void flush() {
    stream_.flush();
    cached_string = stream_.str() + "</svg>";
  }
};
This approach requires taking care of some other things of course, for example returning the proper string when the device is closed.
I agree that that's a better interface - although it would be even better to return an Xptr from svgstring_, and then wrap with an accessor function. Then flush() wouldn't have to do anything - you'd call a special method from the accessor.
I didn't think it would solve the performance issue, but I guess flush() gets called multiple times so it might be a lot better.
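To make the point about repeated flush() calls concrete, here is a rough sketch in plain C++ (no Rcpp) that counts how many bytes get copied out of the stream under each strategy. The function names and the assumption that flush() runs once per written fragment are illustrative only; the thread just says flush() is called multiple times.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: total bytes copied when the whole buffer is
// copied out after every flush(), as the original flush() did.
std::size_t bytes_copied_eager(std::size_t fragments, std::size_t fragment_size) {
  std::stringstream stream;
  const std::string fragment(fragment_size, 'x');
  std::size_t total = 0;
  for (std::size_t i = 0; i < fragments; ++i) {
    stream << fragment;
    stream.flush();
    total += stream.str().size();  // full copy of everything written so far
  }
  return total;
}

// Hypothetical helper: total bytes copied when the buffer is only
// copied once, at the moment the string is requested.
std::size_t bytes_copied_lazy(std::size_t fragments, std::size_t fragment_size) {
  std::stringstream stream;
  const std::string fragment(fragment_size, 'x');
  for (std::size_t i = 0; i < fragments; ++i) {
    stream << fragment;
    stream.flush();  // no copy here
  }
  return stream.str().size();  // single copy at the end
}
```

Under these assumptions the eager total grows quadratically with the number of flushes, while the lazy total is just the final length. This only models the C++ side copy; any cost on the R side of creating the string would be paid on top of that for each copy.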
Oh yeah, you are right. flush() can be empty and we can directly return the string from stream_.
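A minimal sketch of that design in plain C++, leaving Rcpp out so it compiles on its own. The class name LazySvgStream and the methods write(), str() and copies() are hypothetical and are not the svglite implementation; in the package the accessor would presumably be reached from R through the external pointer discussed above.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical stand-in for SvgStreamString, without the Rcpp parts.
class LazySvgStream {
  std::stringstream stream_;
  int copies_ = 0;  // how many times the buffer has been copied out

public:
  LazySvgStream() {
    stream_ << std::fixed << std::setprecision(2);
  }

  // Append SVG output to the in-memory buffer.
  void write(const std::string& fragment) { stream_ << fragment; }
  void write(double value) { stream_ << value; }

  // flush() does not copy anything, so frequent calls stay cheap.
  void flush() { stream_.flush(); }

  // Copy the buffer only when it is requested, appending the closing
  // tag so the result is a complete document.
  std::string str() {
    ++copies_;
    return stream_.str() + "</svg>";
  }

  int copies() const { return copies_; }
};
```

With this shape, any number of flush() calls costs no string copy, and the copy into R would happen once per call to the accessor rather than once per flush.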
Do you want to have a go at a PR? Do you get what I mean about the external pointer?
Yeah I think so. I can have a try maybe later this week.
I decided to test for speed differences between svglite and svgstring. I was surprised that svglite actually is significantly quicker on my machine. I'm wondering now if we should revert inlineSVG back to svglite.
Changing to runif(1000) and using microbenchmark, I get the following.