r-lib / systemfonts

System Native Font Handling in R
https://systemfonts.r-lib.org

systemfonts::system_fonts crashes (segfault) when used in future #41

Open stefanoborini opened 4 years ago

stefanoborini commented 4 years ago

This code produces a hard segfault:

future::plan(future::multiprocess)

fut <- future::future({
  systemfonts::system_fonts()
})

r <- future::value(fut)
print(r)
[1] "Using environment default"

 *** caught segfault ***
address 0x110, cause 'memory not mapped'

Traceback:
 1: systemfonts::system_fonts()
 2: eval(quote({    systemfonts::system_fonts()}), new.env())
 3: eval(quote({    systemfonts::system_fonts()}), new.env())
 4: eval(expr, p)
 5: eval(expr, p)
 6: eval.parent(substitute(eval(quote(expr), envir)))
 7: local({    systemfonts::system_fonts()})
 8: doTryCatch(return(expr), name, parentenv, handler)
 9: tryCatchOne(expr, names, parentenv, handlers[[1L]])
10: tryCatchList(expr, classes, parentenv, handlers)
11: tryCatch({    ...future.value <- local({        systemfonts::system_fonts()    })    future::FutureResult(value = ...future.value, version = "1.8")}, error = function(cond) {    calls <- sys.calls()    structure(list(value = NULL, value2 = NA, condition = cond,         calls = calls, version = "1.8"), class = "FutureResult")}, finally = {    {        {            {                options(mc.cores = ...future.mc.cores.old)            }            future::plan(list(function (expr, envir = parent.frame(),                 substitute = TRUE, lazy = FALSE, seed = NULL,                 globals = TRUE, workers = availableCores(), gc = FALSE,                 earlySignal = FALSE, label = NULL, ...)             {                if (substitute)                   expr <- substitute(expr)                fun <- if (supportsMulticore())                   multicore                else multisession                fun(expr = expr, envir = envir, substitute = FALSE,                   lazy = lazy, seed = seed, globals = globals,                   workers = workers, gc = gc, earlySignal = earlySignal,                   label = label, ...)            }), .cleanup = FALSE, .init = FALSE)        }        options(...future.oldOptions)    }})
12: eval(expr, env)

R 3.6.0, systemfonts 0.2.3 on macos.

lldb backtrace

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x110)
  * frame #0: 0x00007fff6f5a2e47 libdispatch.dylib`_dispatch_mgr_queue_push + 41
    frame #1: 0x00007fff6f59ebab libdispatch.dylib`_dispatch_lane_resume_activate + 58
    frame #2: 0x00007fff44919d9a CarbonCore`connectToCoreServicesD() + 245
    frame #3: 0x00007fff44919c74 CarbonCore`getStatus() + 24
    frame #4: 0x00007fff44919bef CarbonCore`scCreateSystemServiceVersion + 49
    frame #5: 0x00007fff44919964 CarbonCore`FileIDTreeGetCachedPort + 213
    frame #6: 0x00007fff449197d0 CarbonCore`FSNodeStorageGetAndLockCurrentUniverse + 83
    frame #7: 0x00007fff4491965a CarbonCore`FileIDTreeGetAndLockVolumeEntryForDeviceID + 38
    frame #8: 0x00007fff449195b7 CarbonCore`FSMount::FSMount(unsigned int, FSMountNumberType, int*, unsigned int const*) + 75
    frame #9: 0x00007fff44919531 CarbonCore`FSMountPrepare + 69
    frame #10: 0x00007fff5784aa30 CoreServicesInternal`MountInfoPrepare(void***, unsigned int, int, void*, unsigned int const*, __CFURL const*, __CFError**) + 43
    frame #11: 0x00007fff5784a428 CoreServicesInternal`parseAttributeBuffer(__CFAllocator const*, unsigned char const*, unsigned char, attrlist const*, void const*, void**, _FileAttributes*, unsigned int*) + 3209
    frame #12: 0x00007fff57849491 CoreServicesInternal`corePropertyProviderPrepareValues(__CFURL const*, __FileCache*, __CFString const* const*, void const**, long, void const*, __CFError**) + 834
    frame #13: 0x00007fff578490ee CoreServicesInternal`prepareValuesForBitmap(__CFURL const*, __FileCache*, _FilePropertyBitmap*, __CFError**) + 360
    frame #14: 0x00007fff5784ca0f CoreServicesInternal`_FSURLCopyResourcePropertyValuesAndFlags + 581
    frame #15: 0x00007fff4368f24b CoreFoundation`_CFURLCopyResourcePropertyValuesAndFlags + 127
    frame #16: 0x00007fff41bb6236 libFontParser.dylib`FPPathGetCatalogValues(char const*, FInfo*, unsigned long long*, unsigned long long*) + 187
    frame #17: 0x00007fff41bb3ec5 libFontParser.dylib`TFont::CreateFontEntities(char const*, bool, bool&, short, char const*, bool) + 85
    frame #18: 0x00007fff41bb664a libFontParser.dylib`TFont::CreateFontEntitiesForFile(char const*, bool, bool, short, char const*) + 264
    frame #19: 0x00007fff41b6649e libFontParser.dylib`FPFontCreateFontsWithPath + 158
    frame #20: 0x00007fff43a983b7 CoreGraphics`create_private_data_array_with_path + 29
    frame #21: 0x00007fff43a980bf CoreGraphics`CGFontCreateFontsWithPath + 26
    frame #22: 0x00007fff43a97ce9 CoreGraphics`CGFontCreateFontsWithURL + 345
    frame #23: 0x00007fff4528642b CoreText`CreateFontsWithURL(__CFURL const*, bool) + 199
    frame #24: 0x00007fff45368a4c CoreText`CTFontManagerCreateFontDescriptorsFromURL + 37
    frame #25: 0x0000000107ab625a systemfonts.so`addFontIndex(FontDescriptor*) + 378
    frame #26: 0x0000000107ab6c80 systemfonts.so`createFontDescriptor(__CTFontDescriptor const*) + 992
    frame #27: 0x0000000107ab72a8 systemfonts.so`getAvailableFonts() + 520
    frame #28: 0x0000000107aa6103 systemfonts.so`system_fonts() + 467

stefanoborini commented 4 years ago

macOS 10.14.6

thomasp85 commented 4 years ago

Hmm... it seems to be happening deep inside CoreText so I'm unsure how to approach this. Maybe CoreText is simply not safe to run from forked processes. @HenrikBengtsson would you know anything about this?

stefanoborini commented 4 years ago

Just for the record, it does not seem to happen on Linux. It does indeed seem to be a macOS-specific problem.

stefanoborini commented 4 years ago

It also does not crash on macOS if I use multisession instead of multiprocess. It does crash with multicore.

thomasp85 commented 4 years ago

Thanks for confirming that this is indeed something to do with calling into CoreText from forked processes. Really not sure what to do about this 😕

HenrikBengtsson commented 4 years ago

Does the following also segfault on macOS?

> f <- parallel::mcparallel(systemfonts::system_fonts())
> v <- parallel::mccollect(f)

If so, that's a minimal reproducible example using forked processing. I don't have access to macOS, so I cannot test. It works on Ubuntu.

stefanoborini commented 4 years ago

@HenrikBengtsson Crashes immediately after the first line is executed

> f <- parallel::mcparallel(systemfonts::system_fonts())
> 
 *** caught segfault ***
address 0x110, cause 'memory not mapped'

Traceback:
 1: systemfonts::system_fonts()
 2: eval(expr, env)
 3: doTryCatch(return(expr), name, parentenv, handler)
 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 5: tryCatchList(expr, classes, parentenv, handlers)
 6: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        sm <- strsplit(conditionMessage(e), "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && isTRUE(getOption("show.error.messages"))) {        cat(msg, file = outFile)        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
 7: try(eval(expr, env), silent = TRUE)
 8: sendMaster(try(eval(expr, env), silent = TRUE))
 9: parallel::mcparallel(systemfonts::system_fonts())
An irrecoverable exception occurred. R is aborting now ...

HenrikBengtsson commented 4 years ago

Interesting. Not that it would be a solution, but it would be interesting to know if preloading {systemfonts} as in:

v0 <- systemfonts::system_fonts()
f <- parallel::mcparallel(systemfonts::system_fonts())
v <- parallel::mccollect(f)

behaves differently, or possibly even works.

stefanoborini commented 4 years ago

@HenrikBengtsson yes, that works. Interesting!

HenrikBengtsson commented 4 years ago

Yes, surely interesting - and one of the obscure properties of forked parallel processing in R (which certainly is not stable).

If so, I guess that:

loadNamespace("systemfonts")
f <- parallel::mcparallel(systemfonts::system_fonts())
v <- parallel::mccollect(f)

might also work, because that also triggers:

> systemfonts:::.onLoad
function (...) 
{
    .Call("sf_init_c", asNamespace("systemfonts"))
    load_emoji_codes()
}

One hypothesis could be that you trigger the necessary initialization in the main R session, which is then inherited by the forked child processes, so it just works there. In contrast, if you initialize in the child process, then that is lost when the fork is terminated leaving you with an object with broken references/pointers. That's just a guess.

I would definitely not rely on this to work. It could just be that it happens to work in this simple example but if you "hit it hard enough" the problem will show up again elsewhere. The problem needs to be well understood in order to consider it solved.

thomasp85 commented 4 years ago

systemfonts does some caching, which means that the forked process inherits the cache and never calls into CoreText. My hypothesis is still that it is simply not safe to call into CoreText from a forked process. There is nothing in the systemfonts code that should elicit this behaviour (I think).
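
The cache-inheritance explanation above can be illustrated with a toy example using only base R and the parallel package. This is a sketch under stated assumptions, not systemfonts' actual code: `cache` and `cached_init_pid()` are hypothetical stand-ins for a package-level cache, and forking via `mcparallel()` is only available on Unix-alikes, so the demonstration is skipped elsewhere.

```r
# Toy stand-in for a package-level cache; not systemfonts' actual code.
cache <- new.env()

cached_init_pid <- function() {
  # Pretend this is an expensive initialisation; record which process ran it.
  if (is.null(cache$pid)) cache$pid <- Sys.getpid()
  cache$pid
}

inherited <- NA
if (.Platform$OS.type == "unix") {
  parent_pid <- cached_init_pid()                  # warm the cache in the parent
  job <- parallel::mcparallel(cached_init_pid())   # evaluate in a forked child
  child_pid <- parallel::mccollect(job)[[1L]]
  # The child reports the parent's PID: it inherited the warm cache
  # and never re-ran the initialisation itself.
  inherited <- identical(child_pid, parent_pid)
}
```

If the cache were not warmed in the parent first, the child would run the initialisation itself, which is the situation in which the original crash was reported.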

stefanoborini commented 4 years ago

For the record, I tried to work around the issue in my application by adding the systemfonts call earlier in the calling process, and I get a different segfault later:

 *** caught segfault ***
address 0x110, cause 'memory not mapped'

Traceback:
 1: m_str_extents_(x, fontname, fontsize, bold, italic, fontfile)
 2: m_str_extents(txt_data$txt, fontname = txt_data$font.family,     fontsize = fontsize, bold = txt_data$bold, italic = txt_data$italic)
 3: text_metric(x)
 4: optimal_sizes(x[[j]])
 5: dim_pretty(x, part = parts)
 6: flextable::autofit(.)
 7: function_list[[i]](value)
 8: freduce(value, `_function_list`)
 9: `_fseq`(`_lhs`)
10: eval(quote(`_fseq`(`_lhs`)), env, env)
11: eval(quote(`_fseq`(`_lhs`)), env, env)
12: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env)

HenrikBengtsson commented 4 years ago

Thanks for these follow-ups - they're helpful for this package but also as a reference/example to others elsewhere.

@thomasp85, I don't know how it can be done, but if we could come up with a way for a package to detect when it runs in a forked process, then we might be able to give a run-time error rather than random crashes. The next level up would be to declare whether a package, or a specific function, can be forked or not, e.g. Forkable: false in DESCRIPTION.

HenrikBengtsson commented 4 years ago

Oh, I forgot that we might be able to detect whether we're in a fork or not via parallel:::isChild(), cf. https://github.com/HenrikBengtsson/future/issues/224#issuecomment-573345273
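
A minimal sketch of what such a run-time check might look like, assuming `parallel:::isChild()` behaves as described (it is an internal, unexported function that exists only on Unix-alikes and only knows about children forked by the parallel package itself). The helper names `is_forked_child()` and `stop_if_forked()` are hypothetical and not part of any package.

```r
# Hypothetical helper: TRUE only inside a child forked by the parallel package.
is_forked_child <- function() {
  ns <- asNamespace("parallel")
  # isChild() is internal and not defined on Windows, so look it up defensively.
  if (!exists("isChild", envir = ns, inherits = FALSE)) return(FALSE)
  isTRUE(get("isChild", envir = ns, inherits = FALSE)())
}

# Hypothetical guard: raise an ordinary R error instead of risking a segfault.
stop_if_forked <- function(what) {
  if (is_forked_child()) {
    stop(what, " is not safe to call from a forked process; ",
         "consider a non-forking backend such as multisession.",
         call. = FALSE)
  }
  invisible(TRUE)
}
```

A function known to be fork-unsafe could call `stop_if_forked("system_fonts()")` as its first statement. Because the check relies on an internal function, it could break in a future R release and would not detect forks created by other means.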

stefanoborini commented 4 years ago

I don't think it's metadata worthy. I suspect it's a bug somewhere, possibly even in the Core libraries. It feels kind of weird that you can't perform the operation in a forked process. After all, all processes are forked.

HenrikBengtsson commented 4 years ago

I'd say it is "metadata worthy" because this is not the only case where code/packages/libraries cannot be forked. For example, there are various examples where multi-threading combined with forked processing wreaks havoc.

Forked processing is really unstable and I and others often advise against it because of the risk(*) that comes with it. Here is what the author of mclapply() wrote in R-devel thread 'mclapply returns NULLs on MacOS when running GAM' (https://stat.ethz.ch/pipermail/r-devel/2020-April/079384.html) on 2020-04-28:

Do NOT use mcparallel() in packages except as a non-default option that user can set for the reasons Henrik explained. Multicore is intended for HPC applications that need to use many cores for computing-heavy jobs, but it does not play well with RStudio and more importantly you don't know the resource available so only the user can tell you when it's safe to use. Multi-core machines are often shared so using all detected cores is a very bad idea. The user should be able to explicitly enable it, but it should not be enabled by default.

(*) ... and maintenance costs - this issue being just one of many examples.

stefanoborini commented 4 years ago

If you add metadata about it, it needs to be platform dependent. In this case, it works well on both linux and (probably) windows. It's just not working on macOS, and metainfo cannot know how you are going to use the library you install anyway.

As far as I am concerned, the main problem I am facing is that there's no other way to generate reports using flextable and officer under shiny, but to use a future with subprocess. That's how I found the issue. If you don't use a future, it will lock the shiny instance for all users (not only the current user, all of them).

thomasp85 commented 4 years ago

@stefanoborini the second segfault you reported doesn't seem to be related to systemfonts, correct?

You can always use multisession to avoid forking, but I agree that this is all quite annoying
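
To illustrate the difference between forking and running in a separate background R session, here is a base-R sketch that uses a PSOCK cluster from the parallel package rather than the future API. The helper `run_in_fresh_session()` is hypothetical. Like multisession, it starts a new R process from scratch, so nothing is inherited from the parent via fork; it does need a local socket connection to talk to the worker.

```r
# Hypothetical helper: evaluate a function in a freshly started background
# R process (not a fork of the current one), then shut the worker down.
run_in_fresh_session <- function(fun, ...) {
  cl <- parallel::makePSOCKcluster(1L)
  on.exit(parallel::stopCluster(cl), add = TRUE)
  parallel::clusterCall(cl, fun, ...)[[1L]]
}

# The worker is a different process with its own PID.
worker_pid <- run_in_fresh_session(Sys.getpid)
is_separate_process <- worker_pid != Sys.getpid()
```

Assuming systemfonts is installed for the worker as well, the failing call could be tried the same way with `run_in_fresh_session(function() systemfonts::system_fonts())`.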

stefanoborini commented 4 years ago

@thomasp85 it was generated by a later call after I tried to force the initialisation as suggested by @HenrikBengtsson, but apparently it then fails later in other calls.

I am not sure I can use multisession in shiny unfortunately, but I haven't checked, and I am not a big expert in R multithreaded execution (or a big expert in R, I'm mostly a python programmer)

thomasp85 commented 4 years ago

But the traceback does not seem to indicate that the segfault has anything to do with systemfonts, so maybe you stumbled on yet another issue with forking in another package...?

HenrikBengtsson commented 4 years ago

... it works well on both linux and (probably) windows

Do we actually know that, or could it be that we just haven't hit the problem there?

I am not sure I can use multisession in shiny unfortunately, but I haven't checked, and I am not a big expert in R multithreaded execution (or a big expert in R, I'm mostly a python programmer)

Not sure why you can't use 'multisession', which parallelizes via R processes running in the background. It might be that the firewall doesn't allow you to open the ports needed to communicate with the workers. Regardless, you can also try:

plan(future.callr::callr)

which works very similarly to the multisession backend (but doesn't require ports). It's built on top of the callr package.

PS. The term 'multi-threading' is different from 'multi-processing'. R does not support multi-threading, which takes place at a very low level orchestrated by the operating system. Formally, we say that threads belong to and can only run within a process. Almost all parallelization we do in R is done by running multiple processes. There are many packages that implement multi-threading, e.g. data.table, but that is done in native code.

stefanoborini commented 4 years ago

@thomasp85 I don't know. I would have to do some more tracing, but since the call is about font metrics I suspect that it's using the same structures returned by systemfonts, and that's where it barfs.

@HenrikBengtsson I tried on Linux and I had no problem at all. It works well. I don't have a Windows machine to test. So far, the only platform that seems to be affected is macOS.

I'll try more tomorrow, now it's late. Thank you both for your help.