natverse / nat

NeuroAnatomy Toolbox: An R package for the (3D) visualisation and analysis of biological image data, especially tracings of single neurons.
https://natverse.org/nat/

reading neurons from `neuronlistfh` objects is very slow #402

Closed jefferis closed 5 years ago

jefferis commented 5 years ago

There is a new behaviour in the latest version of R which interacts very badly with the code that loads neurons from `neuronlistfh` objects: it forces a garbage collection after reading every single neuron. The relevant code in `[[.neuronlistfh` was always crufty. I'm not sure whether the external problem it works around (in the filehash package) is now fixed.

I think the definition of `showConnections` must have changed in R >= 3.6.0; it now starts with a call to `gc()`:

> showConnections
function (all = FALSE) 
{
    gc()
    set <- getAllConnections()
    if (!all) 
        set <- set[set > 2L]
    ans <- matrix("", length(set), 7L)
    for (i in seq_along(set)) ans[i, ] <- unlist(summary.connection(set[i]))
    rownames(ans) <- set
    colnames(ans) <- c("description", "class", "mode", "text", 
        "isopen", "can read", "can write")
    if (!all) 
        ans[ans[, 5L] == "opened", , drop = FALSE]
    else ans[, , drop = FALSE]
}
<bytecode: 0x7fe0f1afef58>
<environment: namespace:base>
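
A rough way to check this on a given R installation is to look for a `gc` call in the body of `showConnections` and to time it against `getAllConnections`. This is only a sketch: the repeat count `n` is arbitrary, and the timings will depend on the R version and on how much memory the session holds.

```r
# Does this R build's showConnections() contain a call to gc()?
# (TRUE for the definition printed above; may be FALSE on older R.)
calls_gc <- "gc" %in% all.names(body(showConnections))

# Time repeated calls to each function; n is an arbitrary illustrative count.
n <- 50L
t_show <- system.time(for (i in seq_len(n)) showConnections())[["elapsed"]]
t_all  <- system.time(for (i in seq_len(n)) getAllConnections())[["elapsed"]]

cat("showConnections() calls gc():", calls_gc, "\n")
cat(sprintf("%d calls: showConnections %.3fs, getAllConnections %.3fs\n",
            n, t_show, t_all))
```

If `calls_gc` is `TRUE`, every call pays for a full garbage collection, which is what makes calling it once per neuron so expensive.
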

jefferis commented 5 years ago

The original problem in filehash was addressed in this pull request: https://github.com/rdpeng/filehash/pull/3 . It looks like it was fixed 5 years ago, so depending on filehash >= 2.3 should be fine.
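
As a sketch of how that requirement could be checked at run time, the installed filehash version can be compared against 2.3. The helper name `filehash_has_fix` is hypothetical and not part of nat; in a package, the usual equivalent would be a versioned entry such as `filehash (>= 2.3)` in the `Imports` field of DESCRIPTION.

```r
# Hypothetical helper (not part of nat): TRUE if the installed filehash is at
# least min_version, FALSE if it is older, NA if filehash is not installed.
filehash_has_fix <- function(min_version = "2.3") {
  if (!requireNamespace("filehash", quietly = TRUE)) {
    return(NA)
  }
  utils::packageVersion("filehash") >= min_version
}

filehash_has_fix()
```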

jefferis commented 5 years ago

The problem in my code is here:

https://github.com/jefferis/nat/blob/f164b036ff755b187e81817f6a5e0498c4b3881e/R/neuronlistfh.R#L253-L276

Switching from `showConnections` to `getAllConnections` seems to work fine. But I now see no need to do this check at all, given the filehash fix.
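
For illustration, a connection count based on `getAllConnections` might look like the sketch below. It does not reproduce the linked nat code; the helper name `n_user_connections` is hypothetical, and it assumes the only goal is to count connections without triggering `gc()`.

```r
# Hypothetical helper (not from nat): count user connections without the
# gc() call that showConnections() makes. Connections 0, 1 and 2 are
# stdin, stdout and stderr, so they are excluded.
n_user_connections <- function() {
  sum(getAllConnections() > 2L)
}

before <- n_user_connections()
con <- textConnection("example")  # create one extra connection
during <- n_user_connections()
close(con)                        # closing a text connection destroys it
after <- n_user_connections()

c(before = before, during = during, after = after)
```

One difference to keep in mind: `showConnections()` with the default `all = FALSE` reports only opened connections, whereas `getAllConnections()` also returns connections that exist but are not open, so the two counts can differ.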