Jiefei-Wang / SharedObject

Sharing R objects across multiple R processes without duplicating the object in memory

error in recording a shared memory info #6

Closed alorchhota closed 4 years ago

alorchhota commented 4 years ago

I was working with the SharedObject package and it was working just fine. Suddenly, it stopped working. Now I cannot even create a shared object with the share() function.

> library(SharedObject)
> mat <- matrix(0, nrow = 3, ncol = 3)
> shared_mat <- share(mat, copyOnWrite = F)
Error in C_createSharedMemory(x, dataInfo) : 
  error in recording a shared memory info

Any idea why this might be happening? Here is the sessionInfo():

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /software/apps/R/3.6.1/gcc/5.5.0/lib64/R/lib/libRblas.so
LAPACK: /software/apps/R/3.6.1/gcc/5.5.0/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] SharedObject_1.0.0

loaded via a namespace (and not attached):
[1] compiler_3.6.1      parallel_3.6.1      Rcpp_1.0.5         
[4] xptr_1.1.3          BiocGenerics_0.32.0
Jiefei-Wang commented 4 years ago

Hi @alorchhota ,

This is an error thrown by the boost library. My guess is that you may have exhausted your shared memory, which is different from your physical memory. I am not familiar with your OS, but on Ubuntu its size can be checked by

> df
Filesystem     1K-blocks      Used Available Use% Mounted on
...
tmpfs            4194304     44396   4149908   2% /dev/shm

I am not sure if your shared memory is mounted at the same location; please refer to your OS manual for that.
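
If it is more convenient, the same check can be run from within an R session. This is just a sketch that assumes your shared memory is mounted at /dev/shm, which may not be true on your cluster:

## Assumes the shared memory is mounted at /dev/shm
> system("df -k /dev/shm")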

PS1: I see you are using the initial version (1.0.0) of SharedObject. Please consider updating it to the latest release: the initial version is known to have some memory issues, and the package has changed substantially since then. The problem might already be fixed, or at least you will get a more informative error.
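
For example, since SharedObject is distributed through Bioconductor, one way to update it is via BiocManager (assuming you are able to install packages on your system):

## Update SharedObject from Bioconductor
> if (!requireNamespace("BiocManager", quietly = TRUE))
+     install.packages("BiocManager")
> BiocManager::install("SharedObject")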

PS2: If you do not want to upgrade the package, rebooting your system should solve most problems since it will clear up all shared memory.

Best, Jiefei

alorchhota commented 4 years ago

Hi Jiefei, thanks for your response. I was actually working on a compute node in a cluster, and I don't have the rights to restart the node myself. I am sharing a selected portion of the df output here:

$ df
Filesystem                                           1K-blocks          Used    Available Use% Mounted on
storage0008-ib.bluecrab.cluster:/s8p3/abattle4    107374182400  103710480384   3663702016  97% /work-zfs/abattle4
bc-software.bluecrab.cluster:/exports/d1/apps       1031988224     789645312    195120128  81% /software/apps
bc-software.bluecrab.cluster:/exports/d1/centos7    1031988224     789645312    195120128  81% /software/centos7

I have write permission only in /work-zfs/abattle4, not in /software/apps or /software/centos7. Any idea how to free up this shared memory? Can I do it myself, or do I have to ask the admin?

PS: I will try a newer version of SharedObject.

alorchhota commented 4 years ago

By the way, here is the OS information:

$ lsb_release -a
LSB Version:    
Distributor ID: CentOS
Description:    CentOS Linux release 7.6.1810 (Core) 
Release:    7.6.1810
Codename:   Core

$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
alorchhota commented 4 years ago

Hi @Jiefei-Wang, I can create a shared object using a different R installation: R 4.0 works, but R 3.6 does not. If I had exhausted the shared memory, shouldn't I expect the same error from both installations?

Jiefei-Wang commented 4 years ago

Hi @alorchhota ,

For your question:

(1) Any idea how to free up this shared memory? There are some internal functions for that purpose. If the package data is still in good shape, you can manually release the memory by

> x <- share(1:10)
> getDataInfo()
            dataId processId typeId length totalSize copyOnWrite sharedSubset
1 7340785861132288     30245      2     10        40           1            1
  sharedCopy
1          0
> sapply(getDataInfo()$dataId, .removeObject)
$`7340785861132288`
NULL

Note that these functions are specific to 1.0.0. My only concern is that you may have corrupted the package's data structure, in which case this option will not be available to you, but at least you can give it a shot.
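
Purely as an illustration, the cleanup above could be wrapped in a slightly more defensive helper, assuming the 1.0.0 internals getDataInfo() and .removeObject() are still usable (cleanupSharedData is a made-up name, not a package function):

## Hypothetical helper around the 1.0.0 internals shown above
cleanupSharedData <- function() {
    ids <- getDataInfo()$dataId
    for (id in ids) {
        ## Keep going even if one record is corrupted
        tryCatch(.removeObject(id),
                 error = function(e) message("failed to remove ", id))
    }
    invisible(length(ids))
}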

If you know how to use the boost library in C++, the following code can help you completely clear the shared memory that is reserved by the package (again, for version 1.0.0 only):

#include <boost/interprocess/shared_memory_object.hpp>

int main() {
    boost::interprocess::shared_memory_object::remove("sharedObjectCounter");
    boost::interprocess::shared_memory_object::remove("shared_object_package_spaceX64");
}
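
(Depending on your Boost version and platform, you may also need to link against librt, e.g. with -lrt on Linux, for the shared memory calls to resolve.)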

(2) R 4.0 works, but not R 3.6

Are you using the same version of SharedObject? As I said, there have been a lot of changes since 1.0.0. An error you see in one version may not be observable in another (and the bug may not be a memory issue). Please use the latest version; it is not worth debugging the old package.

Please let me know if you have more questions.

Best, Jiefei

alorchhota commented 4 years ago

Thanks @Jiefei-Wang! sapply(getDataInfo()$dataId, .removeObject) worked! In fact, getDataInfo() returned a data.frame with 104,853 rows. After removing each entry, I could create shared objects again.

Now I am using SharedObject v1.2.2 (with R 4.0). So far, it is working fine for me. But could I face similar issues in the future with v1.2.2? It seems objects were not removed in v1.0.0. What should I do to avoid such unreleased memory in v1.2.2? Is there any way to check whether there are any unremoved objects?

Jiefei-Wang commented 4 years ago

Glad to hear it. It seems you did not exhaust the shared memory itself; you merely exhausted the package's internal shared data, which occupies about 1MB of shared memory. That is why you got the complaint from boost: it could not add more object records to the package's internal vector (this behavior is specific to version 1.0.0, which is why you would not see the same error in 1.2.2). Please do not use shared memory this way; the package is designed for sharing a few large objects, not tons of small objects.

For the latest version, there is a set of developer APIs for this:

> library(SharedObject)
> x <- share(1:10)
> listSharedObject()
  Id size
1  1   40
> freeSharedMemory(1)
[1] TRUE
> listSharedObject()
[1] Id   size
<0 rows> (or 0-length row.names)

In most cases, the package will release the memory for you. Please use these functions only when you encounter a problem similar to the one in version 1.0.0.
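
For example, if a crashed session ever leaves records behind, a sketch like the following, built only on the listSharedObject()/freeSharedMemory() APIs above, would release everything the package still tracks (use with care, as it frees all recorded objects):

## Free every shared object the package still has a record of
> leftovers <- listSharedObject()
> for (id in leftovers$Id) freeSharedMemory(id)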

JFYI, if you are comfortable with calling C++ functions from R, you can actually use the developer APIs to get low-level control over the shared memory:

## Allocate a shared memory segment of 1024 bytes
## Returns the id of the shared memory
> allocateSharedMemory(1024)
[1] 2
## Map it into your current process
## You can call this function from any R process
## Returns a pointer which you can operate on in C++
> mapSharedMemory(2)
<pointer: 0x7fb39911b000>
## After using the shared memory, unmap and free it
> unmapSharedMemory(2)
[1] TRUE
> freeSharedMemory(2)
[1] TRUE
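
And for completeness, here is a minimal sketch of the high-level workflow the package is designed around: sharing one object across worker processes. It assumes the parallel package and that exporting a shared object attaches the same memory in each worker instead of copying the data:

## Share a single matrix with two worker processes
library(parallel)
library(SharedObject)
cl <- makeCluster(2)
shared_mat <- share(matrix(0, nrow = 3, ncol = 3))
clusterExport(cl, "shared_mat")
## Each worker reads from the same underlying shared memory
parSapply(cl, 1:2, function(i) sum(shared_mat))
stopCluster(cl)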

Best, Jiefei

alorchhota commented 4 years ago

Thank you so much for the functions in the developer API. These would be helpful.

Though I no longer use v1.0.0, it did not release the memory properly, even after closing the R session. I figured out two reasons for such a big number of objects: 1) sharing a data frame with ~1k columns, and 2) running the script hundreds of times. Luckily, in the latest release the memory gets released once the R session is over, so it is no longer a problem.

Thanks for your quick response!

Jiefei-Wang commented 4 years ago

You are welcome! Please feel free to open an issue if you have more questions.

Best, Jiefei