DillonHammill / CytoExploreR

Interactive Cytometry Data Analysis

Error in fcs_to_cytoset and HDF5: infinite loop closing library #82

Closed Sithara85 closed 3 years ago

Sithara85 commented 4 years ago

Hi,

I am running a flow cytometry analysis using openCyto and several flow cytometry packages from RGLab. We are processing ~10,000 samples. I get the error below when I reach ~3,100 samples. What could be the problem? I am not very familiar with HDF5 library issues. Is this something you can help me with?

Mike has been working with me to resolve some of the issues with CytoML and docker image issues I had when I developed the Rscript. I am using a PBS script to run Rscript and Singularity function to save gatingset and wsp files.

Thank you, Sithara

DillonHammill commented 4 years ago

@Sithara85, wow that is a lot of samples! I don't think that this is an issue that I can help you with, but are you able to provide the error message/traceback so that I can check if it is related to CytoExploreR?

If it is not related to CytoExploreR, the cytoverse team will be able to help with HDF5 library issues. There are also docker images available for CytoExploreR which contain all the cytoverse packages - maybe this will be of use to you? The only thing is that CytoML will probably not be configured properly, as it requires an additional docker image as you described. Perhaps the cytoverse team has an image where this is configured correctly?

Once we have sorted out the loading issue, I am happy to provide support on the CytoExploreR side of things so you can avoid some of the bottlenecks that may cause the software to crash with such a huge dataset.

DillonHammill commented 3 years ago

Closing this issue as it has been lying dormant for a while now. Feel free to open a new issue if you need any additional help.

malcook commented 4 weeks ago

Having just now resolved this same issue for myself, I thought I'd share my solution here on the chance that others similarly afflicted might benefit.

In my case, the root cause was TMPDIR not being large enough to hold all the temporary HDF5 files created by flowWorkspace operations.

It can be challenging to witness this since R contrives to clean up its TMPDIR upon exit. As part of your diagnostics, you might choose to monitor its usage while your workflow is running. On Linux, you can see this updated every 2 seconds with:

watch df -hl -t tmpfs
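The command above only lists tmpfs mounts. If your TMPDIR lives on some other filesystem, a variant along these lines may be more useful. It is a minimal sketch, not something from the original report: the helper name `tmp_free_kb` is made up for illustration, and the fallback to /tmp when TMPDIR is unset is an assumption about your setup.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in KiB, on the filesystem
# holding a directory. Defaults to $TMPDIR, or /tmp when TMPDIR is unset.
tmp_free_kb() {
  # -P forces one line per filesystem; column 4 is the available space.
  df -Pk "${1:-${TMPDIR:-/tmp}}" | awk 'NR == 2 { print $4 }'
}

# Report once; wrap in a loop or `watch` to follow it during a run.
echo "free KiB in ${TMPDIR:-/tmp}: $(tmp_free_kb)"

# Size of any R session temp directories (R names them Rtmp*) present now.
du -sk "${TMPDIR:-/tmp}"/Rtmp* 2>/dev/null || true
```

Running this periodically while the workflow is active should show the free space shrinking as temporary files accumulate, if TMPDIR is indeed the bottleneck.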

Assuming this is in fact your problem, the solution-space can be multifold: