The ubiquitous MPI environment on HPC clusters + Work Stealing Pattern + Distributed Termination Detection = Efficient and Scalable Parallel Solution.
pcircle contains a suite of file system tools that we are developing at OLCF
to take advantage of highly scalable parallel file systems such as Lustre and
GPFS. Early tests show very promising scaling properties. However, it is still
in active development, so please use it at your own risk. For bug reports and
feedback, please post at https://github.com/olcf/pcircle/issues.
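The work stealing idea named in the tagline can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not pcircle's implementation: `simulate_walk`, the `tree` dictionary, and the round-robin scheduling are hypothetical names and choices made for illustration, and the workers are plain in-memory queues rather than MPI ranks. Termination here is a simple global check that every queue is empty, which is valid only because nothing is ever in flight in a synchronous simulation; a real MPI program would need a distributed termination detection protocol instead.

```python
import random
from collections import deque


def simulate_walk(tree, root, num_workers=4, seed=0):
    """Walk `tree` (dict: directory -> list of children) with work stealing.

    Hypothetical helper for illustration only.
    Returns (visited_items, items_processed_per_worker).
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root)  # all work starts on worker 0
    visited = []
    counts = [0] * num_workers

    # Simplified termination check: stop when every queue is empty.
    # Safe here because no work item is ever "in flight" between workers.
    while any(queues):
        for rank in range(num_workers):
            q = queues[rank]
            if not q:
                # Idle worker: pick a random victim and steal half its queue.
                victim = rng.randrange(num_workers)
                vq = queues[victim]
                if victim != rank and len(vq) > 1:
                    for _ in range(len(vq) // 2):
                        q.append(vq.pop())  # take from the far end
                continue
            item = q.popleft()
            visited.append(item)
            counts[rank] += 1
            q.extend(tree.get(item, []))  # children become new work items
    return visited, counts


if __name__ == "__main__":
    demo_tree = {
        "/": ["/a", "/b", "/c"],
        "/a": ["/a/1", "/a/2"],
        "/b": ["/b/1"],
    }
    visited, counts = simulate_walk(demo_tree, "/")
    print("items visited:", len(visited))
    print("per-worker counts:", counts)
```

Each item lives in exactly one queue at a time, so it is visited exactly once no matter how the steals fall; only the split of work across workers depends on the random seed.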
To get started with a quick test run on macOS:
$ brew install pkg-config libffi openmpi python
$ pip2 install virtualenv
$ virtualenv ~/pcircle
$ source ~/pcircle/bin/activate
(pcircle) $ pip2 install git+https://github.com/olcf/pcircle@dev
To run a simple test:
$ mpirun -np 4 fprof ~
This also shows the core dependencies of pcircle: python, libffi, and openmpi. On Linux and similar systems, we also need their development packages (dev RPMs). For example:
sudo yum install openmpi-devel
sudo yum install libffi-devel
For CentOS/Red Hat, we have a prebuilt RPM. First enable the EPEL repository:
sudo yum install \
https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
Then install pcircle through its URL:
sudo yum install \
https://github.com/fwang2/pcircle-rpm/raw/master/RPMS/pcircle-0.17.1-1.el7.noarch.rpm
You can also build one from the SPEC file maintained in the fwang2/pcircle-rpm
repo.
Note: this is a bit out of date; -h shows the current options.
"FCP: A Fast and Scalable Data Copy Tool for High Performance Parallel File Systems", by F. Wang, V.G.V. Larrea, D. Leverman, S. Oral, at CUG'2015.
"A Bloom Filter Based Scalable Data Integrity Check Tool for Large-scale Dataset", by S. Xiong, F. Wang, and Q. Cao, at PDSW'2016.
"Diving into Petascale Production File Systems through Large Scale Profiling and Analysis", by F. Wang, H. Sim, C. Harr and S. Oral, at PDSW'2017.