Closed shlomiya closed 6 years ago
This is the dmtcp.spec file (renamed here to dmtcp.spec.txt) that was used in the latest Fedora release. Note that DMTCP has since advanced its stable version to 2.5.2.
Is there a special reason the package is not part of EPEL-7? I see it is available for EPEL-6 but not for 7? Just curious as that is one of the same platforms OpenHPC is targeting and exactly that platform is missing.
We were quite loaded and didn't get to it, and there wasn't an urgent need for it. If this is required for this project then we can push for an EPEL-7 version.
It is not a requirement. I was just curious as DMTCP seems to have no dependency on any OpenHPC provided packages and getting it into EPEL-7 would open it up to many more users. Especially as it is already part of EPEL-6 it seems like a small step to get it into EPEL-7.
Anyway, I will have a look at the spec file and provide some feedback.
I am still unsure if it makes sense for DMTCP to be part of OpenHPC but the spec file needs a few changes:
OpenHPC packages usually do not differentiate between the main package and a devel package.
With those changes the spec file would match the existing spec files and be ready for inclusion.
I've updated the spec files as you've suggested. Regarding necessary module files, I'm not sure what's missing from the spec file.
Regarding whether it makes sense for DMTCP to be part of OpenHPC: we would have thought that would be obvious. The rationale is that many HPC jobs run for long periods, and need the ability to transparently checkpoint and restart in case of a crash.
DMTCP transparently supports MPI as well as single-host jobs. It's robust and highly scalable. See:
"System-level Scalable Checkpoint-Restart for Petascale Computing" (ICPADS'16), showing scalability to 32,000 cores.
DMTCP already has many users in HPC. If this is not convincing, could you let us know what concerns you have about DMTCP's suitability for HPC? Thanks!
Thanks for the spec file, but it fails in %prep during %setup
As DMTCP has no OpenHPC dependency and as it already has an EPEL-6 branch, I think it would potentially reach more people by having an EPEL-7 branch. This is not about DMTCP's functionality; I think that it would just get a bigger possible user base by being part of EPEL-7. I am just trying to understand why OpenHPC instead of EPEL-7? But that is of course totally up to you.
Did you try to build an RPM with the attached spec file?
DMTCP is currently in the CentOS 7 EPEL Testing repository: https://centos.pkgs.org/7/epel-testing-x86_64/dmtcp-2.5.2-1.el7.x86_64.rpm.html
Nice. I am adding a link to the actual update request:
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2017-9dce83b52a
So, what does that mean for this submission?
@shlomiya do you still want to submit it for OpenHPC now that it is available in EPEL-7 stable branch?
After your previous comments, we decided to make an effort to push it to EPEL-7. I'm a bit unclear regarding the relation between OpenHPC and EPEL-7. Is OpenHPC taking all the packages from EPEL-7, hence DMTCP will be included automatically? If not, we definitely still want to submit it to OpenHPC, and please let me know what are the changes that are required based on the EPEL-7 package.
Some OpenHPC packages already depend on EPEL during runtime and a very small number of packages (one if I remember correctly) depend on EPEL during build-time. So in practice I expect that almost all OpenHPC deployments based on CentOS also have EPEL enabled. Looking at one of the install recipes (https://github.com/openhpc/ohpc/releases/download/v1.3.3.GA/Install_guide-CentOS7-Warewulf-SLURM-1.3.3-x86_64.pdf) it actually says:
"The public EPEL repository will be enabled automatically upon installation of the ohpc-release package."
So, yes, for CentOS-based deployments EPEL packages are available.
@shlomiya, as Adrian mentioned, we do already rely on EPEL as a dependency for installs on CentOS. So, now that you have successfully landed DMTCP in EPEL-7, that should extend exposure to OpenHPC as well. Consequently, the TSC review panel did not see a need to duplicate the effort given that DMTCP is not dependent on any specific MPI family.
Software Name
DMTCP
Public URL
http://dmtcp.sourceforge.net
Technical Overview
DMTCP (Distributed MultiThreaded Checkpointing) transparently checkpoints a single-host or distributed computation in user-space -- with no modifications to user code or to the O/S. It works on most Linux applications. This checkpointing capability introduces new functionality. Some MPI implementations (e.g., MVAPICH) include an MPI-specific checkpoint-restart service that disconnects the network, uses BLCR for single-host checkpointing, and then re-connects the network. In contrast, DMTCP provides distributed checkpointing at the lower POSIX system call layer, and is independent of the particular MPI implementation being used.
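For readers unfamiliar with the workflow, here is a rough sketch of what transparent checkpointing looks like from the command line. It assumes a DMTCP 2.x install with `dmtcp_launch`, `dmtcp_command` and `dmtcp_restart` on the PATH. The function name `dmtcp_checkpoint_demo`, the `sleep 60` stand-in job and the temporary checkpoint directory are illustrative only and are not part of DMTCP. The two-second wait is a crude assumption about how long the job takes to register with the coordinator.

```shell
# Hypothetical helper, not part of DMTCP: checkpoint a short-lived job once.
dmtcp_checkpoint_demo() {
    # Skip cleanly when DMTCP is not installed on this machine.
    if ! command -v dmtcp_launch >/dev/null 2>&1; then
        echo "DMTCP not installed; skipping demo"
        return 0
    fi

    # Illustrative location for the checkpoint images.
    ckpt_dir=$(mktemp -d) || return 1

    # Run an unmodified program under DMTCP; images are written to $ckpt_dir.
    dmtcp_launch --ckptdir "$ckpt_dir" sleep 60 &
    sleep 2   # crude wait for the job to register with the coordinator

    # Request a checkpoint, block until it is written, then stop the job.
    dmtcp_command --bcheckpoint || return 1
    dmtcp_command --kill

    # The job can later be resumed from the saved image(s).
    echo "To resume: dmtcp_restart $ckpt_dir/ckpt_*.dmtcp"
}

dmtcp_checkpoint_demo
```

The program being checkpointed is launched unchanged; only the launch command is wrapped, which is the sense in which no modifications to user code are needed.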
Latest stable version number
DMTCP version 2.5.2
Open-source license type
LGPL-3.0
Relationship to component?
If other, please describe:
Build system
If other, please describe:
Does the current build system support staged path installations? For example:
make install DESTDIR=/tmp/foo
(or equivalent)
Does component run in user space or are administrative credentials required?
Does component require post-installation configuration?
If yes, please describe briefly:
If component is selected, are you willing and able to collaborate with OpenHPC maintainers during the integration process?
Does the component include test collateral (e.g. regression/verification tests) in the publicly shipped source?
If yes, please briefly describe the intent and location of the tests.
The tests are in TOP_LEVEL/test. To run the full suite, do:
To run a particular test (e.g. the frisbee test), try
The test invokes:
At a still lower level, one can do from top-level:
Does the component have additional software dependencies (beyond compilers/MPI) that are not part of standard Linux distributions?
If yes, please list the dependencies and associated licenses.
Does the component include online or installable documentation?
If available online, please provide URL.
http://dmtcp.sourceforge.net/FAQ.html
Also, a DMTCP install will install various man pages for dmtcp_launch, dmtcp_restart, etc.
Most development occurs on:
https://github.com/dmtcp/dmtcp
DMTCP home page:
http://dmtcp.github.io/index.html
[Optional]: Would you like to receive additional review feedback by email?
- [x] yes - [ ] no