Closed dongahn closed 7 years ago
It doesn't look like the current sched master builds against the flux-core 0.2.0 rpm installed under /opt on our systems like cab...
cab668{dahn}83: ./configure --prefix=/nfs/tmp2/dahn/FLUX-20160421/inst
cab668{dahn}84: make
<CUT>
make[2]: Entering directory `/nfs/tmp2/dahn/FLUX-20160421/flux-sched/sched'
CC sched_la-sched.lo
CC sched_la-rs2rank.lo
CC sched_la-rsreader.lo
CC sched_la-plugin.lo
plugin.c: In function 'lsmod_cb':
plugin.c:248: error: 'FLUX_MODSTATE_RUNNING' undeclared (first use in this function)
plugin.c:248: error: (Each undeclared identifier is reported only once
plugin.c:248: error: for each function it appears in.)
plugin.c:248: error: too many arguments to function 'flux_modlist_append'
make[2]: *** [sched_la-plugin.lo] Error 1
make[2]: Leaving directory `/nfs/tmp2/dahn/FLUX-20160421/flux-sched/sched'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/nfs/tmp2/dahn/FLUX-20160421/flux-sched'
make: *** [all] Error 2
Ah yes, that was added in flux-core/d3f1cd31f8745a67944b8bd916c6f6c736346231, which was applied after the 0.2.0 tag. For build testing you might have to first install a copy of current flux-core master to your destdir. Sorry!
FYI -- I've made some progress here. I will probably need to take care of some stuff over the next few days and then resume this work towards the end of this week or early next week. If /opt flux-core 0.3.0 comes along, the first sched 0.1.0 rpm will be tested on that. The first sched /opt rpm should be considered an rpm for the sake of testing rpm packaging.
Here are my current (completely untested) flux-sched.spec file and module.flux-sched file for TOSS2:
Name: flux-sched
Version: 0.1.0
Release: 1%{?dist}
Summary: Job Scheduling Facility for Flux Resource Manager Framework
Group: System Environment/Base
License: GPLv2+
URL: https://github.com/flux-framework/flux-sched
Source0: %{name}-%{version}.tar.gz
Source1: module.flux-sched
BuildRoot: %{_tmppath}/%{name}-%{version}-root-%(%{__id_u} -n)
#let's not build the debug package for now
%define debug_package %{nil}
#only compress -- no stripping etc
%define __spec_install_post /usr/lib/rpm/brp-compress || :
BuildRequires: flux-core >= 0.3.0
BuildRequires: zeromq4-devel >= 4.1.4
BuildRequires: czmq-devel >= 3.0.2
BuildRequires: json-c-devel
BuildRequires: lua-devel >= 5.1
BuildRequires: lua-posix
BuildRequires: hwloc-devel >= 1.4
Requires: flux-core >= 0.3.0
%description
flux-sched contains the job scheduling facility for the Flux Resource
Manager Framework. It consists of a Flux comms module that handles
all the functionality common to scheduling. The module has the ability
to load one or more scheduling sub-modules that provide specific
scheduling behavior.
%prep
%setup -n %{name}-%{version}
sed -i -e "s|@NAME@|%{name}|" -e "s|@VERSION@|%{version}|" \
%{_sourcedir}/module.flux-sched
%build
# PKG_CONFIG_PATH and PATH should come from flux-core module been
export MODULEPATH=/opt/modules/modulefiles
. /etc/profile.d/[mM]odules.sh
module load python/2.7 python-pycparser python-cffi mvapich2-gnu-shmem
# We want to make this a relocatable package
# We want to be able to install multiple flux-sched RPMs so we use
# ${name}-${version} install directory
./configure --prefix=/opt/%{name}-${version}
make %{?_smp_mflags}
export FLUX_TESTS_LOGFILE=t
make check
if [ $? -eq 0 ]; then
cat t/*.out t/*.log
exit 1
fi
%install
# RPM_BUILD_ROOT comes from BuildRoot tag.
rm -rf ${RPM_BUILD_ROOT}
mkdir -p ${RPM_BUILD_ROOT}
make install DESTDIR=${RPM_BUILD_ROOT}
find ${RPM_BUILD_ROOT} -name '*.la' | while read f; do rm -f $f; done
install -D -m 555 %{_sourcedir}/module.flux-sched \
${RPM_BUILD_ROOT}/opt/modules/modulefiles/%{name}/%{version}
%clean
rm -rf $RPM_BUILD_ROOT
%files
%defattr(-,root,root,-)
%dir /opt/modules/modulefiles/%{name}
/opt/modules/modulefiles/%{name}/%{version}
/opt/%{name}-%{version}
%post
%changelog
* Tue Apr 26 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-1
- Build from initial flux-sched-0.1.0 tag
#%Module1.0
# vi:set filetype=tcl:
#
# Load prereqs
if { ![is-loaded python/2.7] } {
module load python/2.7
}
if { ![is-loaded python-pycparser] } {
module load python-pycparser
}
if { ![is-loaded python-cffi] } {
module load python-cffi
}
if { ![is-loaded flux-core] } {
module load flux-core
}
# global control file
if { [file exists $env(MODULESHOME)/etc/control] } {
source $env(MODULESHOME)/etc/control
}
# local variables
set name @NAME@
set version @VERSION@
set prefix /opt/${name}-${version}
#
# sched currently does not have ${prefix}/bin
#
prepend-path FLUX_LUA_PATH_PREPEND "${prefix}/share/lua/5.1/?.lua"
prepend-path FLUX_LUA_CPATH_PREPEND "${prefix}/lib64/lua/5.1/?.lua"
prepend-path FLUX_EXEC_PATH_PREPEND "${prefix}/libexec/flux/cmd"
prepend-path FLUX_RC_EXTRA "${prefix}/etc/flux"
FYI flux-core-0.3.0 has been tagged and rpms built for TOSS2 and TOSS3. Apparently we made the TOSS2 production deadline only because of a security update that came in at the last minute. Anyway, Trent says the new rpms should roll out on the test systems soon (he thought it might already be on hype, though I don't see it).
Great. I will have to work on some other things for now, so this doesn't block me. I will check back on hype in a day or two.
FYI -- Trent has just pushed 0.3.0 to hype login node.
I've been consumed by something else and I will be in an all-day meeting today. I will try to get to this Thu or Fri.
Ok. With minor modifications/fixes to the spec file, the rpm now builds on TOSS2 against the flux-core/0.3.0 rpm. I will take this to TOSS2 BuildBot as the next step.
Name: flux-sched
Version: 0.1.0
Release: 1%{?dist}
Summary: Job Scheduling Facility for Flux Resource Manager Framework
Group: System Environment/Base
License: GPLv2+
URL: https://github.com/flux-framework/flux-sched
Source0: %{name}-%{version}.tar.gz
Source1: flux-sched.module
BuildRoot: %{_tmppath}/%{name}-%{version}-root-%(%{__id_u} -n)
#let's not build the debug package for now
%define debug_package %{nil}
#only compress -- no stripping etc
%define __spec_install_post /usr/lib/rpm/brp-compress || :
BuildRequires: flux-core >= 0.3.0
BuildRequires: zeromq4-devel >= 4.1.4
BuildRequires: czmq-devel >= 3.0.2
BuildRequires: json-c-devel
BuildRequires: lua-devel >= 5.1
BuildRequires: lua-posix
BuildRequires: hwloc-devel >= 1.4
Requires: flux-core >= 0.3.0
%description
flux-sched contains the job scheduling facility for the Flux Resource
Manager Framework. It consists of a Flux comms module that handles
all the functionality common to scheduling. The module has the ability
to load one or more scheduling sub-modules that provide specific
scheduling behavior.
%prep
%setup -n %{name}-%{version}
sed -i -e "s|@NAME@|%{name}|" -e "s|@VERSION@|%{version}|" \
%{_sourcedir}/flux-sched.module
%build
# PKG_CONFIG_PATH and PATH should come from flux-core module been
export MODULEPATH=/opt/modules/modulefiles
. /etc/profile.d/[mM]odules.sh
module load python/2.7 python-pycparser python-cffi mvapich2-gnu-shmem
module load flux-core
# We want to make this a relocatable package
# We want to be able to install multiple flux-sched RPMs so we use
# ${name}-${version} install directory
./configure --prefix=/opt/%{name}-%{version}
make %{?_smp_mflags}
export FLUX_TESTS_LOGFILE=t
make check
if [ $? -ne 0 ]; then
cat t/*.out t/*.log
exit 1
fi
%install
# RPM_BUILD_ROOT comes from BuildRoot tag.
rm -rf ${RPM_BUILD_ROOT}
mkdir -p ${RPM_BUILD_ROOT}
make install DESTDIR=${RPM_BUILD_ROOT}
find ${RPM_BUILD_ROOT} -name '*.la' | while read f; do rm -f $f; done
install -D -m 555 %{_sourcedir}/flux-sched.module \
${RPM_BUILD_ROOT}/opt/modules/modulefiles/%{name}/%{version}
%clean
rm -rf $RPM_BUILD_ROOT
%files
%defattr(-,root,root,-)
%dir /opt/modules/modulefiles/%{name}
/opt/modules/modulefiles/%{name}/%{version}
/opt/%{name}-%{version}
%post
%changelog
* Tue Apr 26 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-1
- Build from initial flux-sched-0.1.0 tag
RPM file:
/nfs/tmp2/dahn/rpm/RPMS/x86_64/flux-sched-0.1.0-1.ch5.4.x86_64.rpm
@dongahn - you can likely drop the zeromq4 BuildRequires as (I believe) no libzmq interfaces are used directly in sched.
Also, I think module load flux-core brings in its dependencies so you shouldn't have to explicitly load them here.
Any BuildRequires needed to bring in the /opt module machinery? (If you are using @grondo's flux-core toss2 spec file as a guide then you probably are already doing the right thing.)
Hmmmm, that explicit module load was needed to fix some early build error. I will check again.
@garlick. I misread your comment. I think you are right (I was watching a musical where my son had some role :)
FYI -- we talked about this a bit at the meeting. The current buildbot error:
DEBUG: ERROR: t0001-basic.t - missing test plan
DEBUG: ERROR: t0001-basic.t - exited with status
was because a key wasn't generated.
I also moved make check into the check section from the build section. In addition, as rpmbuild evaluates one spec line at a time, the cat t/*.out t/*.log given on the following lines wasn't working. Instead,
%check
export MODULEPATH=/opt/modules/modulefiles
. /etc/profile.d/[mM]odules.sh
module load flux-core
flux keygen
export FLUX_TESTS_LOGFILE=t
make check || (cat t/*.output t/*.log && exit 1)
With this, Buildbot printed out more meaningful debug data:
DEBUG: ERROR: t0001-basic
DEBUG: ==================
DEBUG: flux-broker: flux_sec_zauth_init: The directory '/builddir/.flux' \
does not exist. Have you run `flux keygen`?
DEBUG: flux-broker: flux_sec_zauth_init: The directory '/builddir/.flux' \
does not exist. Have you run `flux keygen`?
DEBUG: flux-start: 0 (pid 41206) exited with rc=1
DEBUG: flux-start: 1 (pid 41207) exited with rc=1
DEBUG: ERROR: t0001-basic.t - missing test plan
DEBUG: ERROR: t0001-basic.t - exited with status 1
With this fix, at least rpms are popped out. I will look through the Buildbot logs a bit more carefully and test them out before moving onto TOSS 3 this afternoon.
hype356{dahn}50: ll /repo/llnl/RHEL6/5.4/RPMS/x86_64 | grep flux-sched
-r--r--r-- 1 531 531 204732 May 16 12:01 flux-sched-0.1.0-1.ch5.4.x86_64.rpm
hype356{dahn}51: ll /repo/llnl/RHEL6/5.4/SRPMS/ | grep flux-sched
-r--r--r-- 1 531 531 859407 May 16 12:01 flux-sched-0.1.0-1.el6.src.rpm
@grondo: I am having trouble exporting FLUX_LUA_PATH_PREPEND and FLUX_LUA_CPATH_PREPEND from within the flux-sched module file, mainly because of the Lua glob. (I might have fallen into an escape hell.) So, I am wondering if you have been able to find a way to export the Lua glob character (?) for a TOSS 2 module.
Just to give you some context, Trent helped install the flux-sched rpm this morning on hype. But if I type module load flux-sched on hype from my shell (tcsh):
hype356{dahn}113: module load flux-sched
/opt/flux-sched-0.1.0/lib64/lua/5.1/?.lua: No match.
# looks like this hangs after this! -- although it is not (I will explain this later.)
I looked at how module works a bit and it looks like this command uses /usr/bin/modulecmd underneath to convert the commands in the script, like setenv and prepend-path, into shell-specific environment variable commands/control. But unfortunately it seems its expansion rules are a bit inconsistent, in particular in dealing w/ special characters like ?. For example, with a bare ? in flux-sched.module, like:
setenv FLUX_LUA_PATH_PREPEND /share/lua/5.1/?.lua
/usr/bin/modulecmd doesn't interpret/expand it and returns the string as-is to be evaluated by eval:
setenv FLUX_LUA_PATH_PREPEND /opt/flux-sched-0.1.0/share/lua/5.1/?.lua
But if I escape the ? as \?, /usr/bin/modulecmd does interpret it and gets rid of the escape:
setenv FLUX_LUA_PATH_PREPEND /share/lua/5.1/\?.lua
so the same unescaped string comes back:
setenv FLUX_LUA_PATH_PREPEND /opt/flux-sched-0.1.0/share/lua/5.1/?.lua
If
setenv FLUX_LUA_PATH_PREPEND /opt/flux-sched-0.1.0/share/lua/5.1/\?.lua
came out instead, I think I could make this work. But I have not been able to find a magic formula yet because of this inconsistency. (I also tried escaping the escape, etc., but to no avail.)
So if you happened to go through this before and found a solution, please let me know!
BTW, I initially thought module load flux-sched was dead hung. But it turned out the module command is an alias to:
module: aliased to set _prompt="$prompt";set prompt="";eval `/usr/bin/modulecmd tcsh !*`; set _exit=$status; set prompt="$_prompt";unset _prompt;; /usr/bin/test 0 = $_exit;
and there seems to be a bug in tcsh's eval. Once eval fails, the rest of the shell commands, including set prompt="$_prompt", aren't executed, and this gives the look and feel of an 'unkillable' hang!
If you happen to test this, you can come out of this hang-like illusion simply by typing set prompt="$_prompt" and pressing enter.
I am not really happy w/ module today.
I will try to log in later and check on this. I'm afraid I usually forget to test tcsh and likely there are traps lurking in its idiosyncrasies. Have you tried with bash or ksh?
Also, if you are still using prepend-path for the Lua paths you'll have to specify the delimiter as ; since the default delimiter for prepend-path is a colon.
@dongahn -- ok, I logged on to hype and the Lua paths work for me in bash:
grondo@hype356:~$ env | grep ^FLUX
grondo@hype356:~$ module load flux-sched
grondo@hype356:~$ env | grep ^FLUX
FLUX_LUA_PATH_PREPEND=/opt/flux-sched-0.1.0/share/lua/5.1/?.lua
FLUX_RC_EXTRA=/opt/flux-sched-0.1.0/etc/flux
FLUX_LUA_CPATH_PREPEND=/opt/flux-sched-0.1.0/lib64/lua/5.1/?.lua
FLUX_EXEC_PATH_PREPEND=/opt/flux-sched-0.1.0/libexec/flux/cmd
grondo@hype356:~$ flux start
[1463451754.325170] broker.err[0]: rc1: flux-module: sched: not found in module search path
[1463451754.325619] broker.err[0]: Run level 1 Exited with non-zero status (rc=1)
[1463451754.336362] broker.err[0]: rc3: flux-module: cmb.rmmod[0] sched: No such file or directory
flux-start: 0 (pid 16025) exited with rc=1
The problem above is just a missing FLUX_MODULE_PATH
grondo@hype356:~$ export FLUX_MODULE_PATH=/opt/flux-sched-0.1.0/lib/flux/modules
grondo@hype356:~$ flux start
grondo@hype356:~$ flux module list | grep sched
sched 322570 CB98A7D 20 S 0
grondo@hype356:~$ flux submit /bin/echo hello
submit: Submitted jobid 1
grondo@hype356:~$ flux wreck ls
ID NTASKS STATE START RUNTIME RANKS COMMAND
1 1 complete 2016-05-16T19:25:16 0.011s 0 echo
grondo@hype356:~$ flux wreck attach -l 1
0: hello
Yes I found I needed FLUX_MODULE_PATH. Were you able to repro tcsh issue?
I did reproduce your problem with tcsh. However, zsh also works -- might I suggest you switch to that shell? ;-)
Seriously, though, I'm not sure why modulecmd output doesn't quote all the setenv arguments it generates for tcsh. That seems like a bug, but when working with tcsh who knows? I wonder if wrapping the Lua paths in quote characters would help.
Also, my latest module script uses setenv for these two LUA env variables instead of prepend-path, as I should.
Won't setenv overwrite any existing LUA_PATH_PREPEND environment variables? Probably not an issue now, but what if we had two framework projects that each need to use LUA_PATH_PREPEND?
Ah, you are right. Will stick with prepend w/ delimiter.
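For future readers, the setenv vs. prepend-path difference discussed here can be sketched in a few lines of Python. This is only a simplified model of the semantics, not the environment-modules implementation; the helper names and the "other-project" path are made up for illustration.

```python
def prepend_path(env, var, value, delim=":"):
    """Simplified model of a modulefile prepend-path: put value in front
    of whatever list is already stored in env[var], joined by delim."""
    existing = [p for p in env.get(var, "").split(delim) if p]
    env[var] = delim.join([value] + existing)
    return env[var]


def setenv(env, var, value):
    """Simplified model of setenv: unconditionally replace env[var]."""
    env[var] = value
    return env[var]


# Hypothetical second project that also needs an entry in the same variable.
other = "/opt/other-project/share/lua/5.1/?.lua"
sched = "/opt/flux-sched-0.1.0/share/lua/5.1/?.lua"

env = {"FLUX_LUA_PATH_PREPEND": other}
setenv(env, "FLUX_LUA_PATH_PREPEND", sched)
print(env["FLUX_LUA_PATH_PREPEND"])  # the other project's entry is gone

env = {"FLUX_LUA_PATH_PREPEND": other}
prepend_path(env, "FLUX_LUA_PATH_PREPEND", sched, delim=";")
print(env["FLUX_LUA_PATH_PREPEND"])  # both entries, ';'-separated
```

The delimiter has to be ; here because Lua search paths are semicolon-separated, while the model (like prepend-path) defaults to a colon.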
The quote characters didn't work for me, though I probably wasn't exhaustive.
W/ TOSS 2 going away soon, fixing this modulecmd bug probably isn't worth it... Perhaps I just make this work under other shells and move to TOSS 3?
Yeah, I was thinking the same thing - maybe open a bug and get the release out, and fix later if needed.
I checked the source for environment-modules and here's where they attempt to escape csh strings:
for(;*in;in++) {
if (*in == ' ' ||
*in == '\t'||
*in == '\\'||
*in == '{' ||
*in == '}' ||
*in == '|' ||
*in == '<' ||
*in == '>' ||
*in == '!' ||
*in == ';' ||
*in == '#' ||
*in == '$' ||
*in == '^' ||
*in == '&' ||
*in == '*' ||
*in == '\''||
*in == '"' ||
*in == '(' ||
*in == ')') {
*out++ = '\\';
}
*out++ = *in;
Sadly, they seem to have forgotten literal ?, and given that they escape single-quotes, it leaves us no way to actually quote the entire LUA_PATH string or ? ourselves. (Why didn't they just give an option to quote the whole string??)
I share @dongahn's frustration with environment-modules.
I think our only option is to patch the environment-modules package. I've already verified that the following patch will allow us to use ? in setenv for csh style shells:
diff --git a/utility.c b/utility.c
index 4f1c2e7..c0ea520 100644
--- a/utility.c
+++ b/utility.c
@@ -2752,6 +2752,7 @@ void EscapeCshString(const char* in,
*in == '^' ||
*in == '&' ||
*in == '*' ||
+ *in == '?' ||
*in == '\''||
*in == '"' ||
*in == '(' ||
That is, instead of
$ modulecmd tcsh load test
setenv FLUX_LUA_PATH_PREPEND /opt/flux-sched-0.1.0/share/lua/5.1/?.lua ;setenv LOADEDMODULES test/0.1.0 ;
The patched version generates
$ ./modulecmd tcsh load test
setenv FLUX_LUA_PATH_PREPEND /opt/flux-sched-0.1.0/share/lua/5.1/\?.lua ;setenv LOADEDMODULES test/0.1.0 ;
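As a rough illustration of what the one-line patch changes, here is a small Python model of the escaping loop quoted earlier in this thread. It is a sketch written for this discussion (the function name escape_csh is made up), not the environment-modules code itself; the character set is copied from the quoted C excerpt.

```python
# Characters escaped by the EscapeCshString excerpt quoted in this thread.
CSH_SPECIALS = set(" \t\\{}|<>!;#$^&*'\"()")


def escape_csh(s, specials=CSH_SPECIALS):
    """Backslash-escape every character of s that is in specials."""
    return "".join("\\" + c if c in specials else c for c in s)


path = "/opt/flux-sched-0.1.0/share/lua/5.1/?.lua"

# Unpatched list: '?' passes through, so tcsh later treats it as a glob.
print(escape_csh(path))
# Patched list: '?' is escaped like the other special characters.
print(escape_csh(path, CSH_SPECIALS | {"?"}))
```

The two printed strings correspond to the path arguments in the unpatched and patched modulecmd output shown above.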
@grondo: Thank you for getting to the bottom of the issue! This agrees with what I observed yesterday.
Given that the current module will be replaced with Lmod in TOSS3 for tce packaging, I'm not sure if we have someone who is willing to accept your patch. I will talk to Trent though. At least they claim the new Lmod-based module can work with an original module file as-is, we should try this case under TOSS3 to see Lmod doesn't have this escape bug. Adding @lee218llnl and @adammoody to give a heads-up.
I was pretty frustrated w/ both module and tcsh idiosynchrosy yesterday (e.g., eval
failure led to that unkillable process look and feel) :-( I hope that the new Lmod-based module will do a better job...
Good news is we won't do module-based packing for flux on TOSS3!
We already run a patched version of environment-modules for TOSS2, so I already have a new version ready to go if it makes sense.
And agreed, we should make sure lmod doesn't have this issue.
@dongahn, I've built a new environment-modules-3.2.10-1.2chaos package for TOSS2 with the above patch applied.
Great. Thanks!
FYI -- I have new rpm builds for TOSS2. I will test them once they and the environment-modules-3.2.10-1.2chaos package are rolled out.
flux-sched /opt RPM for RHEL6.
/repo/llnl/RHEL6/5.4/RPMS/x86_64/flux-sched-0.1.0-2.ch5.4.x86_64.rpm
/repo/llnl/RHEL6/5.4/SRPMS/flux-sched-0.1.0-2.el6.src.rpm
RPMs have been signed with the following GPG key:
--------
pub 1024D/D8A1F5EF 2007-09-28
Key fingerprint = 2A4A B485 561B 797F 5EFA E0D8 5BFF 971C D8A1 F5EF
uid CR/LF Builder <chaos-dev@lists.llnl.gov>
sub 2048g/3D3BAB67 2007-09-28
--------
Sincerely, The Builder.
Moving on to koji, it seems tosspkg isn't really happy at the moment. Is it only me?
opal186{dahn}30: tosspkg clone examplepkg
hangs.
@garlick: did you post your flux-core.spec somewhere? It would be good if I could compare my adjustments for koji w/ yours.
Dong, you may need to delete your cached credentials in ~/.gitconfig
From: Dong H. Ahn Sent: Saturday, May 21, 2016 7:27:12 AM To: flux-framework/flux-sched Cc: Lee, Greg; Mention Subject: Re: [flux-framework/flux-sched] Questions that may arise in creating the first sched RPM package (#154)
@dongahn maybe you can tosspkg clone flux-core? If not, koji web has a list of packages and I think you can drill down to the spec file that way. If neither of those works I will get it for you, but it may be later today before I can get to vpn.
flux-framework/distribution#4 may have some useful info also, though I apparently didn't post spec there.
@lee218llnl: Great! This fixed the issue and now I can see the flux-core rpm!
It seems I hit a permission issue with koji packaging on opal. I've sent an email to @foraker on this:
opal186{dahn}35: koji add-pkg ch6-chaotic flux-sched --owner=dahn
ActionNotAllowed: policy violation (package_list)
I will be on travel next week. But since this should be simple enough, I will try to complete this next week. My new untested spec file and Makefile:
Name: flux-sched
Version: 0.1.0
Release: 3%{?dist}
Summary: Job Scheduling Facility for Flux Resource Manager Framework
Group: System Environment/Base
License: GPLv2+
URL: https://github.com/flux-framework/flux-sched
Source0: %{name}-%{version}.tar.gz
BuildRoot: %{_tmppath}/%{name}-%{version}-root-%(%{__id_u} -n)
#let's not build the debug package for now
%define debug_package %{nil}
#only compress -- no stripping etc
%define __spec_install_post /usr/lib/rpm/brp-compress || :
BuildRequires: flux-core >= 0.3.0
BuildRequires: czmq-devel >= 3.0.2
BuildRequires: json-c-devel
BuildRequires: lua-devel >= 5.1
BuildRequires: lua-posix
BuildRequires: hwloc-devel >= 1.4
BuildRequires: libuuid-devel
Requires: flux-core >= 0.3.0
Requires: libuuid
%description
flux-sched contains the job scheduling facility for the Flux Resource
Manager Framework. It consists of a Flux comms module that handles
all the functionality common to scheduling. The module has the ability
to load one or more scheduling sub-modules that provide specific
scheduling behavior.
%prep
%setup -n %{name}-%{version}
%build
./configure
make %{?_smp_mflags}
%check
flux keygen
export FLUX_TESTS_LOGFILE=t
make check || (cat t/*.output t/*.log && exit 1)
%install
# RPM_BUILD_ROOT comes from BuildRoot tag.
rm -rf ${RPM_BUILD_ROOT}
mkdir -p ${RPM_BUILD_ROOT}
make install DESTDIR=${RPM_BUILD_ROOT}
find ${RPM_BUILD_ROOT} -name '*.la' | while read f; do rm -f $f; done
%clean
rm -rf $RPM_BUILD_ROOT
%files
%defattr(-,root,root,-)
%{_sysconfdir}/flux/rc1.d/sched-start
%{_sysconfdir}/flux/rc3.d/sched-stop
%{_includedir}/flux/sched
%{_libdir}/flux/modules/sched
%{_libdir}/libflux-rdl.so*
%{_libdir}/lua/5.1/flux/cpuset.so
%{_libexecdir}/flux/cmd/flux-rdltool
%{_libexecdir}/flux/cmd/flux-waitjob
%{_datadir}/lua/5.1/middleclass.lua
%{_datadir}/lua/5.1/RDL.lua
%{_datadir}/lua/5.1/RDL
%post
%changelog
* Sun May 22 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-3
- Adjustment for TOSS3 deployment
Remove the use of environmental modules
Adjust for installing into default system directories.
* Thu May 19 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-2
- Minor adjustment to flux-sched.module
* Sun May 15 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-1
- Build from initial flux-sched-0.1.0 tag
TAG := 0.1.0
TARBALL:= flux-sched-$(TAG).tar.gz
URL := https://github.com/flux-framework/flux-sched/releases/download/$(TAG)/$(TARBALL)
sources:
rm -f $(TARBALL)
wget $(URL)
clean:
rm -f $(TARBALL)
.PHONY: sources
.PHONY: clean
OK, now I also have a TOSS3 sched rpm to test:
Package: flux-sched-0.1.0-3.ch6
Tag: ch6-chaotic
Status: complete
Built by: dahn
ID: 862
Started: Mon, 23 May 2016 19:55:57 PDT
Finished: Mon, 23 May 2016 19:57:29 PDT
Changelog:
* Mon May 23 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-3
- Adjustment for TOSS3 deployment
Remove the use of environmental modules
Adjust for installing into default system directories.
* Thu May 19 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-2
- Minor adjustment to flux-sched.module
* Sun May 15 2016 Dong H. Ahn <ahn1@llnl.gov> 0.1.0-1
- Build from initial flux-sched-0.1.0 tag
SRPMS:
flux-sched-0.1.0-3.ch6.src.rpm
Closed tasks:
-------------
Task 11233 on builder2-x86.buildfarm
Task Type: tagBuild (noarch)
Task 11229 on builder1-x86.buildfarm
Task Type: build (ch6-chaotic, /buildfarm/flux-sched:7effc332bbea64af54a6cf8c33c25b497e951c36)
Task 11230 on builder2-x86.buildfarm
Task Type: buildSRPMFromSCM (/buildfarm/flux-sched:7effc332bbea64af54a6cf8c33c25b497e951c36)
logs:
http://tossbuild.llnl.gov/koji/getfile?taskID=11230&name=build.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11230&name=checkout.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11230&name=mock_output.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11230&name=root.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11230&name=state.log
Task 11231 on builder2-x86.buildfarm
Task Type: buildArch (flux-sched-0.1.0-3.ch6.src.rpm, x86_64)
logs:
http://tossbuild.llnl.gov/koji/getfile?taskID=11231&name=build.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11231&name=mock_output.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11231&name=root.log
http://tossbuild.llnl.gov/koji/getfile?taskID=11231&name=state.log
rpms:
https://tossbuild.llnl.gov/kojifiles/packages/flux-sched/0.1.0/3.ch6/x86_64/flux-sched-0.1.0-3.ch6.x86_64.rpm
Task Info: http://tossbuild.llnl.gov/koji/taskinfo?taskID=11229
Build Info: http://tossbuild.llnl.gov/koji/buildinfo?buildID=862
Note as a future reference:
RPM build errors:
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/etc/flux/rc1.d/sched-start
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/etc/flux/rc3.d/sched-stop
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/include/flux/sched
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/lib64/flux/modules/sched
error: File not found by glob: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/lib64/libflux-rdl.so*
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/lib64/lua/5.1/flux/cpuset.so
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/libexec/flux/cmd/flux-rdltool
error: File not found: /builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/libexec/flux/cmd/flux-waitjob
Apparently, ./configure in my spec file was installing sched into the /usr/local area, which is the default. For example:
make[2]: Entering directory `/builddir/build/BUILD/flux-sched-0.1.0/etc'
make[2]: Nothing to be done for `install-exec-am'.
/usr/bin/mkdir -p '/builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/local/etc/flux/rc1.d'
/usr/bin/install -c sched-start '/builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/local/etc/flux/rc1.d'
/usr/bin/mkdir -p '/builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/local/etc/flux/rc3.d'
/usr/bin/install -c sched-stop '/builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64/usr/local/etc/flux/rc3.d'
I realized I had to change ./configure to %configure so that koji can pass in the correct system install path!
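A quick way to see why the %files entries could not be found: the staged location of each file is the build root plus whatever directory configure chose. The Python sketch below models that with two small tables. Both tables are assumptions for illustration (the stock autoconf defaults under /usr/local, and the values %configure typically passes on a 64-bit RHEL-like system), and staged_path is a made-up helper.

```python
import posixpath

# Assumed directory variables after a bare ./configure (prefix=/usr/local).
PLAIN_CONFIGURE = {
    "sysconfdir": "/usr/local/etc",
    "libdir": "/usr/local/lib",
    "libexecdir": "/usr/local/libexec",
    "includedir": "/usr/local/include",
}
# Assumed directory variables as typically passed by %configure on x86_64.
RPM_CONFIGURE = {
    "sysconfdir": "/etc",
    "libdir": "/usr/lib64",
    "libexecdir": "/usr/libexec",
    "includedir": "/usr/include",
}


def staged_path(buildroot, dirs, var, relpath):
    """Where 'make install DESTDIR=buildroot' puts a file under dirs[var]."""
    return posixpath.join(buildroot, dirs[var].lstrip("/"), relpath)


buildroot = "/builddir/build/BUILDROOT/flux-sched-0.1.0-3.ch6.x86_64"
rel = "flux/rc1.d/sched-start"

# What the %files entry %{_sysconfdir}/flux/rc1.d/sched-start expects:
expected = staged_path(buildroot, RPM_CONFIGURE, "sysconfdir", rel)
# What the bare ./configure build actually staged:
actual = staged_path(buildroot, PLAIN_CONFIGURE, "sysconfdir", rel)

print(expected)
print(actual)
print(expected == actual)
```

The first printed path is the one rpmbuild reported as "File not found", and the second is the one the install log shows being created.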
Nice work! And thanks for highlighting your battle scars.
FYI -- I checked opal (TOSS3) just before it was powered down. flux-sched has been deployed and seems to be working now.
opal186{dahn}: rpm -qa | grep flux
flux-sched-0.1.0-3.ch6.x86_64
flux-core-0.3.0-1.ch6.x86_64
cab86{dahn}: rpm -ql flux-sched-0.1.0-3.ch6.x86_64
/etc/flux/rc1.d
/etc/flux/rc1.d/sched-start
/etc/flux/rc3.d
/etc/flux/rc3.d/sched-stop
/usr/include/flux/sched
/usr/include/flux/sched/rdl.h
/usr/lib64/flux/modules/sched
/usr/lib64/flux/modules/sched.so
/usr/lib64/flux/modules/sched/sched_backfill.so
/usr/lib64/flux/modules/sched/sched_fcfs.so
/usr/lib64/libflux-rdl.so
/usr/lib64/libflux-rdl.so.0
/usr/lib64/libflux-rdl.so.0.0.0
/usr/lib64/lua/5.1/flux/cpuset.so
/usr/libexec/flux/cmd/flux-rdltool
/usr/libexec/flux/cmd/flux-waitjob
/usr/share/lua/5.1/RDL
/usr/share/lua/5.1/RDL.lua
/usr/share/lua/5.1/RDL/Resource.lua
/usr/share/lua/5.1/RDL/ResourceData.lua
/usr/share/lua/5.1/RDL/lib
/usr/share/lua/5.1/RDL/lib/ListOf.lua
/usr/share/lua/5.1/RDL/memstore.lua
/usr/share/lua/5.1/RDL/serialize.lua
/usr/share/lua/5.1/RDL/types
/usr/share/lua/5.1/RDL/types/Node.lua
/usr/share/lua/5.1/RDL/types/Socket.lua
/usr/share/lua/5.1/RDL/uri.lua
/usr/share/lua/5.1/RDL/uuid.lua
/usr/share/lua/5.1/middleclass.lua
opal186{dahn}: flux start -s 2
opal186{dahn}: flux module list
Module Size Digest Idle S Nodeset
kvs 3165080 B31A075 0 S 0
content-sqlite 2991661 8508D48 12 S 0
resource-hwloc 2998996 EA42108 12 S 0
wrexec 2977729 D65B456 12 S 0
connector-local 3008762 179740B 0 R 0
sched 383793 EA31F73 12 S 0
mecho 2965184 D65AA74 12 S 0
job 2996487 6037479 12 S 0
barrier 2988886 D9E5F0E 12 S 0
opal186{dahn}: exit
opal186{dahn}: setenv FLUX_SCHED_RC_NOOP 1
opal186{dahn}:
opal186{dahn}:
opal186{dahn}: flux start -s 2
opal186{dahn}: flux module list
Module Size Digest Idle S Nodeset
kvs 3165080 B31A075 0 S 0
content-sqlite 2991661 8508D48 1 S 0
resource-hwloc 2998996 EA42108 1 S 0
wrexec 2977729 D65B456 1 S 0
connector-local 3008762 179740B 0 R 0
mecho 2965184 D65AA74 1 S 0
job 2996487 6037479 1 S 0
barrier 2988886 D9E5F0E 1 S 0
opal186{dahn}: unsetenv FLUX_SCHED_RC_NOOP
opal186{dahn}: flux -v start -s3
FLUX_CONF_DIRECTORY=/g/g0/dahn/.flux
LUA_PATH=/usr/share/lua/5.1/?.lua;;;
FLUX_EXEC_PATH=/usr/libexec/flux/cmd
FLUX_SEC_DIRECTORY=/g/g0/dahn/.flux
PYTHONPATH=/usr/lib64/python2.7/site-packages
LUA_CPATH=/usr/lib64/lua/5.1/?.so;;;
FLUX_CONNECTOR_PATH=/usr/lib64/flux/connectors
FLUX_MODULE_PATH=/usr/lib64/flux/modules
sub-command search path: /usr/libexec/flux/cmd
flux: trying to exec /usr/libexec/flux/cmd/flux-start
opal186{dahn}21: flux submit -N3 -n3 hostname
submit: Submitted jobid 1
opal186{dahn}22: flux kvs dir lwj.1
lwj.1.state = complete
lwj.1.cmdline = [ "hostname" ]
lwj.1.nnodes = 3
lwj.1.input.
lwj.1.walltime = 0
lwj.1.cwd = /g/g0/dahn
lwj.1.ntasks = 3
lwj.1.rdl = <CUT>
Hmmm on TOSS2, there seems to be an error in the new module script. Will revisit this.
hype356{dahn}27: module load flux-sched
cmdPath.c(159):ERROR:11: Usage is 'prepend-path path-variable directory'
flux-sched/0.1.0(32):ERROR:102: Tcl command execution failed: prepend-path --delim=; FLUX_LUA_PATH_PREPEND ${prefix}/share/lua/5.1/?.lua
@dongahn: You may have to quote the ; argument to --delim, i.e. try --delim ";"
Thanks @grondo. Yes, the problem was along those lines. It seems escaping ; is the way to go. (Just as a future reference, quoting it gave me the same error.) I found a usage page at sourceforge ... hopefully this is the right page :-(
prepend-path [ -d C | --delim C | --delim=C ] variable value
Append or prepend value to environment variable. The variable is a colon, or delimiter, separated list such as
"PATH=directory:directory:directory". The default delimiter is a colon ':', but an arbitrary one can be given by the --delim option. For example a space can be used instead (which will need to be handled in the Tcl specially by enclosing it in " " or { }). A space, however, can not be specified by the --delim=C form.
I tested this by adding the local path where I have the module file to MODULEPATH:
hype356{dahn}42: setenv MODULEPATH /nfs/tmp2/dahn/lc_rpm/chaos5.4/flux-sched:/usr/share/Modules/modulefiles:/etc/modulefiles:/opt/modules/modulefiles
Then,
hype356{dahn}47: module load flux-sched.module
/opt/flux-sched-0.1.0/lib64/lua/5.1/?.lua: No match.
Got back to that old state -- the environment module bug disallowing escaping ?
I will create a new sched package with this fix. Hopefully, once your bug-fix version of the environment module rpm (environment-modules-3.2.10-1.2chaos) is installed, everything will work seamlessly.
The turn-around time on this bothers me a bit, but oh well.
I talked with Mark offline this morning. I thought his suggestion was good: create an issue to track the questions/issues that I may come across as I create the first sched RPM and a sched release for /opt on our systems.