Closed obigriffith closed 10 years ago
Notes from offline discussion from @sakoht and @mkiwala-g
sGMS is failing to "genome model build stop" b/c it is missing Job::Iterator. Where does that come from? Is it home grown, but not in genome.git?
I had never heard of that class before. As a first attempt to find it, I tried this command from my TGI workstation:
> genome-perl -e'use Job::Iterator'
> Can't locate Job/Iterator.pm in @INC (@INC contains: /gapp/noarch/lib/perl5 /gsc/scripts/opt/genome/current/user/lib/perl/x86_64-linux-gnu-thread-multi /gsc/scripts/opt/genome/current/user/lib/perl /gsc/scripts/lib/perl /gsc/scripts/gsc/info/lib /etc/perl /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at -e line 1.
That made me think the “Job::Iterator” package was being defined inside the file for another package. I searched genome/gms-core on GitHub for “Job::Iterator”, which led me to a package definition within lib/perl/Genome/Sys/LSF/JobIterator.pm.
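To make the mismatch concrete, here is a minimal, self-contained sketch of a file whose path names one package while its contents declare another. Demo::Sys::JobIterator is a hypothetical stand-in for Genome::Sys::LSF::JobIterator, and the toy Job::Iterator body is an assumption for illustration only, not the gms-core implementation.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical stand-in for lib/perl/Genome/Sys/LSF/JobIterator.pm:
# the path says Demo::Sys::JobIterator, the package inside is Job::Iterator.
my $lib = tempdir(CLEANUP => 1);
make_path("$lib/Demo/Sys");
open(my $fh, '>', "$lib/Demo/Sys/JobIterator.pm") or die "cannot write module: $!";
print {$fh} <<'END_MODULE';
package Job::Iterator;
use strict;
use warnings;

# Toy iterator (assumption): walks a fixed list of job ids.
sub new {
    my ($class, @job_ids) = @_;
    return bless { queue => [@job_ids] }, $class;
}

sub next {
    my $self = shift;
    return shift @{ $self->{queue} };
}

1;
END_MODULE
close($fh) or die "cannot close module: $!";

unshift @INC, $lib;

# Perl maps Job::Iterator to Job/Iterator.pm, which exists nowhere in @INC,
# so loading by that name fails just like the one-liner above.
my $loaded_by_name = eval { require Job::Iterator; 1 } ? 1 : 0;

# Loading the file that actually holds the definition creates the package.
require Demo::Sys::JobIterator;

my $iter = Job::Iterator->new(101, 102);
my @jobs;
while (defined(my $job = $iter->next)) {
    push @jobs, $job;
}

print "loaded by name: $loaded_by_name\n";
print "jobs: @jobs\n";
```

The point of the sketch is that Job::Iterator->new only works once some file declaring that package has been loaded, whatever that file happens to be called.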
It looks like the attempt to use Job::Iterator is here (Genome/Model/Build.pm):
sub _get_job {
    use Genome::Model::Command::Services::Build::Scan;
    my $self = shift;
    my $job_id = shift;
    my @jobs = ();
    my $iter = Job::Iterator->new($job_id);
    while (my $job = $iter->next) {
        push @jobs, $job;
    }
    if (@jobs > 1) {
        $self->error_message("More than 1 job found for this build? Alert apipe");
        return 0;
    }
    return shift @jobs;
}
It is not yet clear to me how this is working within TGI right now, but I believe the problem is resolved in the sGMS by loading the missing module within _get_job of Genome/Model/Build.pm, as follows:
sub _get_job {
    use Genome::Model::Command::Services::Build::Scan;
    use Genome::Sys::LSF::JobIterator;
    my $self = shift;
    my $job_id = shift;
    my @jobs = ();
    my $iter = Job::Iterator->new($job_id);
    while (my $job = $iter->next) {
        push @jobs, $job;
    }
    if (@jobs > 1) {
        $self->error_message("More than 1 job found for this build? Alert apipe");
        return 0;
    }
    return shift @jobs;
}
I would like to better understand what is going on with the namespace/inheritance of these classes before pushing this to master...
It looks like "use Genome::Sys::LSF::JobIterator;" is actually already in master. Perhaps it was missed during the merge of this module, where some sGMS-specific code exists because of differences in behavior between LSF and OpenLava.
I added the missing line to the gms-pub branch of gms-core here: https://github.com/genome/gms-core/commit/82ea26170a6da27b87519c65a6898c7a28931ef6
This issue is now resolved. Running builds are now abandoned cleanly even if they have running jobs on OpenLava. Closing.
Currently, if I attempt to abandon or stop a running build I see the following failure: