UQ-RCC / nimrodg

Nimrod/G
https://rcc.uq.edu.au/nimrod
Apache License 2.0

"java.sql.SQLException: No such command" when using SQLite backend. #38

Closed vs49688 closed 4 years ago

vs49688 commented 4 years ago

Nimrod sometimes exits with java.sql.SQLException: No such command. So far this has only been seen with the SQLite backend.

Caused by DBExperimentHelpers#getCommandIdForResult() being called with cmdIndex == -1. Traced to JobScheduler#recordCommandResult() entering NimrodMasterAPI with an invalid argument.

This doesn't happen with Postgres because of the following in _exp_t_command_result_add():

/* If NULL or negative command index, assume the next one. */
IF NEW.command_index IS NULL OR NEW.command_index < 0 THEN
    SELECT COALESCE(MAX(command_index) + 1, 0) INTO NEW.command_index FROM nimrod_command_results WHERE attempt_id = NEW.attempt_id;
END IF;

I'm not sure whether or not this behaviour is correct. Further investigation is required.
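
For comparison, here is a minimal Java sketch of the defaulting rule the Postgres trigger applies, as it might look if the SQLite backend mirrored it. The class and method names (CommandIndexResolver, resolve) are hypothetical and not part of Nimrod/G, and the existing indices are passed in as a plain collection rather than queried from nimrod_command_results. Whether this defaulting is the right behaviour at all is still the open question.

```java
import java.util.Collection;
import java.util.OptionalLong;

// Hypothetical helper, not part of Nimrod/G: mirrors the Postgres trigger's
// "NULL or negative command index means the next one" rule.
final class CommandIndexResolver {
    private CommandIndexResolver() {}

    /**
     * @param requested the command index supplied by the caller (may be null or negative)
     * @param existing  the command indices already recorded for the same attempt
     * @return the index to store for the new command result
     */
    static long resolve(Long requested, Collection<Long> existing) {
        // A valid, non-negative index is used as-is.
        if (requested != null && requested >= 0) {
            return requested;
        }
        // Equivalent of COALESCE(MAX(command_index) + 1, 0).
        OptionalLong max = existing.stream().mapToLong(Long::longValue).max();
        return max.isPresent() ? max.getAsLong() + 1 : 0L;
    }
}
```

With this rule, a cmdIndex of -1 would resolve to the next free index for the attempt instead of failing the lookup in getCommandIdForResult().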

Stack Trace:

au.edu.uq.rcc.nimrodg.api.NimrodException$DbError: java.sql.SQLException: No such command
    at au.edu.uq.rcc.nimrodg.impl.sqlite3.SQLite3DB.makeException(SQLite3DB.java:574) ~[nimrodg-impl-sqlite3-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.sqlite3.SQLite3DB.makeException(SQLite3DB.java:71) ~[nimrodg-impl-sqlite3-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.base.db.SQLUUUUU.runSQLTransaction(SQLUUUUU.java:65) ~[nimrodg-impl-base-db-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.base.db.TempNimrodAPIImpl.addCommandResult(TempNimrodAPIImpl.java:372) ~[nimrodg-impl-base-db-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.master.Master$_JobOperations.recordCommandResult(Master.java:699) ~[nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.master.sched.DefaultJobScheduler.onJobFailure(DefaultJobScheduler.java:178) ~[nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.master.Master$_AgentOperations.lambda$reportJobFailure$10(Master.java:908) ~[nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.master.Master.lambda$processQueue$18(Master.java:568) ~[nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at java.util.ArrayList.forEach(ArrayList.java:1541) ~[?:?]
    at au.edu.uq.rcc.nimrodg.master.Master.processQueue(Master.java:568) ~[nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.master.Master.startProc(Master.java:456) ~[nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.master.Master.tick(Master.java:315) [nimrodg-master-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.cli.commands.MasterCmd.execute(MasterCmd.java:157) [main/:?]
    at au.edu.uq.rcc.nimrodg.cli.NimrodCLICommand.execute(NimrodCLICommand.java:43) [main/:?]
    at au.edu.uq.rcc.nimrodg.cli.DefaultCLICommand.execute(DefaultCLICommand.java:43) [main/:?]
    at au.edu.uq.rcc.nimrodg.cli.NimrodCLI.cliMain(NimrodCLI.java:125) [main/:?]
    at au.edu.uq.rcc.nimrodg.cli.NimrodCLI.main(NimrodCLI.java:145) [main/:?]
Caused by: java.sql.SQLException: No such command
    at au.edu.uq.rcc.nimrodg.impl.sqlite3.DBExperimentHelpers.getCommandIdForResult(DBExperimentHelpers.java:857) ~[nimrodg-impl-sqlite3-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.sqlite3.DBExperimentHelpers.addCommandResult(DBExperimentHelpers.java:864) ~[nimrodg-impl-sqlite3-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.sqlite3.SQLite3DB.addCommandResult(SQLite3DB.java:429) ~[nimrodg-impl-sqlite3-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.base.db.TempNimrodAPIImpl.lambda$addCommandResult$38(TempNimrodAPIImpl.java:372) ~[nimrodg-impl-base-db-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    at au.edu.uq.rcc.nimrodg.impl.base.db.SQLUUUUU.runSQLTransaction(SQLUUUUU.java:50) ~[nimrodg-impl-base-db-1.9.0-100-0c4fbaff-longspawn-dirty.jar:?]
    ... 14 more