Open jamesaoverton opened 5 years ago
Hmm, this seems to be moving ROBOT into a crowded space, e.g. CWL, CWLRunner, Galaxy, NextFlow. Many of these support APIs with similar functionality, e.g. WES. Will provide more details later.
Sure, I'd love to hear more about alternatives.
What I really want is easier access to the functionality we already have, with as thin a wrapper as we can manage. I'm not interested in competing with these projects. We have an issue mentioning CWL #37.
Is there something out there that can talk to a reasoner over a "wire" (HTTP being one example)?
> Use Case: My group has a cell name and marker validator written in Python. We'd like ROBOT to load CL, run a reasoner, keep it running, and have a little webapp written in Python interactively ask questions about whether new class expressions are satisfiable. (It should be possible to formulate these questions as DL queries, but I'm not yet certain.)
also
> Is there something out there that can talk to a reasoner over a "wire" (HTTP being one example)?
Have you seen @balhoff's https://github.com/phenoscape/owlery - it seems to fit this use case perfectly.
See https://owlery.phenoscape.org/api/ for the Swagger docs.
> We have an issue mentioning CWL #37.
Yep. Still not sure if CWL is a good fit for ontology workflow tasks but its usage is increasing rapidly in other projects I am on. In the context of this ticket I was thinking specifically of the TES API which seems to fit the kind of REST operations you want to do here:
https://github.com/ga4gh/task-execution-schemas/blob/master/README.md
I think this is a potentially important and useful feature, deserving of more serious consideration than the bitty responses I am providing here. Shall we schedule some time at ICBO to talk about some of this, or do you need something before then?
Thanks. owlery does sound like a good fit for the immediate use case. I'll check that out.
My proposal here is interactive in a way that these general task runners are not. In those systems you trigger a job, it runs to completion, and the only interactive thing you can do is cancel the job. In my proposal you keep the ROBOT CommandState in memory, where it can be queried, and you can decide what to do next.
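To make the contrast concrete, here is a toy sketch of a job that keeps its state in memory between tasks and only unloads it on the `stop` command described in the proposal. The `Job` class, its `run_task` method, and the dictionary standing in for CommandState are all hypothetical; nothing here calls ROBOT, and a real implementation would look different.

```python
class Job:
    """Toy stand-in for a job that keeps state in memory between tasks."""

    def __init__(self):
        self.state = {}      # hypothetical stand-in for ROBOT's CommandState
        self.tasks = []      # execution log: one entry per task
        self.running = True

    def run_task(self, command, **options):
        """Run one task against the in-memory state and return its task ID."""
        if not self.running:
            raise RuntimeError("job is stopped; no further tasks accepted")
        if command == "stop":
            self.state = None    # unload the state from memory
            self.running = False
        else:
            # A real implementation would call into ROBOT here; this toy
            # only records which command last touched the state.
            self.state["last_command"] = command
        self.tasks.append((command, options))
        return len(self.tasks)   # task IDs start at 1


job = Job()
print(job.run_task("convert", input="bar.owl", output="baz.owl"))
print(job.run_task("stop"))
```

Between the two calls the caller is free to inspect `job.state` and choose the next command, which is the part a run-to-completion task runner does not offer.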
Goal: To be able to use ROBOT via an HTTP REST API from any programming language.
Use Case: My group has a cell name and marker validator written in Python. We'd like ROBOT to load CL, run a reasoner, keep it running, and have a little webapp written in Python interactively ask questions about whether new class expressions are satisfiable. (It should be possible to formulate these questions as DL queries, but I'm not yet certain.)
We could build something just for this use case, but I have an idea for a more general solution.
Approach: Create a `robot-rest` system that wraps `robot-command` and presents an HTTP REST interface. Each "job" runs a chain of ROBOT commands with a workspace of files. Each "task" is a command in the chain with an execution log. So each job will create and maintain a `CommandState` object, each task will run `CommandManager.executeCommand()`, and wait for another task until a `stop` command, which will unload the `CommandState`.

Here's an example of starting a job, working with files, running a task, then stopping and deleting files. These paths would be prefixed with something like `http://localhost:2019`.

- `GET /jobs` -- show the list of jobs and their status (running, stopped, deleted)
- `POST /jobs` -- create a new "job", return/redirect to a new job ID "123"
- `GET /jobs/123` -- get lists of tasks and files for this job
- `GET /jobs/123/files` -- get a list of files in the workspace for this job (sizes, checksums, dates)
- `PUT /jobs/123/files/bar.owl` -- upload `bar.owl` to the workspace for this job
- `PUT /jobs/123/files/bar.owl?fetch=true` -- fetch a file from the POSTed URL and save it as `bar.owl` to the workspace
- `GET /jobs/123/tasks` -- the list of tasks executed and status: currently 0 tasks and "running"
- `POST /jobs/123/tasks?command=convert&input=bar.owl&output=baz.owl` -- run `robot convert --input bar.owl --output baz.owl` inside the workspace, immediately return/redirect to task ID "1"
- `GET /jobs/123/tasks/1` -- see task status, STDOUT+STDERR
- `GET /jobs/123/files/baz.owl` -- download the `baz.owl` file
- `GET /jobs/123/views/baz.owl` -- view the `baz.owl` file (not sure about 'views' name)
- `POST /jobs/123/tasks?command=stop` -- stop this job, which will unload the `CommandState` from memory and reject further tasks, return/redirect to new task ID "2"
- `DELETE /jobs/123/files/baz.owl` -- delete the `baz.owl` file
- `DELETE /jobs/123` -- delete a job and its files, keeping only some metadata

I'm hoping that this can be a thin layer that works with very few modifications to `robot-command`. The trick is to translate the HTTP query string into the command-line arguments that each ROBOT command expects. HTTP query strings will not map perfectly on to ROBOT command options, but maybe well enough. While the sequence of commands/tasks is significant, the sequence of options for a single command is not. Some options can be specified multiple times: query string should allow repeated keys, but if that doesn't work then I think we could support a single value that is an array.

There are a few cases where I would want to modify existing `robot-command` code. The `query` command would be much more useful if we presented a SPARQL web form and allowed a bunch of queries without reloading Jena.

- `GET /jobs/123/tasks/4/sparql?select=some-sparql-query` -- if task 4 was `query`, and it's the current task running, then provide access to Jena, run some SPARQL query, and return results

For a long time I've been thinking of adding a `--server` option to the command-line version of `query` that would wait and accept interactive queries until the user hits Ctrl-D or something (#25). This would build on that. Our use case requires something similar for DL queries, which is another feature we've wanted for a long time (#387).

Feedback and other use cases would be appreciated.
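As a rough illustration of the query-string translation described above, here is a minimal Python sketch. It assumes that the `command` key names the ROBOT command and that every other key maps directly onto a long option of the same name; the function name `query_to_args` is made up for this example, and how the resulting list would be handed to `robot-command` is not shown. Repeated keys are kept in order, so an option such as `input` can appear more than once.

```python
from urllib.parse import parse_qsl


def query_to_args(query_string):
    """Translate a task query string into a ROBOT-style argument list.

    Assumption: 'command' names the ROBOT command and every other key
    becomes a long option ('input' -> '--input'). Repeated keys are kept.
    """
    command = None
    options = []
    # parse_qsl returns (key, value) pairs in order and decodes %-escapes.
    for key, value in parse_qsl(query_string):
        if key == "command":
            if command is not None:
                raise ValueError("only one 'command' is allowed per task")
            command = value
        else:
            options.extend(["--" + key, value])
    if command is None:
        raise ValueError("query string must include 'command'")
    return [command] + options


print(query_to_args("command=convert&input=bar.owl&output=baz.owl"))
# ['convert', '--input', 'bar.owl', '--output', 'baz.owl']
print(query_to_args("command=merge&input=a.owl&input=b.owl&output=ab.owl"))
# ['merge', '--input', 'a.owl', '--input', 'b.owl', '--output', 'ab.owl']
```

A per-command allow-list of option names would probably be needed on top of this, since the sketch passes any key through unchecked.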