Add async capabilities - Githubissues

janxious commented 9 years ago

I would like to come up with a nice interface for this, and then steal it for the ruby driver, which was implemented in a bad way and is in dire need of a rewrite.

The most straightforward is to just alter fetchDocument to busy wait for async docs, but I don't think that is a reasonable or good idea.

A slightly more exploded version of that bad idea would look like:

$asyncRequest = apiWrapper.fetchDocumentAsync
$asyncRequest.awaitCompletion(timeout)
$asyncRequest.fetchDocument(filename)

Probably want something easier to deal with asynchronously:

$statusId = new ApiWrapper(["api_key" => "something", "async" => true]).fetch
storeStatusIdInDbOrRenderToPageOrWhatever($statusId)

# and then on the next request

$wrapper = ApiWrapper.new(["api_key" => "something", "async" => true, "statusId" => $statusId])
if $wrapper.asyncCompleted {
  $wrapper.fetchDocument($filename, $statusId)
} else if $wrapper.asyncFailed || $wrapper.asyncTimedOut {
  # deal with it
}

This is all fairly half-assed, so I'm hoping you have some ideas to improve it. :) Also I am gonna tag in some of other people from my org to get their thoughts. @netshade @jason-o-matic

markushausammann commented 9 years ago

In the case of PHP it's probably best to save the initial response, provide a hook endpoint and then fetch the resulting file. Long running requests aren't really a thing in PHP. Websockets are not straightforward. Probably the library doesn't need to provide all of this. I'll think about it too.

netshade commented 9 years ago

I'm not sure I can think of any way to preserve async doc info that wouldn't involve adding external machinery ( like a db, SHM, or something ) or doing something /very/ un-driver that would make this code a lot more complex.

Maybe adding the option to do a busy-wait as you suggested @janxious, and let the caller decide what the best technique to use is based on their usage. ( Though it seems like if they are using async to busy-wait, they may as well just use the traditional DocRaptor call )

janxious commented 9 years ago

well, except the timeouts for async/sync are wildly different

markushausammann commented 9 years ago

As I understand, waiting is not the main difference here. With async you basically get a token which you can later use to retrieve the produced document. That's a very different paradigm from a single roundtrip request.

janxious commented 9 years ago

Yeah, we could hide all that complexity in this wrapper, but I think it's probably going to go badly for anyone who uses it outside of a CLI/Desktop env.

markushausammann commented 9 years ago

I think how to store the tokens should be left to the the individual implementation. We can provide:

requestDocumentAsync() which makes the call and returns the token information
checkStatus() which takes the token information and returns the status
fetchAsyncDocument() which takes the token information and fetches the document

Everything else is not the responsibility of an API wrapper IMO. Some might store the tokens in a relational DB and use cron to check stati, some might add a task to a queue, others might just let the user poll for the document again, etc.

markushausammann commented 9 years ago

I don't know if that's too obvious but it feels like we're not totally on the same page. I'm a web dev, after basic with 16 I've never developed a desktop application. Also, I'm mainly a PHP developer so I'm very close to the HTTP protocol which is stateless and therefore things like long polling, waiting, persistent connection, etc. are very far away from my mind (and any other PHP dev's). We do things with call back hooks, queues, cron, etc.

markushausammann commented 9 years ago

Actually the job queues are even strange to us and we outsource them to workers not based on PHP. :)

janxious commented 9 years ago

Oh, I think implementing this as a hidden feature inside fetchDocument is the worst idea. I mostly brought it up for lulz. As far as busy-wait polling, that is only a good solution if there is a nice PHP evented server like EventMachine (ruby) or node.js.

The correct solution is what you've suggested, which is to me just a more direct interpretation of the API than the pseudo-php I wrote at the end of my first comment.

Anyway, we are of one mind in terms of how to handle this, which is to make it an easy-to-use PHP wrapper that is backed by the DR API.

rsands2801 commented 9 years ago

I have done a basic integration of this. I dont think it needs complicated with fancy calls, etc. It is the same as setting test = true, etc. setAsync() setCallbackUrl()

Everything else gets handled as is, if the user provided a filename it writes it to the file, otherwise they get the raw json response and let them handle it as they see fit

I think @markushausammann is correct, let the user deal with their implementation in their script - we for one will use a callback URL for our framework to let docraptor tell us when its ready.

janxious commented 9 years ago

Partially completed by #39. Still need to do some more work around wrapping the download endpoint.

janxious commented 8 years ago

Just for more information, there is now another PHP client that has first class async: https://github.com/DocRaptor/docraptor-php

Anyone who comes to this issue and needs that functionality should consider using the other lib.

markushausammann commented 8 years ago

A pitty these guys didn't create a pull request instead of writing their own implementation. If the other one is generally good/better, we can also abandon this effort here.

janxious commented 8 years ago

Those guys are us. :smile:

Basically a few months after we finished major development on this PHP library we started working on a cross-language client project. I prefer the style of this particular PHP library and think it has a lot of niceties relative to the other, but the other is generated automatically and the various languages emitted by the codegen (swagger) have a very similar usage pattern/internal API/whatever, so it makes it easier on support in many ways.

If you have any feedback on the other lib, let me/us know.

markushausammann commented 8 years ago

:-) hehe, all right. If those guys are you guys, you guys probably know what those guys are doing.

janxious commented 8 years ago

Let me state that in a more positive light. You should definitely check out that other library and see if it would also meet your needs because it is going to be more current with DocRaptor internal development in the future. We are not going to drop official support for this one because it has been used in many PHP integrations, but I might throw a deprecation message on it and directly and strongly encourage new users to use the other one. :smiley_cat:

markushausammann commented 8 years ago

Sounds good, we're going to look into the other one too.

rsands2801 commented 8 years ago

@janxious Does the other library have ability to set the callback URL?

janxious commented 8 years ago

Yup! https://github.com/DocRaptor/docraptor-php/blob/554fb9f0b344de43a5915397b64a4544dbf1029c/lib/Doc.php#L105

If you based it off https://github.com/DocRaptor/docraptor-php/blob/master/examples/async.php, you could alter it to something like…

# …stuff
$doc->setCallbackUrl("https://example.com/callback")
$create_response = $docraptor->createAsyncDoc($doc);

expectedbehavior / php-docraptor

Add async capabilities #26