restify / clients

HttpClient, StringClient, and JsonClient extracted from restify
MIT License
57 stars 34 forks source link

Explore optional transform hook #198

Closed michaelnisi closed 2 years ago

michaelnisi commented 6 years ago

RFC

Large bodies, XML feeds for example, must be processed as streams. Here , I’m exploring a way to optionally allow hooking a transform stream into the string client. I chose Transform to give users control over accumulation. I offer this simple approach for discussion, don’t merge. 🧐

misterdjules commented 6 years ago

@michaelnisi I'd like to understand your use case(s) better: why not using the HttpClient class and building a pipeline of streams from the response stream instead of using the StringClient class?

michaelnisi commented 6 years ago

@misterdjules Extending HttpClient would be an option, in fact while doing just that, adding StreamClient, I realized that the difference is minimal. So, to leverage the hardened and well-used StringClient, I suggest this optional transform stream hook to give users control over buffering.

My use case is nothing special. Receive large amounts of data and efficiently stream it to a database without saturating memory, XML to JSON to LevelDB. I have a custom HTTP client that does that, but I would like to participate in the robustness of restify and remove the HTTP portion of my code.

misterdjules commented 6 years ago

@michaelnisi By using the HttpClient class, I didn't mean extending it but rather creating HttpClient instances and piping their response stream to any other stream(s) or streams pipeline. Something along those lines:

var httpClient = clients.createHttpClient({
    url: 'http://localhost/'
});

httpClient.get('/', function (getErr, req) {
    req.on('result', function (resultErr, res) {
        // Use res.pipe or any other mechanism to build a pipeline that performs
        // the appropriate operations on the response.
    });
});

My understanding is that the StringClient and JsonClient classes are meant to be used for very specific use cases that are sufficiently common to justify exposing them as built-in client types. Any other custom use case seems to be addressed by using the HttpClient class and using the request and response streams it exposes.

michaelnisi commented 6 years ago

You are probably right. Of course, I see the elegance of keeping StringClient pure. Having to repeat some of its code in the custom pipeline may be an OK price for that. 😏

misterdjules commented 6 years ago

@michaelnisi Looking again at this, I agree it's unfortunate that consumers of Restify clients would need to rewrite most of the logic in StringClient/JsonClient when writing functional equivalents that stream data instead of buffering it.

I'm not sure passing an optional transform stream to StringClient would be the approach I would recommend though, but didn't have time to come up with a rationale for why that is or with an alternative that I prefer.

I'll keep thinking a bit more about this and come back with other proposals for us to consider if that works for you. In the meantime let's keep this discussion going and feel free to add comments or questions.

Thanks again for starting the discussion 👍

michaelnisi commented 6 years ago

I know. My idea is kind of tempting because it’s minimal—low hanging fruit. 🐍🍎

misterdjules commented 6 years ago

@michaelnisi Actually, I don't know when I'll be able to get back to exploring design alternatives, but somehow I feel like we need to explore them and present why we want to go with the one you proposed (or any other one).

Do you have time to write something up about various ways to design this feature with the pros and cons of each of them, and why you'd advocate for going with the one you suggested in this PR?

Some of the alternatives I can think of (list is not exhaustive, and this is just me thinking out loud so not all suggestions necessarily make sense):

What do you think?

michaelnisi commented 6 years ago

@misterdjules Great, let’s keep this ball rolling. I’ll analyse the options, probably leading to a minimally-invasive proposal, which is kind of my stand at the moment.