eBay / parallec

Fast Parallel Async HTTP/SSH/TCP/UDP/Ping Client Java Library. Aggregate 100,000 APIs & send anywhere in 20 lines of code. Ping/HTTP Calls 8000 servers in 12 seconds. (Akka) www.parallec.io
Apache License 2.0
805 stars 173 forks source link

Save Response Headers into the ResponseOnSingleTask #24

Closed jeffpeiyt closed 8 years ago

jeffpeiyt commented 8 years ago

The response headers are not commonly used. However this may be useful when checking content type or keep alive etc.

brunoribeiro commented 8 years ago

Is there any workaround to get the headers from the ResponseOnSingleTask?

jeffpeiyt commented 8 years ago

@brunoribeiro thanks so much for trying parallec.

The code changes should be straightforward, but I do not see a workaround directly.

How soon will you need this feature? Let me plan the dev work accordingly.

brunoribeiro commented 8 years ago

@jeffpeiyt Thank you for the quick reply. I would love to have it before November, as i'm evaluating parallec for a new feature to be release in December. Can you clarify me the needed code changes to make it work immediately?

jeffpeiyt commented 8 years ago

EDITED.

Sure. Roughly the following. I will see if I can get them by the end of this week.

  1. Add the similar code as follows into http worker. Then try to save the needed part of the original header map FluentCaseInsensitiveStringsMap into the ResponseOnSingeRequest . probably using Map<String, List<String>> to save the header map.
  2. Add a this header map of responseHeaders into ResponseOnSingeRequest and ResponseOnSingleTask
  3. in Task builder : add option .saveResponseHeader(Set keys, boolean getAll) ; default as to not to save the header
  4. Corresponding tests to maintain the coverage etc.

etc.

jeffpeiyt commented 8 years ago

@brunoribeiro One concern is that saving all the headers will take quite some spaces in memory. Are you looking for a specific pair in the header or want to check all of them?

brunoribeiro commented 8 years ago

@jeffpeiyt the one I am looking is specific to my server implementation, maybe we could pass a list of header keys to be returned if present, but this will surely make it slower as it need to check all them to find the matches.

jeffpeiyt commented 8 years ago

@brunoribeiro yes. I am thinking the same way. As long as the needed keys are provided, it is O(K) to get it out where K is the number of needed keys. Most time we may be just interested in 1 or 2 keys in the headers. (K is very small) So I wound not worry much. Also by default we do not fetch this; and this is done in parallel (http worker).

jeffpeiyt commented 8 years ago

Done features and basic test .saveResponseHeaders(new ResponseHeaderMeta(null, true))


    @Test
    public void hitWebsitesMinSyncWithAllResponses() {

        Map<String, Object> responseContext = new HashMap<String, Object>();
        pc
                .prepareHttpGet("/validateInternals.html")
                .setConcurrency(1700)
                .handleInWorker()
                .saveResponseHeaders(new ResponseHeaderMeta(null, true))
                .setTargetHostsFromString(
                        "www.parallec.io www.jeffpei.com www.restcommander.com")
                .execute(new ParallecResponseHandler() {

                    @Override
                    public void onCompleted(ResponseOnSingleTask res,
                            Map<String, Object> responseContext) {

                        Map<String, List<String>> responseHeaders = res.getResponseHeaders();

                        for(Entry<String, List<String>> entry: responseHeaders.entrySet()){

                            logger.info("response header: {} - {}", entry.getKey(), entry.getValue());
                        }
                        responseContext.put(res.getHost(), responseHeaders.size());
                        logger.debug(res.toString());

                    }
                });

        for (Object o : responseContext.values()) {
            int headerKeySize = Integer.parseInt((String) o);
            Asserts.check(headerKeySize > 0,
                    " Fail to extract http header");
        }
        //logger.info("Task Pretty Print: \n{}", task.prettyPrintInfo());
    }
14:01:33.525 [main] INFO  i.p.c.ParallelTaskBuilder - Executing task PT_3_20160929140133524_44c749f0-058 in SYNC mode...  
14:01:33.525 [Thread-1] INFO  i.p.c.t.ParallelTaskManager - Added task PT_3_20160929140133524_44c749f0-058 to the running inprogress map...
14:01:33.527 [Thread-1] INFO  i.p.c.t.ParallelTaskManager - !!STARTED sendAgentCommandToManager : PT_3_20160929140133524_44c749f0-058 at 2016-09-29 14:01:33.527-0700
14:01:33.531 [ParallecActorSystem-akka.actor.default-dispatcher-3] INFO  i.p.c.a.ExecutionManager - parallec task state : IN_PROGRESS
14:01:33.531 [ParallecActorSystem-akka.actor.default-dispatcher-3] INFO  i.p.c.a.ExecutionManager - Before Safety Check: total entry count: 3
14:01:33.531 [ParallecActorSystem-akka.actor.default-dispatcher-3] INFO  i.p.c.a.ExecutionManager - After Safety Check: total entry count in nodeDataMapValidSafe: 3
14:01:33.531 [ParallecActorSystem-akka.actor.default-dispatcher-3] INFO  i.p.c.a.ExecutionManager - !Obtain command request for target host meta id THM_3_20160929140133523_d3583d40-9dd  with count: 3
14:01:33.544 [ParallecActorSystem-akka.actor.default-dispatcher-4] INFO  i.p.c.a.AssistantExecutionManager - Now finished sending all needed messages. Done job of ASST Manager at 2016.09.29.14.01.33.543-0700
14:01:33.714 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-cache - [HIT]
14:01:33.714 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: server - [GitHub.com]
14:01:33.714 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: access-control-allow-origin - [*]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: connection - [keep-alive]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: last-modified - [Fri, 23 Sep 2016 23:56:04 GMT]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: date - [Thu, 29 Sep 2016 21:01:33 GMT]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: via - [1.1 varnish]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: accept-ranges - [bytes]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: cache-control - [max-age=600]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-served-by - [cache-den6024-DEN]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: vary - [Accept-Encoding]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: expires - [Thu, 29 Sep 2016 20:52:11 GMT]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: content-length - [620]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-cache-hits - [1]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-fastly-request-id - [489951baad8349dd30ea8707d4eaa5e10992bec1]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: age - [559]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-github-request-id - [C71B4E17:332B:9A3F9A0:57ED7CA2]
14:01:33.715 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: content-type - [text/html; charset=utf-8]
14:01:33.718 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.a.ExecutionManager - 
[1]__RESP_RECV_IN_MGR 1 (+2) / 3 (33.333%)  AFT 0.187 S @ www.parallec.io @ 2016.09.29.14.01.33.717-0700 , TaskID : 44c749f0-058 , CODE: 200 OK, RESP_BRIEF: <!DOCTYPE html>
<html><head><met 
14:01:33.830 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: server - [AmazonS3]
14:01:33.830 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: etag - ["7cc07a9153ea2e01e915fcfee0c921ba"]
14:01:33.830 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: last-modified - [Wed, 09 Apr 2014 06:34:54 GMT]
14:01:33.830 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: x-amz-request-id - [873E27207EE5B946]
14:01:33.830 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: content-length - [594]
14:01:33.831 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: x-amz-id-2 - [lYm0wcoroScpmYgnbl3+85knVVigMkHhzCzcCeVXabw4NJPJhBcGnLNUBlPzstGSMzGeqbopVkg=]
14:01:33.831 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: date - [Thu, 29 Sep 2016 21:01:34 GMT]
14:01:33.831 [ParallecActorSystem-akka.actor.default-dispatcher-6] INFO  i.p.c.TestBase - response header: content-type - [text/html]
14:01:33.832 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.a.ExecutionManager - 
[2]__RESP_RECV_IN_MGR 2 (+1) / 3 (66.667%)  AFT 0.301 S @ www.restcommander.com @ 2016.09.29.14.01.33.831-0700 , TaskID : 44c749f0-058 , CODE: 200 OK, RESP_BRIEF: <!DOCTYPE html>
<html>
<body>

14:01:34.891 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: server - [AmazonS3]
14:01:34.891 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: etag - ["538a4857516afb35dd416dacea1a4b1d"]
14:01:34.891 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: last-modified - [Wed, 09 Apr 2014 06:37:52 GMT]
14:01:34.891 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-amz-request-id - [6384E719DE34F682]
14:01:34.891 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: content-length - [725]
14:01:34.891 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: x-amz-id-2 - [KdWONL7fZ8u1yHtx10d47sO3VendT5RSy9XrlslEGcsyESlZDUcL29+N32u1fKgHVyfYTS2qdaA=]
14:01:34.892 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: date - [Thu, 29 Sep 2016 21:01:34 GMT]
14:01:34.892 [ParallecActorSystem-akka.actor.default-dispatcher-2] INFO  i.p.c.TestBase - response header: content-type - [text/html]
14:01:34.892 [ParallecActorSystem-akka.actor.default-dispatcher-7] INFO  i.p.c.a.ExecutionManager - 
[3]__RESP_RECV_IN_MGR 3 (+0) / 3 (100.00%)  AFT 1.362 S @ www.jeffpei.com @ 2016.09.29.14.01.34.892-0700 , TaskID : 44c749f0-058 , CODE: 200 OK, RESP_BRIEF: <!DOCTYPE html>
<!-- saved from  
14:01:34.893 [ParallecActorSystem-akka.actor.default-dispatcher-7] INFO  i.p.c.a.ExecutionManager - task.state : COMPLETED_WITHOUT_ERROR
14:01:34.893 [ParallecActorSystem-akka.actor.default-dispatcher-7] INFO  i.p.c.a.ExecutionManager - task.totalJobNumActual : 3 InitCount: 3
14:01:34.893 [ParallecActorSystem-akka.actor.default-dispatcher-7] INFO  i.p.c.a.ExecutionManager - task.response received Num 3 
14:01:34.893 [ParallecActorSystem-akka.actor.default-dispatcher-7] INFO  i.p.c.a.ExecutionManager - SUCCESSFUL GOT ON ALL RESPONSES: Received all the expected messages. Count matches: 3 at time: 2016.09.29.14.01.34.893-0700
14:01:34.895 [ParallecActorSystem-akka.actor.default-dispatcher-7] INFO  i.p.c.a.ExecutionManager - 
Time taken to get all responses back : 1.365 secs
jeffpeiyt commented 8 years ago

@brunoribeiro ok, a very simple way for the case insensitiveness is to save with all with lower cases. The logs have been updated as above. Are you good with this?

responseHeaders.put(key.toLowerCase(Locale.ROOT),
                                    response.getHeaders().get(key));
brunoribeiro commented 8 years ago

perfect, thank you so much.

On Thu, Sep 29, 2016 at 10:03 PM, Yuanteng (Jeff) Pei < notifications@github.com> wrote:

@brunoribeiro https://github.com/brunoribeiro ok, a very simple way for the case insensitiveness is to save with all with lower cases. The logs have been updated as above. Are you good with this?

responseHeaders.put(key.toLowerCase(Locale.ROOT), response.getHeaders().get(key));

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/eBay/parallec/issues/24#issuecomment-250590715, or mute the thread https://github.com/notifications/unsubscribe-auth/ADAwe0xDBmbBgzQNtNp9XTDKA828eirOks5qvCesgaJpZM4G2sa- .

Bruno Ribeiro

jeffpeiyt commented 8 years ago

@brunoribeiro thank you for the confirmation. Will add more tests cases / docs and update here after a release

jeffpeiyt commented 8 years ago

@brunoribeiro released in version 0.10.1-beta; please let me know for any more questions.

example: https://github.com/eBay/parallec/blob/master/src/test/java/io/parallec/core/main/http/ParallelClientHttpResponseHeaderTest.java