Open varnit opened 9 years ago
Hi Varnit
Unfortunately no, but it is easy enough to do. I will look into it this weekend, as I will be refactoring ibrowse a bit to integrate other pull requests.
On 17 Dec 2014, at 23:09, Varnit notifications@github.com wrote:
Hi, Is it possible to set the max sessions for a domain and have it work across all its subdomains? For example if I set the following:
ibrowse:set_max_sessions("hotmail.com", 443, 100)
I would want a maximum of 100 connections for hotmail.com and all its subdomains (m.hotmail.com, bay01.hotmail.com etc)
Is this possible today?
OK, thanks! Let me know if you need help with anything.
@cmullaparthi I assume this has not been done in that weekend?
I'm afraid not :-) I take it this is important for you?
@cmullaparthi Kind of important, forgive my poor English, let me tell a story.
I got a bunch of urls from my boss like this:
http://www.example0.com/foo/bar
http://test.example0.com/foo/bar
http://foo.example0.com/foo/bar
http://bar.example0.com/foo/bar
http://www.example1.com/foo/bar
http://test.example2.com/foo/bar
http://foo.example3.com/foo/bar
http://bar.example1.com/foo/bar
...
Then I got a configuration file from my boss like this:
example0.com --> concurrent: 1
bar.example1.com --> concurrent: 2
bar.example2.com --> concurrent: 3
Then when I request the urls above, I need to limit their concurrency by the configuration above.
And in my boss's opinion, example0.com in the configuration file of course means *.example0.com as well as example0.com itself.
And I can't tell my boss that ibrowse does not have that kind of configuration, so I have to handle this in my application.
The other thing is that the urls my boss gives me change dynamically. So I can't tell my boss: "Give me all your urls, and let me generate an appropriate configuration file for you." I think my boss would reply: "No, programmer, I won't. I'll add urls to the list whenever I want. This is easy, handle it."
So, once my program has started running, my boss may come to my desk and give me another url, saying: "Add it to the list", and then I will do as my boss just said.
For now, here is my solution:
- A url such as http://test.example0.com/foo/bar needs to be handled.
- Take its host: test.example0.com
- Look up test.example0.com within the configuration file, using ends_with.
- The matching entry is: example0.com --> concurrent: 1
- Call :ibrowse.set_max_sessions("test.example0.com", 80, 1)
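The matching step above could be sketched roughly as follows in Erlang. This is only an illustration, not code from the thread or from ibrowse: the module name domain_limits, the function lookup_limit/2 and the shape of the configuration list are all assumptions. It picks the longest configured suffix, and it requires a dot before the suffix so that a host such as notexample0.com does not match example0.com.

```erlang
-module(domain_limits).
-export([lookup_limit/2]).

%% Config is assumed to be a list of {DomainSuffix, MaxConcurrent} pairs,
%% e.g. [{"example0.com", 1}, {"bar.example1.com", 2}] (hypothetical shape).
%% Returns {ok, Suffix, Max} for the longest matching entry, or none.
lookup_limit(Host, Config) ->
    Matches = [{length(Suffix), Suffix, Max}
               || {Suffix, Max} <- Config, matches(Host, Suffix)],
    case lists:reverse(lists:sort(Matches)) of
        [{_Len, Suffix, Max} | _] -> {ok, Suffix, Max};
        []                        -> none
    end.

%% A host matches when it equals the suffix or ends with "." ++ Suffix.
matches(Host, Suffix) ->
    Host =:= Suffix orelse lists:suffix("." ++ Suffix, Host).
```

The Max value returned here is what would then be passed to set_max_sessions for the host, as in the last step of the list.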
Then if I got any url like:
http://test.example0.com/foo/bar1
http://test.example0.com/foo/bar2
http://test.example0.com/foo/bar3
http://test.example0.com/foo/bar4
the steps above will be processed again for each of them, because I am lazy and didn't write code to store the configurations and check whether the domain has already been configured.
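Remembering which hosts have already been configured could look something like the sketch below. It is an assumption-laden illustration, not part of ibrowse: the module limit_cache, its table name and the SetFun callback are hypothetical. It relies on ets:insert_new/2, which returns false when the key is already present, so the callback runs at most once per key.

```erlang
-module(limit_cache).
-export([init/0, ensure_configured/3]).

%% Create the (hypothetical) table that records configured {Host, Port} keys.
init() ->
    ets:new(limit_cache, [named_table, public, set]).

%% Calls SetFun(Key, Max) only the first time Key is seen.
%% SetFun is supplied by the caller, for example a fun that calls
%% ibrowse:set_max_sessions(Host, Port, Max) for Key = {Host, Port}.
ensure_configured(Key, Max, SetFun) ->
    case ets:insert_new(limit_cache, {Key, Max}) of
        true ->
            SetFun(Key, Max),
            configured;
        false ->
            already_configured
    end.
```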
Well, end of story.
I'm not quite sure if it is the right solution, but it seems to be working.
BUT: I would love to remove the code I have written to match subdomains immediately, if ibrowse had this feature.
I loved this story :-)
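There are a couple of complications with this:
- One or more of your subdomains may be unreachable because there are lots of requests to another subdomain
- Load balancing will be a more expensive operation because it has to make sure that the limit is enforced while routing requests correctly to each subdomain.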
Are you happy with both these limitations? If so I will go ahead and implement it.
@cmullaparthi Thanks for your reply ~
"One or more of your subdomains may be unreachable because there are lots of requests to another subdomain"
- If the unreachable state is caused by the server's bandwidth or capacity, then it's fine, since we limit max_sessions on the root domain for a reason.
- If the unreachable state is caused by the retry_later message from ibrowse, then it is also reasonable; it is exactly what we want.
"Load balancing will be a more expensive operation because it has to make sure that the limit is enforced while routing requests correctly to each subdomain."
Expensive is a relative word.
Yesterday I refactored my code to get a better limiting feature. I now use poolboy to set up an ibrowse pool for every root domain. Every time I get a url, I check whether a pool for the root domain of this url exists. If it exists, I use that pool; otherwise I create a new pool for this root domain.
If what you are going to implement is not more expensive than my approach, I think it is worth a try.
Thank you.
Okay, good. No, the solution will be cheaper than using an external pooling mechanism. I'll create a branch with the proposed changes so you can try.
@cmullaparthi Thanks, you are so nice!
I've pushed some changes to the issue_124 branch. See 3fc7e78aad6ab4b882da4268d17871d1fbc1cc5f
Usage:
$ erl -pa ebin
Erlang/OTP 18 [erts-7.3] [source] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V7.3 (abort with ^G)
1> application:ensure_all_started(ibrowse).
{ok,[ibrowse]}
2>
f(),
ibrowse:set_max_sessions("google.com", 80, 1), %% Set the LB config for the root domain
Res_1 = ibrowse:send_req("http://www.google.com", [], get, [],
[{use_subdomain_lb_config, {"google.com", 80}}]), %% New option
io:format("Res_1: ~p~n", [Res_1]),
ibrowse:show_dest_status(),
Res_2 = ibrowse:send_req("http://m.google.com", [], get, [],
[{use_subdomain_lb_config, {"google.com", 80}}]), %% New option
io:format("Res_2: ~p~n", [Res_2]),
ibrowse:show_dest_status().
Res_1: {ok,"302",
[{"Cache-Control","private"},
{"Content-Type","text/html; charset=UTF-8"},
{"Location",
"http://www.google.co.uk/?gfe_rd=cr&ei=GBZpV-W9IYHS8AeEya-oAg"},
{"Content-Length","261"},
{"Date","Tue, 21 Jun 2016 10:25:28 GMT"}],
"<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<TITLE>302 Moved</TITLE></HEAD><BODY>\n<H1>302 Moved</H1>\nThe document has moved\n<A HREF=\"http://www.google.co.uk/?gfe_rd=cr&ei=GBZpV-W9IYHS8AeEya-oAg\">here</A>.\r\n</BODY></HTML>\r\n"}
Server:port | ETS | Num conns | LB Pid
================================================================================
www.google.com:80 | 20500 | 1 | <0.41.0>
google.com:80 | 16403 | 0 | <0.41.0>
Res_2: {error,retry_later}
Server:port | ETS | Num conns | LB Pid
================================================================================
www.google.com:80 | 20500 | 1 | <0.41.0>
google.com:80 | 16403 | 0 | <0.41.0>
m.google.com:80 | 32791 | 0 | <0.41.0>
The same test succeeds if you set max_sessions to 2.
$ erl -pa ebin
Erlang/OTP 18 [erts-7.3] [source] [64-bit] [async-threads:10] [hipe] [kernel-poll:false]
Eshell V7.3 (abort with ^G)
1> application:ensure_all_started(ibrowse).
{ok,[ibrowse]}
2>
f(),
ibrowse:set_max_sessions("google.com", 80, 2),
Res_1 = ibrowse:send_req("http://www.google.com", [], get, [],
[{use_subdomain_lb_config, {"google.com", 80}}]), %% New option
io:format("Res_1: ~p~n", [Res_1]),
ibrowse:show_dest_status(),
Res_2 = ibrowse:send_req("http://m.google.com", [], get, [],
[{use_subdomain_lb_config, {"google.com", 80}}]), %% New option
io:format("Res_2: ~p~n", [Res_2]),
ibrowse:show_dest_status().
Res_1: {ok,"302",
[{"Cache-Control","private"},
{"Content-Type","text/html; charset=UTF-8"},
{"Location",
"http://www.google.co.uk/?gfe_rd=cr&ei=dBlpV-mXDpPS8AfI1IFY"},
{"Content-Length","259"},
{"Date","Tue, 21 Jun 2016 10:39:48 GMT"}],
"<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<TITLE>302 Moved</TITLE></HEAD><BODY>\n<H1>302 Moved</H1>\nThe document has moved\n<A HREF=\"http://www.google.co.uk/?gfe_rd=cr&ei=dBlpV-mXDpPS8AfI1IFY\">here</A>.\r\n</BODY></HTML>\r\n"}
Server:port | ETS | Num conns | LB Pid
================================================================================
www.google.com:80 | 20500 | 1 | <0.41.0>
google.com:80 | 16403 | 0 | <0.41.0>
Res_2: {ok,"302",
[{"Location","http://www.google.com/mobile/other/"},
{"Cache-Control","private"},
{"Content-Type","text/html; charset=UTF-8"},
{"X-Content-Type-Options","nosniff"},
{"Date","Tue, 21 Jun 2016 10:39:48 GMT"},
{"Server","sffe"},
{"Content-Length","232"},
{"X-XSS-Protection","1; mode=block"}],
"<HTML><HEAD><meta http-equiv=\"content-type\" content=\"text/html;charset=utf-8\">\n<TITLE>302 Moved</TITLE></HEAD><BODY>\n<H1>302 Moved</H1>\nThe document has moved\n<A HREF=\"http://www.google.com/mobile/other/\">here</A>.\r\n</BODY></HTML>\r\n"}
Server:port | ETS | Num conns | LB Pid
================================================================================
www.google.com:80 | 20500 | 1 | <0.41.0>
google.com:80 | 16403 | 0 | <0.41.0>
m.google.com:80 | 32791 | 1 | <0.41.0>
@cmullaparthi Awesome! Trying...
When I use this feature, it seems... well, a little tricky?
- The configuration says: "example.com" -> 2
- I get a url: http://test.example.com
- I have to work out the root domain of http://test.example.com, which is example.com
- Then I call:
ibrowse:send_req("http://test.example.com", [], get, [],
[{use_subdomain_lb_config, {"example.com", 80}}])
Suddenly I realized something. My boss said: "The server example.com is weak, we won't send more than 2 requests at the same time".
When my boss said this, the meaning seemed to include: "I don't know what a port means, and I don't care what 443 or 80 or even 8080 mean. They are just webpages, go get them, no more than 2 requests at the same time".
At this point, I think maybe it's better to handle these demands in my application instead of in ibrowse. What do you think? @cmullaparthi
Yeah, it's not particularly elegant. But I feel that is the nature of the problem. If you know that you are always going to shape traffic by the 1st level subdomain, your code, I suppose, could be simpler using this feature?
%% Needs the #url{} record: -include_lib("ibrowse/include/ibrowse.hrl").
invoke_ibrowse(Url, Headers, Payload, Method, Options) ->
    #url{host = Host, port = Port} = ibrowse_lib:parse_url(Url),
    Host_tokens = string:tokens(Host, "."),
    %% Keep the last two labels, e.g. "test.example.com" -> "example.com"
    LB_shaping_domain = string:join(lists:nthtail(max(0, length(Host_tokens) - 2), Host_tokens), "."),
    ibrowse:send_req(Url, Headers, Method, Payload, [{use_subdomain_lb_config, {LB_shaping_domain, Port}} | Options]).
I suppose the above is more bearable than having to maintain your own pooling mechanism?