lwhay / asterixdb

Automatically exported from code.google.com/p/asterixdb

Long query results in HTTP 413 Full Head when passed through the REST APIs. #775

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
I think this should be an enhancement rather than a bug, but I don't know how 
to assign it to that category in Google Code.

What steps will reproduce the problem?
1. URL-encode the attached AQL query file and pass the resulting string to the 
/query endpoint of AsterixDB's RESTful API.

What is the expected output? What do you see instead?

The result should be something like:
{ "stateIndex": 1, "count": 1i64, "avgSemanticScores": 1.0d }
{ "stateIndex": 3, "count": 5i64, "avgSemanticScores": 2.0d }
......

But I got an HTTP 413 "Full Header" error because the query string is too long 
to fit into the HTTP request header.

What version of the product are you using? On what operating system?

AsterixDB built from "cs295/master_tweesearch" branch on Ubuntu 12.04 LTS.

Original issue reported on code.google.com by hencr...@gmail.com on 21 May 2014 at 8:05

GoogleCodeExporter commented 9 years ago
Try using a POST method instead of GET. POST will allow you to send the query 
as the body of the message, which can have an arbitrary length.

Original comment by zheilb...@gmail.com on 21 May 2014 at 8:07

GoogleCodeExporter commented 9 years ago
I think the email on the users list said POST was already tried (and using 
that method instead of GET seems to be one of the few ways to remedy something 
like this...). I'll mark this as an enhancement too while I'm at it.

Original comment by ima...@uci.edu on 21 May 2014 at 8:12

GoogleCodeExporter commented 9 years ago
After re-reading my code, I noticed there might have been some errors when I 
tried to use HTTP POST last time. I will work on that again and let you know 
the result.

Original comment by hencr...@gmail.com on 21 May 2014 at 8:15

GoogleCodeExporter commented 9 years ago
Oh, I missed the email. Yes, AsterixDB does support POST. Let us know how it 
goes.

Original comment by zheilb...@gmail.com on 21 May 2014 at 8:26

GoogleCodeExporter commented 9 years ago
I've finished testing. With the same "query=urlencodedQuery" payload, sending 
it through POST results in a server 500 error, while sending it through GET 
works. Are there other ways to solve this issue? I'm also not sure which email 
you're referring to. Thanks.

Original comment by hencr...@gmail.com on 21 May 2014 at 8:34

GoogleCodeExporter commented 9 years ago
Try sending the query as the entity 
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.1) of the request 
instead of URL encoding it in the URL, using the POST method.

This *should* fix your problem, but let me know if it doesn't.
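
As a rough illustration of the suggestion above, here is a minimal Python sketch that puts the raw AQL text in the POST body instead of the URL. The host and port are taken from the requests quoted elsewhere in this thread; the `text/plain` content type and the helper name `build_query_request` are assumptions made for illustration only.

```python
import urllib.request

# Host and port as used in the requests quoted in this thread (adjust as needed).
ASTERIX_QUERY_URL = "http://127.0.0.1:19002/query"


def build_query_request(aql, url=ASTERIX_QUERY_URL):
    """Build a POST request whose body is the raw AQL text.

    The query travels in the request body, so its length is not
    limited by the size of the request line or headers.
    """
    body = aql.encode("utf-8")  # no URL encoding: the body is sent as-is
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: a plain-text content type; not confirmed by this thread.
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


request = build_query_request(
    "use dataverse feeds; for $u in dataset usersDataSet return $u"
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running instance.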

Original comment by zheilb...@gmail.com on 21 May 2014 at 9:46

GoogleCodeExporter commented 9 years ago
Related question: In HTTP, data can be sent as an entity of a request OR 
encoded as part of the URL. Is there a convention for, or a relation between, 
HTTP methods and how data is sent?

I had previously assumed that PUT/POST accept data as entities while GET 
accepts data as URL-encoded parameters. (And, looking at some other REST APIs, 
this seems to be a common convention.) But, after reading the RFC, I found no 
specific mention of entities vs. parameters, which leads me to believe it 
should be defined by the designers of the API.

Thoughts, anyone?

Original comment by zheilb...@gmail.com on 21 May 2014 at 10:32

GoogleCodeExporter commented 9 years ago
PS: check out this tool for testing your REST calls: http://www.getpostman.com/

Original comment by zheilb...@gmail.com on 21 May 2014 at 10:43

GoogleCodeExporter commented 9 years ago
What should I set the Content-Type of the POST request to? What kind of data 
does AsterixDB accept?

Based on my experiments, when I use:
(1)
POST /query HTTP/1.1
Host: 127.0.0.1:19002
Accept: application/json; charset=utf-8
Cache-Control: no-cache

----WebKitFormBoundaryE19zNvXGzXaLvS5C
Content-Disposition: form-data; name="query"

use dataverse feeds for $u in dataset usersDataSet return $u
----WebKitFormBoundaryE19zNvXGzXaLvS5C
Content-Disposition: form-data; name="mode"

synchronous
----WebKitFormBoundaryE19zNvXGzXaLvS5C

Or (2)
POST /query HTTP/1.1
Host: 127.0.0.1:19002
Accept: application/json; charset=utf-8
Cache-Control: no-cache
Content-Type: application/x-www-form-urlencoded

query=use%2520dataverse%2520feeds%253B%250Afor%2520%2524u%2520in%2520dataset%2520usersDataSet%250Areturn%2520%2524u.termsTracked&mode=synchronous

I got a server 500 with the error message below, and even after dropping the 
"mode" parameter, I still got this:
{
    "error-code": [
        2,
        "SyntaxError:Encountered \" \"-\" \"- \"\" at line 1, column 2.\nWas expecting one of:\n    \"dataset\" ...\n    \"(\" ...\n    \"[\" ...\n    \"{\" ...\n    \"{{\" ...\n    <INTEGER_LITERAL> ...\n    \"null\" ...\n    \"true\" ...\n    \"false\" ...\n    <DOUBLE_LITERAL> ...\n    <FLOAT_LITERAL> ...\n    <STRING_LITERAL> ...\n    <VARIABLE> ...\n    <STRING_LITERAL> ...\n    \n==> ------WebKitFormBoundaryaJWOH3bmDlMoNL7b\r"
    ]
}

(3) However, when I use raw text:
POST /query HTTP/1.1
Host: 127.0.0.1:19002
Accept: application/json; charset=utf-8
Cache-Control: no-cache

use dataverse feeds; for $u in dataset usersDataSet return $u.termsTracked

This works but I can no longer specify the "mode" parameter and the returned 
data is not JSON but:

<h4>Results:</h4>
<pre>
{{ { "term": "GoogleGlass", "dateAdded": datetime("2014-04-01T10:10:35.000Z"), 
"belongsToWhichProductCategories": {{ "Electronics", "Clothing_Accessories" }} 
}, { "term": "IPad", "dateAdded": datetime("2014-04-06T10:10:35.000Z"), 
"belongsToWhichProductCategories": {{ "Electronics" }} }, { "term": "Microsoft 
Office", "dateAdded": datetime("2014-04-10T10:10:35.000Z"), 
"belongsToWhichProductCategories": {{ "Software" }} } }}
</pre>
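
A side observation on payload (2) above: the `%2520` sequences suggest the query was URL-encoded twice, since `%25` is itself an encoded `%`. The sketch below uses only Python's standard `urllib.parse` to illustrate this. The AQL string in it is reconstructed by decoding the payload twice, so treat it as an assumption about the original query. This may not be the only cause of the 500 error.

```python
from urllib.parse import quote, unquote

# Assumed original query, reconstructed by decoding payload (2) twice.
aql = "use dataverse feeds;\nfor $u in dataset usersDataSet\nreturn $u.termsTracked"

# Encoding once turns a space into %20 and a newline into %0A.
encoded_once = quote(aql, safe="")

# Encoding the already-encoded string turns every '%' into '%25',
# which is where sequences such as %2520 and %250A come from.
encoded_twice = quote(encoded_once, safe="")

# A server that decodes only once would see the percent escapes, not the query.
decoded_once = unquote(encoded_twice)

print(encoded_once)
print(encoded_twice)
```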

Original comment by hencr...@gmail.com on 21 May 2014 at 11:42

GoogleCodeExporter commented 9 years ago
There's currently a bug in our system: set the Content-Type header of your 
request to application/json if you want JSON results instead of HTML.

To change the mode, send your POST with the mode parameter set in the URL:
...query?mode=asynchronous

Let me know if that works.
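
For illustration, a minimal Python sketch of such a request: the raw AQL goes in the body, `mode` goes in the URL query string, and the Content-Type header is set to `application/json`. The host and port come from the requests quoted earlier in this thread; the helper name `build_mode_request` is hypothetical.

```python
import urllib.parse
import urllib.request


def build_mode_request(aql, mode="asynchronous",
                       base_url="http://127.0.0.1:19002/query"):
    """Build a POST request with the mode in the URL and raw AQL in the body."""
    # The mode parameter is passed in the query string, not in the body.
    url = base_url + "?" + urllib.parse.urlencode({"mode": mode})
    return urllib.request.Request(
        url,
        data=aql.encode("utf-8"),
        method="POST",
        # Per the comment above: application/json to get JSON instead of HTML.
        headers={"Content-Type": "application/json"},
    )


request = build_mode_request(
    "use dataverse feeds; for $u in dataset usersDataSet return $u.termsTracked"
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running instance.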

Original comment by zheilb...@gmail.com on 24 May 2014 at 8:45

GoogleCodeExporter commented 9 years ago
PS: The bug in question is issue #763, discovered by you! :)

Original comment by zheilb...@gmail.com on 24 May 2014 at 8:50