Closed john-bodley closed 5 years ago
@john-bodley note that cursor.description
should be None
in this case:
This attribute will be None for operations that do not return rows or if the cursor has not had an operation invoked via the .execute*() method yet. (https://www.python.org/dev/peps/pep-0249/#description)
Maybe we're missing something in the SQLAlchemy dialect?
@john-bodley note too that .fetchall()
should return an exception when no rows are returned:
An Error (or subclass) exception is raised if the previous call to .execute*() did not produce any result set or no call was issued yet. (https://www.python.org/dev/peps/pep-0249/#fetchall)
So I think this is behaving as expected.
@betodealmeida the fetchall
isn’t throwing the exception, but instead the underlying cursor is None
because it’s prematurely closed.
@john-bodley but my point is, you should always wrap your .fetchall()
in a try/except
block, in case zero rows were returned.
@betodealmeida note's it's the ResultProxy fetchall
method that I'm calling which wraps the underlying DBAPI and has slightly different behavior (per the documentation).
Fetch all rows, just like DB-API cursor.fetchall().
After all rows have been exhausted, the underlying DBAPI cursor resource is released, and the object may be safely discarded.
Subsequent calls to ResultProxy.fetchall() will return an empty list. After the ResultProxy.close() method is called, the method will raise ResourceClosedError.
@betodealmeida I created https://github.com/sqlalchemy/sqlalchemy/issues/4523 to discuss how one should handle an empty result set associated with a SELECT statement when no description is available.
It seems pydruid
should provide a description even if no rows are returned, the issue is it may be non-trivial to determine the column names to add to the description given the lack of metadata provided by the Druid SQL REST API.
@betodealmeida it seems that https://github.com/apache/incubator-druid/pull/6191 will help remedy the solution as the SQL interface provides a mechanism for returning the column names via the X-Druid-Column-Names
header which will be included in version 0.13.0
(currently incubating).
@john-bodley : I am also impacted by this issue. I am currently using Druid 0.13.0. But still I am not receiving X-Druid-Column-Names
header in my response. Any druid config file need to be updated for this ?
As a part of this bug (https://github.com/apache/incubator-druid/issues/6409,) , it looks like they moved the column info to response body which can be seen by setting "header" : true
in the request, which will never return empty response.
The below query will return 1 row with all columns Query
{
"query":"SELECT delta, page, added, deleted FROM wikipedia WHERE 1=0 ",
"resultFormat" : "object",
"header" : true
}
Result
[
{
"delta": null,
"page": null,
"added": null,
"deleted": null
}
]
More: Section "Response" under http://druid.io/docs/latest/querying/sql.html
@Makesh-Gmak I can look into providing a fix for this.
@betodealmeida what are your thoughts on how to best proceed? It seems likes including the header
in the request for older versions doesn't break things, however one cannot differentiate between the responses whether the header is present, i.e., it merely inserts a new record at the beginning of the result set, .e.g.,
[
{
"delta": null,
"page": null,
"added": null,
"deleted": null
},
{
"delta": 36,
"page": "Talk:Oswald Tilghman",
"added": 36,
"deleted": 0
}
]
Checking whether the first records has all NULLs as values isn't a robust method as it's definitely plausible to have a result set which includes NULLs. I was thinking the only solutions include:
/status
API call to determine the version of the Druid broker prior to executing each query.Note (2) is more performant than (1) however it requires additional metadata.
Thanks for the resolution. When will be the next release ?
SQLAlchemy throws an
AttributeError
when fetching from a ResultProxy when the underlying Druid query results in an empty result set, e.g. no rows, as the proxy is prematurely closed, i.e.,The reason the ResultProxy is closed is because
cursor.description
isNone
. This results inresult.cursor
beingNone
which is why the exception is thrown.It seems that other dialects have a description irrespective of whether there are rows and hence don't suffer from the same issue. Note I'm uncertain whether there's even a viable solution as the Druid SQL REST API doesn't provide this information for an empty result set,
Note I know one can circumvent this issue by using a raw connection, however this example illustrates the behavior that Pandas uses for reading SQL.
to: @betodealmeida @mistercrunch