Open cardonator opened 1 year ago
Hi!
How does your code that calls gocql look like? What do you see in the results with gocql instead of the expected result? "The data is not returned" is not very specific. I need more information to be able to help you.
Gocql can be built with gocql_debug tag for more verbose logs, however it's possible that will not provide enough information. Can you provide packet capture showing this query execution with gocql (if the results of the query are not sensitive)? Does framer.parseResultRows
see the columns in the result metadata? You could also try running a reproducer program under the debugger.
From what I see the problem is linked to the "direct" JSON response that DSE returns. The reponse doesn't match any colums.
We get a crash when we run solr queries (panic is the same as https://github.com/gocql/gocql/issues/1460):
panic: not enough bytes in buffer to read int require 4 got: 0 [recovered]
panic: not enough bytes in buffer to read int require 4 got: 0
I've attached the output of the debugger
The full buffer is:
"\x00\x00\x00\x82{\"accesslocation\":{\"gridLevel\":2,\"columns\":32,\"rows\":32,\"minX\":-180.0,\"maxX\":180.0,\"minY\":-90.0,\"maxY\":90.0,\"counts_ints2D\":null}}"
if there is a way to get the response from that buffer, that would work.
A little more context, the first column readInt works and the buffer contains the value, it's the second column read that causes the crash (cols
shows 23 columns total which are indeed the 23 columns from the table, but the solr query returns a single column).
A workaround that works for us, specify any column of the text data type (we use the solr_query one select "solr_query" from ...
) and the query will succeed and be processed by gocql.
Thanks! @remiphilippe could you please provide the the Go code that calls gocql in the reproducer? Having a dump of the whole PREPARE/EXECUTE exchange from the protocol would help as well, as it is not clear to me how the result metadata looks like in this case and I don't have access to DSE to try myself.
In any case it seems that the error handling in frame parsing should be more robust, it definitely should not panic, just return an error. However, we also need to find out why gocql expects int while there is JSON blob.
cc @joao-r-reis is this issue something that Datastax would be willing to help with?
@martin-sucha sure can do that, are you looking for outputs from delve or specific dumps? If you let me know what to dump I'll update my test code.
We can't look into this at this exact moment but I'm going to add this to my list and see if I can find some free time in the near future to investigate and potentially fix the issue.
@remiphilippe pcap dump from Wireshark/tcpdump with the network communication would help. Please make sure there isn't any secret included, like the database password or sensitive contents of tables.
Specifically, gocql will send a PREPARE message to the server when first executing the query and then EXECUTE message to execute the prepared statement. The RESULT of the prepare message will contain description of the columns and the RESULT from the execute call will contain the values.
With the captured network packets, it should be possible to:
Now that Apache owns this library, and DataStax is a major contributor to the open source project, is there anyone at DataStax that could help with this issue?
Background
GoCQL will prepare the statement and reuse result metadata for RESULT
message parsing. When we prepare CQL like:
SELECT * FROM my_table WHERE solr_query='{"q": "id:*", "facet":{"pivot":"description", "limit":"10"}, "useFieldCache":true}'
C* would return as many columns as my_table
has for PREPARE
request (e.g. 3). When we execute SOLR query, there is only 1 column return and this causes issue.
Workaround
When executing query use .NoSkipMetadata()
.
Proposed Solution
What do you think about adding ScanSolr
similar to ScanCAS
that could handle result metadata correctly? I have mixed feelings the reason is strong enough to have new scan function.
Thanks @lukasz-antoniak for the investigation. Since this is related to gocql not prioritizing the metadata of the response over the prepared metadata this will be fixed by #1822
What version of Cassandra are you using?
DataStax Enterprise 5.1 / Cassandra 3.11
What version of Gocql are you using?
1.3.2
What version of Go are you using?
1.20.2
What did you do?
I am using the Search product in DSE to run solr queries for BI purposes. gocql works great for this the vast majority of the time because the queries return results that look just like the table schema, however there is a style of solr query with a pivot/facet where the response fields can differ from the table fields and gocql does not seem to be able to handle this while other Cassandra connectors can.
What did you expect to see?
For example, if I run a query like this:
If I run this from cqlsh or from a tool we use called SQLPad, I get back two columns in the response,
facet_ranges
andfacet_pivot
. These are json fields that have various faceted statistics in them. I tried using the dynamic fields functionality but even those weren't able to see the data returned from these fields in any way that I could parse.What did you see instead?
The data is not returned in the gocql files in any way I can find it.
What I'm looking for is if there is something I should try, some kind of debug output I should enable, or anything else I could do that would help get support for this type of response working in gocql. I realize the maintainers don't really have access to or use DSE, but it would be very helpful for BI use cases.