apache / cassandra-gocql-driver

GoCQL Driver for Apache Cassandra®
https://cassandra.apache.org/
Apache License 2.0
2.58k stars 622 forks source link

CASSGO-30 DSE Search Results May Not Be Queryable #1689

Open cardonator opened 1 year ago

cardonator commented 1 year ago

What version of Cassandra are you using?

DataStax Enterprise 5.1 / Cassandra 3.11

What version of Gocql are you using?

1.3.2

What version of Go are you using?

1.20.2

What did you do?

I am using the Search product in DSE to run solr queries for BI purposes. gocql works great for this the vast majority of the time because the queries return results that look just like the table schema, however there is a style of solr query with a pivot/facet where the response fields can differ from the table fields and gocql does not seem to be able to handle this while other Cassandra connectors can.

What did you expect to see?

For example, if I run a query like this:

select * from keyspace.mytable where solr_query = '{"q": "*:*", "fq":["account_id:(1 2 3 4)", "created:[2020-01-01T00:00:00Z TO 2020-12-31T23:59:59Z]"], "facet":{"pivot":"profile_name", "range":"created", "f.created.range.start":"2020-01-01T00:00:00Z", "f.created.range.end":"2020-12-31T23:59:59Z", "f.created.range.gap":"+1MONTH"}}';

If I run this from cqlsh or from a tool we use called SQLPad, I get back two columns in the response, facet_ranges and facet_pivot. These are json fields that have various faceted statistics in them. I tried using the dynamic fields functionality but even those weren't able to see the data returned from these fields in any way that I could parse.

What did you see instead?

The data is not returned in the gocql files in any way I can find it.

What I'm looking for is if there is something I should try, some kind of debug output I should enable, or anything else I could do that would help get support for this type of response working in gocql. I realize the maintainers don't really have access to or use DSE, but it would be very helpful for BI use cases.

martin-sucha commented 1 year ago

Hi!

How does your code that calls gocql look like? What do you see in the results with gocql instead of the expected result? "The data is not returned" is not very specific. I need more information to be able to help you.

Gocql can be built with gocql_debug tag for more verbose logs, however it's possible that will not provide enough information. Can you provide packet capture showing this query execution with gocql (if the results of the query are not sensitive)? Does framer.parseResultRows see the columns in the result metadata? You could also try running a reproducer program under the debugger.

remiphilippe commented 1 year ago

From what I see the problem is linked to the "direct" JSON response that DSE returns. The reponse doesn't match any colums.

We get a crash when we run solr queries (panic is the same as https://github.com/gocql/gocql/issues/1460):

panic: not enough bytes in buffer to read int require 4 got: 0 [recovered]
    panic: not enough bytes in buffer to read int require 4 got: 0

I've attached the output of the debugger

image

The full buffer is:

"\x00\x00\x00\x82{\"accesslocation\":{\"gridLevel\":2,\"columns\":32,\"rows\":32,\"minX\":-180.0,\"maxX\":180.0,\"minY\":-90.0,\"maxY\":90.0,\"counts_ints2D\":null}}"

if there is a way to get the response from that buffer, that would work.

remiphilippe commented 1 year ago

A little more context, the first column readInt works and the buffer contains the value, it's the second column read that causes the crash (cols shows 23 columns total which are indeed the 23 columns from the table, but the solr query returns a single column).

A workaround that works for us, specify any column of the text data type (we use the solr_query one select "solr_query" from ...) and the query will succeed and be processed by gocql.

martin-sucha commented 1 year ago

Thanks! @remiphilippe could you please provide the the Go code that calls gocql in the reproducer? Having a dump of the whole PREPARE/EXECUTE exchange from the protocol would help as well, as it is not clear to me how the result metadata looks like in this case and I don't have access to DSE to try myself.

In any case it seems that the error handling in frame parsing should be more robust, it definitely should not panic, just return an error. However, we also need to find out why gocql expects int while there is JSON blob.

cc @joao-r-reis is this issue something that Datastax would be willing to help with?

remiphilippe commented 1 year ago

@martin-sucha sure can do that, are you looking for outputs from delve or specific dumps? If you let me know what to dump I'll update my test code.

joao-r-reis commented 1 year ago

We can't look into this at this exact moment but I'm going to add this to my list and see if I can find some free time in the near future to investigate and potentially fix the issue.

martin-sucha commented 1 year ago

@remiphilippe pcap dump from Wireshark/tcpdump with the network communication would help. Please make sure there isn't any secret included, like the database password or sensitive contents of tables.

Specifically, gocql will send a PREPARE message to the server when first executing the query and then EXECUTE message to execute the prepared statement. The RESULT of the prepare message will contain description of the columns and the RESULT from the execute call will contain the values.

With the captured network packets, it should be possible to:

cardonator commented 3 months ago

Now that Apache owns this library, and DataStax is a major contributor to the open source project, is there anyone at DataStax that could help with this issue?

lukasz-antoniak commented 6 days ago

Background GoCQL will prepare the statement and reuse result metadata for RESULT message parsing. When we prepare CQL like: SELECT * FROM my_table WHERE solr_query='{"q": "id:*", "facet":{"pivot":"description", "limit":"10"}, "useFieldCache":true}' C* would return as many columns as my_table has for PREPARE request (e.g. 3). When we execute SOLR query, there is only 1 column return and this causes issue.

Workaround When executing query use .NoSkipMetadata().

Proposed Solution What do you think about adding ScanSolr similar to ScanCAS that could handle result metadata correctly? I have mixed feelings the reason is strong enough to have new scan function.

joao-r-reis commented 6 days ago

Thanks @lukasz-antoniak for the investigation. Since this is related to gocql not prioritizing the metadata of the response over the prepared metadata this will be fixed by #1822