taoensso / faraday

Amazon DynamoDB client for Clojure
https://www.taoensso.com/faraday
Eclipse Public License 1.0
238 stars 84 forks source link

Count (and other return values) lost on call to query/scan #91

Closed ricardojmendez closed 4 years ago

ricardojmendez commented 8 years ago

While we do get an element count back from QueryResult, we are discarding this information on the call to merge-more, and query is returning only the list of items. We don't currently have a way to query and return all information.

Any thoughts on how to best handle this? It would be a major breaking change, so perhaps the best way is to have a separate query-raw that does not receive span-reqs and returns the value straight from as-map.

As a side note, it appears that merge-more also discards other information that users could ask for on query, such as the consumed capacity (there's even a documented parameter for that on query).

ptaoussanis commented 8 years ago

Hi Ricardo,

To clarify: where do you see a breaking change being necessary? Is the issue that you'd like to return both the items and count information simultaneously?

What if scan and query returned items as usual but with metadata attached that includes the additional info that's available (like the count)? I.e. just mod merge-more to add metadata merging (if it doesn't do this, can't recall); then possibly mod query/scan to fix the broken options like :return :count, etc.

Let me know if you'd like me to take a proper look and I'll try schedule some time this coming week.

Cheers :-)

ricardojmendez commented 8 years ago

Hi Peter,

Yes, I'm looking into getting both items and count, as well as any other data we get like the scanned item count. I hadn't considered the metadata approach, and you're right - that would solve the need for a breaking change.

I'm now wondering if it's worth making this change for my use case, though, since it seems like the functionality I need isn't really provided by query/scan at all. I'd like the limit the number of rows returned, and get a total count of rows, but :limit on DynamoDB acts slightly differently limiting only the number of indexes looked at.

QueryRequest/ScanRequest do not seem to have support for actually limiting the result size, which I've so far found on the newer document API. Without being able to limit the results, I'm not sure how much use there is to the count/scanned-count.

It would however fix the broken options.

ptaoussanis commented 8 years ago

QueryRequest/ScanRequest do not seem to have support for actually limiting the result size, which I've so far found on the newer document API.

Hi Ricardo, could you possibly clarify this line: you mean that QueryRequest and ScanRequest don't offer the facility you're looking for but some other new API exists which does?

Have spent zero time using (or keeping up with) DDB in a long while; is there a reason we couldn't / wouldn't want to support the new document API? Or am I misunderstanding?

ricardojmendez commented 8 years ago

Hey Peter,

Consider this a WIP/half-uninformed reply, as I've just started looking into the new API and need to run further tests.

Hi Ricardo, could you possibly clarify this line: you mean that QueryRequest and ScanRequest don't offer the facility you're looking for but some other new API exists which does?

The :limit option on QueryRequest and ScanRequest limits the number of indexes looked at. The :filter-expr is then applied over those.

Suppose we have 30 items on a table, and ask it to limit the scan to 10. We also send a filter expression that would eliminate the first five (because of non-index reasons). We'll only get five results back, since the scan only retrieved the first ten items and then filtered them.

From what I've read (and on some basic initial tests), the new ScanSpec/QuerySpec withMaxResultSize would return the expected matching 10 results. Need to test further, though, going to create an artificial case that reproduces it.

Have spent zero time using (or keeping up with) DDB in a long while; is there a reason we couldn't / wouldn't want to support the new document API? Or am I misunderstanding?

No particular reason. It's just not something we can easily tack on to the existing query/scan - they're different classes on different namespaces with different expectations.

ptaoussanis commented 8 years ago

Gotcha, thanks for the clarification. Will leave it to your best judgement re: if/when/how to add support for the ScanSpec/QuerySpec API.

All the best for the new year btw!

ricardojmendez commented 8 years ago

All the best for the new year btw!

Thanks, and likewise!

kipz commented 4 years ago

@ricardojmendez faraday is (now) returning metadata on the results from query/scan containing count, last-prim-kvs etc. Are we OK to close? That is not to say we shouldn't look at the Document API as per #103 of course, but I'm not sure we need it to get this data back right now unless I misunderstood the issue?

ricardojmendez commented 4 years ago

Hey @kipz - this issue is about 4 years old and I haven't used Faraday in a while, so feel free to de-prioritize or close. Cheers!

kipz commented 4 years ago

Thank you.

davinun commented 3 years ago

I see that the issue is closed and probably a long overdue, but scan actually does return the metadata. We had to hack into Farady code in order to find out that metadata is attached to the list of items, the scan method returns. In order to get it, one should do the following:

        new-response (db-lib/scan @client table opts)
        response-meta (meta new-response)
        last-prim-kvs (:last-prim-kvs response-meta)
        count (:count response-meta)
        scanned-count (:scanned-count response-meta)