bchavez / RethinkDb.Driver

:headphones: A NoSQL C#/.NET RethinkDB database driver with 100% ReQL API coverage.
http://rethinkdb.com/api/java
Other
384 stars 134 forks source link

RunGrouping with Count error #86

Closed stefanprodan closed 8 years ago

stefanprodan commented 8 years ago

Hello,

I'm having an issue trying to mix group and count in a data aggregation query.

I have a table named Token with the following structure:

{ 
"Expires": Wed Aug 17 2016 17:56:21 GMT+00:00 , 
"Issuer": "d04aad77448a" , 
"id": "0276701a-3834-4375-b0fa-5be0a691db39" 
} 

{ 
"Expires": Wed Aug 17 2016 17:56:17 GMT+00:00 , 
"Issuer": "f6adf92273de" , 
"id": "033325bf-c942-4fdf-af4c-0bb79f2fae3a" 
} 

{ 
"Expires": Wed Aug 17 2016 17:56:21 GMT+00:00 , 
"Issuer": "cf3d85de52b4" , 
"id": "038c0583-f92e-45c1-a885-1650daaf11e1" 
}

In ReQL I can group by Issuer and count tokens:

r.db('TokenStore').table('Token').group('Issuer').count()

or using Issuer secondary index

r.db('TokenStore').table('Token').group({index: 'Issuer'}).count()

Result:

{ 
"group": "cf3d85de52b4" , 
"reduction": 179 
} , 

{ 
"group": "d04aad77448a" , 
"reduction": 160 
} , 

{ 
"group": "f6adf92273de" , 
"reduction": 161 
} 

Running the same query from C#:

var list = R.Db(_dbName).Table(nameof(Token)).Group(nameof(Token.Issuer)).Count().RunGrouping<string, int>(conn);

Error result:

An exception of type 'System.MissingMethodException' occurred in System.Private.CoreLib.ni.dll but was not handled in user code

Additional information: Constructor on type 'RethinkDb.Driver.Model.GroupedResult`2[[System.String, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Int32, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]' not found.

Running the query with index from C#:

var list = R.Db(_dbName).Table(nameof(Token)).Group(new { index = nameof(Token.Issuer) }).Count().RunGrouping<string, int>(conn);

Error result:

An exception of type 'RethinkDb.Driver.ReqlQueryLogicError' occurred in System.Private.CoreLib.ni.dll but was not handled in user code

Additional information: Expected type STRING but found OBJECT.

OS: Windows 10 Pro RethinkDB version: 2.3.4 Driver version: 2.3.10 .NET version: Core 1.0.0

bchavez commented 8 years ago

Hey @stefanprodan , thanks for reporting this. I'm asking the RethinkDB team for some clarification on reduction: value vs reduction: [list of v] response types. I should have a fix for your issue soon.

:collision: :dizzy: _"Crashing, hit a wall. Right now I need a miracle..."_

stefanprodan commented 8 years ago

I've tried RunGrouping<string, IEnumerable<int>> but the error is the same.

Thanks for looking into this.

bchavez commented 8 years ago

Yeah, nah, I don't think that will work. We're sort of expecting grouped results to be in a specific format:

{ group: ..., reduction: [array]}

So that this constructor can be called.

public GroupedResult(JToken key, JArray items)

But the R.query.Group("Issuer").Count() causes grouped responses to come back in:

{ group: ..., reduction: value}

Which turns out to be a JValue instead of a JArray (hence, I think, the System.MissingMethodException exception). So technically the correct response type is:

I did some digging around the Java driver and when Java is confronted with the same dilemma with R.query.Group("Issuer").Count() and gets reduction: value, it forces value to become List<int> of with one value item.

My question to the RethinkDB team


Hey deontologician:, danielmewes: I'm investigating https://github.com/bchavez/RethinkDb.Driver/issues/86. Basically, I've been expecting reduction fields to be grouped in the following reduction: [List of V] format:

// r.query.group("Issuer")
{ 
   "group": "cf3d85de52b4" , 
   "reduction": [{some object}, {some object}] // List of V
} , 
{ 
   "group": "d04aad77448a" , 
   "reduction": [{some object}, {some object} // List of V
} , ...

But it seems that if we tack on a .count() term to r.query.group("Field").count(), the reduction field gets transformed into an integer value with no array kind.

// r.query.group("Issuer")
{ 
   "group": "cf3d85de52b4" , 
   "reduction": 1 //Not a List of V
} , 
{ 
   "group": "d04aad77448a" , 
   "reduction": 2 //Not a List of V
} , ...

And looking at the Java model for GroupedResult.java in the Java driver we have:

//Correct model r.query.group("Field")
public class GroupedResult<G,V> {
    public final G group;
    public final List<V> values;
}

The above model seems like it's the same for all grouping kinds. List of V and V. So, when the Java driver is faced with the same dilemma of reduction: int (not a list), it "plops" V int into List<V> and GroupedResult.values has one item of type int and calls it a day. However, i think the correct representation of the response model for r.query.group("Field").count(), would be:

// Correct model for r.query.group("field").count()
// for grouped results with no List of V, but only V.
public class GroupedResult<G,V> {
    public final G group;
    public final V value;
}

I don't know which is the expected model we should be striving for. The model based on what kind of grouping response we get reduction: [] vs reduction: value. Or cast all reduction field value kinds to a native model of List of V even if the field happens to be of only V (it will be List of V with one item). :thinking_face: Should we just follow Java's lead here? Plop V into a List of V and call it a day? Does it seem safe? Hm...... kinda thinking we should just do what Java does.


:boom: :fire: _"Set it ablaze like a candle wick... Light it up, light it up..."_

bchavez commented 8 years ago

Hey @stefanprodan

Thanks for your patience. We got a new driver release with this fix. v2.3.12.

I spoke with Josh and I think we're just going to follow the Java driver for now in how we deal with this particular issue. It's the least breaking change and in case there are future unit tests, we'll be compatible with Java's grouping List<V>.

Your reduction value should be the only item in GroupedResult.Items : List<int>. Anytime we encounter this particular issue, where the reduction is a single value we're just going to lob the single value into List<T>.

Please feel free to re-open the issue if you encounter any other problems.

Thanks again, Brian

:car: :bluecar: [**"Let the good times roll..."**_](https://www.youtube.com/watch?v=7BDBzgHXf64)

stefanprodan commented 8 years ago

I've tested it and now it's working as expected.

The second issue still persists, regarding the group by index:

var list = R.Db(_dbName).Table(nameof(Token)).Group(new { index = nameof(Token.Issuer) }).Count().RunGrouping<string, long>(conn);

Error:

An exception of type 'RethinkDb.Driver.ReqlQueryLogicError' occurred in System.Private.CoreLib.ni.dll but was not handled in user code

Additional information: Expected type STRING but found OBJECT.
bchavez commented 8 years ago

Hmm. Interesting. Okay, I don't think I have a unit test for that group index query. Let me investigate.

:dash: :walking: _"Bubbles of gas in my brain... Send me off balance, it's not enough"_

bchavez commented 8 years ago

@stefanprodan , could you show me what your IndexCreate code looks like? Also, if you have an equivalent query in JS that works in Data Explorer, that would be helpful too.

stefanprodan commented 8 years ago

I've posted both queries in the issue... Sorry I've should've made a separate issue but I though it was the same cause.

Here it is:

r.db('TokenStore').table('Token').group({index: 'Issuer'}).count()

Result

{ 
"group": "cf3d85de52b4" , 
"reduction": 179 
} , 

{ 
"group": "d04aad77448a" , 
"reduction": 160 
} , 

{ 
"group": "f6adf92273de" , 
"reduction": 161 
}

I've created the index from code like this:

CreateIndex(_dbName, nameof(Token), nameof(Token.Issuer));
        protected void CreateIndex(string dbName, string tableName, string indexName)
        {
            var conn = _connectionFactory.CreateConnection();
            var exists =  R.Db(dbName).Table(tableName).IndexList().Contains(t => t == indexName).Run(conn);
            if (!exists)
            {
                R.Db(dbName).Table(tableName).IndexCreate(indexName).Run(conn);
                R.Db(dbName).Table(tableName).IndexWait(indexName).Run(conn);
            }
        }
bchavez commented 8 years ago

@stefanprodan okay, so some more investigation yeilds:

Your original query:

var list = R.Db(_dbName).Table(nameof(Token))
            .Group(new { index = nameof(Token.Issuer) })
            .Count()
            .RunGrouping<string, int>(conn);

The server sends us back a trace:

{"t":18,"e":3000000,"r":["Expected type STRING but found OBJECT."],"b":[0,1]}

Looking at the Java docs for group. Specifically, the example:

You can also group on an index (primary key or secondary).

Example: What is the maximum number of points scored by game type?

r.table("games").group().optArg("index", "type")
 .max("points").g("points").run(conn);

Additionally, the method signatures for group:

// *Note: no object, just string or function*
sequence.group([field | function...,]) → grouped_stream

An index for group is specified by optional argument, so, your C# query should be, like Java:

var list = R.Db(_dbName).Table(nameof(Token))
            .Group().OptArg("index", "Issuer")
            .Count()
            .RunGrouping<string, int>(conn);

or, alternatively, with more style :sunglasses:

var list = R.Db(_dbName).Table(nameof(Token))
            .Group()[new { index = "Issuer" }]
            .Count()
            .RunGrouping<string, int>(conn);

So, give it a go, and let me know. In summary, OptArgs never go in as arguments into AST terms. Always, by Term.OptArg(...) or .Term[ optArgAnonType ].

If that solves your issue, hit the close button with a vengeance. :+1:

:hourglass_flowingsand: :mag: [**"But I still haven't found what I'm for..."**_](https://www.youtube.com/watch?v=wdCJRybAtso)

bchavez commented 8 years ago

Also, @stefanprodan , have you signed the RethinkDB CLA? I'd like to add some of your code in this issue as a unit test.

stefanprodan commented 8 years ago

I write my queries in Data explorer then in code, so I presumed I can send the index as group param. I've should've look in the Java docs. Thanks again for helping me.

I've signed the CLA :ok_hand:

bchavez commented 8 years ago

Hey @stefanprodan . No problem, if you need any help I'm usually available on Slack same sn @bchavez. And thanks again for the bug report and using our driver. :+1:

:chocolate_bar: :cookie: :lollipop: _Ronald Jenkees - Stay Crunchy_