ISG-ICS / cloudberry

Big Data Visualization
http://cloudberry.ics.uci.edu
90 stars 82 forks source link

Cloudberry does not change datatype "bag" to "string" after unnest it #563

Closed lengyyy closed 6 years ago

lengyyy commented 6 years ago

Background

When implement the hashtag tendency chart moudle #558, I want to write a query which only giving result of a specific hashtag, but it failed.

I add a filter contains hashtagName to the query: filter.push( { field: "hashtags", relation: "contains", values: [hashtagName] }. The datatype of hashtags is bag, and according to the document of cloudberry, it should support the filter contains. image

But the server give an error: image

And check the source code AsterixQueryGenerator.scala: 177, we find that the "contains" filter of datatype "bag" is not implemented in Cloudberry now.

Solution

For task #558, temporary solution is using javascript to do the filter job, but this solution is not possible in big dataset. Therefore, we need to solve this problem, and then continue to work on issue #558.

For this issue, the solution is to modify cloudberry, so that it can support filter of string datatype after unnest is applied to bag datatype. There are two tasks:

Experiment

  1. The json send to cloudberry is image

  2. The query sent to AsterixDB by cloudberry is select `month` as `month`,coll_count(g) as `count` from twitter.ds_tweet_b5c0b187fe309af0f4d35982fd961d7e t unnest t.`hashtags` `unnest0` where not(is_null(t.`hashtags`)) and t.`create_at` >= datetime('2017-01-24T08:00:00.000Z') and t.`create_at` < datetime('2018-01-04T08:00:00.000Z') and t.`geo_tag`.`stateID` in [ 37,51,24,11,10,34,42,9,44,48,35,4,40,6,20,32,8,49,12,22,28,1,13,45,5,47,21,29,54,17,18,39,19,55,26,27,31,56,41,46,16,30,53,38,25,36,50,33,23,2 ] and `unnest0`="love" group by get_interval_start_datetime(interval_bin(t.`create_at`, datetime('1990-01-01T00:00:00.000Z'), year_month_duration("P1M") )) as `month` group as g;

  3. The query result is image

To Do

baiqiushi commented 6 years ago

586