ISG-ICS / cloudberry

Big Data Visualization
http://cloudberry.ics.uci.edu

Provide a point-view for the geo-location tagged tweets #157

Closed JavierJia closed 6 years ago

JavierJia commented 8 years ago

Plot each tweet as a point at its geolocation.

It could be implemented by using the geocell function of the new data model.

JavierJia commented 8 years ago

The aggregation result is easy to show. The harder problem is showing information for individual data points. The details query could be implemented as a group-by on the geocell function that returns the "topK" most recent tweets within each group.
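To make the idea concrete, here is a minimal in-memory sketch of "group by geocell, then keep the topK most recent tweets per group". It is not the actual query: the `geocell` helper below is a hypothetical stand-in for the data model's geocell function, and the `lon`, `lat`, and `create_at` field names are assumptions.

```python
import heapq
import math
from collections import defaultdict


def geocell(lon, lat, cell_deg):
    """Hypothetical stand-in: map a point to the index of its square grid cell."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def top_k_recent_per_cell(tweets, cell_deg, k):
    """Group tweets by cell and keep the k most recent tweets in each group."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[geocell(tweet["lon"], tweet["lat"], cell_deg)].append(tweet)
    return {
        cell: heapq.nlargest(k, members, key=lambda t: t["create_at"])
        for cell, members in groups.items()
    }


# Made-up records for illustration only.
tweets = [
    {"id": 1, "lon": -95.40, "lat": 29.70, "create_at": 100},
    {"id": 2, "lon": -95.41, "lat": 29.71, "create_at": 300},
    {"id": 3, "lon": -95.42, "lat": 29.72, "create_at": 200},
    {"id": 4, "lon": -74.10, "lat": 40.65, "create_at": 50},
]
result = top_k_recent_per_cell(tweets, cell_deg=1.0, k=2)
```

With a 1-degree cell, the three Houston-area records fall into one group and only the two newest are kept.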

JavierJia commented 8 years ago

The "topK" may not serve the purpose directly. What we need is a "top 10%" result rather than a "top 100" result, so that the number of points returned scales with the size of each group. Does *db support this query?
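The difference from a fixed topK can be sketched as follows: the cutoff is computed per group from the group's size. This is only an in-memory illustration, not a database query; the `cell` and `create_at` field names and the rule of keeping at least one tweet per group are assumptions.

```python
from collections import defaultdict


def top_percent_recent_per_cell(tweets, percent):
    """Keep the most recent `percent` of tweets in each cell (at least one)."""
    groups = defaultdict(list)
    for tweet in tweets:
        groups[tweet["cell"]].append(tweet)
    result = {}
    for cell, members in groups.items():
        # Integer ceiling of len * percent / 100, so k grows with the group size.
        k = max(1, (len(members) * percent + 99) // 100)
        newest_first = sorted(members, key=lambda t: t["create_at"], reverse=True)
        result[cell] = newest_first[:k]
    return result


# Made-up records: a dense cell "a" with 20 tweets and a sparse cell "b" with 3.
tweets = [{"id": i, "cell": "a", "create_at": i} for i in range(20)]
tweets += [{"id": 100 + i, "cell": "b", "create_at": i} for i in range(3)]
sample = top_percent_recent_per_cell(tweets, percent=10)
```

Here the dense cell keeps two tweets and the sparse cell keeps one, whereas a fixed "top 100" would return every tweet in both cells.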

JavierJia commented 8 years ago

@simonmssu

JavierJia commented 8 years ago

Got an idea from a discussion at SIGSPATIAL: issue an aggregation query over very small cells; zooming in can then be handled by another query with even smaller cells, and so on.
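A rough sketch of that idea, assuming a square grid whose cell edge halves with every zoom level (the halving policy, the `base_deg` parameter, and the function names are all hypothetical):

```python
import math
from collections import Counter


def cell_size_deg(zoom, base_deg=1.0):
    """Assumed policy: the cell edge halves with each zoom level."""
    return base_deg / (2 ** zoom)


def count_per_cell(points, zoom):
    """Aggregate (lon, lat) points into per-cell counts at the given zoom level."""
    size = cell_size_deg(zoom)
    return Counter(
        (math.floor(lon / size), math.floor(lat / size)) for lon, lat in points
    )


# Made-up points for illustration only.
points = [(-95.40, 29.70), (-95.30, 29.75), (-95.90, 29.10)]
coarse = count_per_cell(points, zoom=0)  # 1-degree cells
fine = count_per_cell(points, zoom=1)    # 0.5-degree cells
```

At zoom 0 all three points share one cell; at zoom 1 they split into two cells, which is what a zoom-in query with smaller cells would return.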

JavierJia commented 8 years ago

@zonghengma we can do it step by step. The first step is just to show all points within a small region, e.g., when we zoom to a layer finer than city, we show all points in that region.

One problem is that the sample data may not have many point data.

JavierJia commented 7 years ago

@HotLemonJuice, could you move the content of issue #274, which you just created, to here? Thanks!

ShengjieXu commented 7 years ago

So far we are using a heat map to show the Tweets on the map. It is a high-level description of the data.

Now if we can show the exact location of each Tweet, we can make the connection between the data and the geolocation closer.

A good example by MapD: https://www.mapd.com/demos/tweetmap/

However, since there is too much data to load in the browser, we need to use some tricks to avoid crashing it. We can start by implementing this feature for small areas of the map, e.g. at the city level.

ShengjieXu commented 7 years ago

There isn't much point data in the sample data. How do I add fake data to the DB?

JavierJia commented 7 years ago

We can generate sample data. For now, you may use the "bounding_box" field, e.g. "bounding_box": rectangle("-74.146932,40.643773 -74.0658,40.697794"). You can just pick the first value and add some random value to it.
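A minimal sketch of that approach, assuming the rectangle string lists the lower-left corner first as "lon,lat" followed by the upper-right corner. The helper names are hypothetical, and limiting the random offset to the box's width and height (so the fake point stays inside the box) is my own choice:

```python
import random


def parse_rectangle(text):
    """Parse 'min_lon,min_lat max_lon,max_lat' into four floats."""
    lower_left, upper_right = text.split()
    min_lon, min_lat = (float(v) for v in lower_left.split(","))
    max_lon, max_lat = (float(v) for v in upper_right.split(","))
    return min_lon, min_lat, max_lon, max_lat


def fake_point(rect_text, rng):
    """Start from the first corner and add a random offset within the box."""
    min_lon, min_lat, max_lon, max_lat = parse_rectangle(rect_text)
    lon = min_lon + rng.random() * (max_lon - min_lon)
    lat = min_lat + rng.random() * (max_lat - min_lat)
    return lon, lat


rng = random.Random(42)  # fixed seed so the fake data is reproducible
box = "-74.146932,40.643773 -74.0658,40.697794"
points = [fake_point(box, rng) for _ in range(5)]
```

Each generated pair could then be written into the point field of a copied sample record.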

JavierJia commented 7 years ago

You can take a look at the sample.data under the script folder. Here is an example of the place field:

"place": { "country": "United States", "country_code": "United States", "full_name": "Houston, TX", 
"id": "1c69a67ad480e1b1", "name": "Houston", "place_type": "city", "bounding_box": 
rectangle("-95.823268,29.522325 -95.069705,30.154665") }

ShengjieXu commented 7 years ago

Notes for future dev:

For now, I'm switching over to Normalization, as it is more urgent.