LSIR / gsn

Global Sensor Networks
GNU General Public License v3.0
58 stars 44 forks source link

NOSQL support for GSN #66

Closed varks closed 8 years ago

varks commented 8 years ago

Hi,

Postgres 9.4 has recently added support for jsonb datatype which stores entire json data as a blob and supports querying (using enhanced SQL)on nested data fields in the JSON object. I was thinking on adding such a SQL query to virtual sensor to filter out required fields. If we have to achieve this, I believe we have to modify major part of storage layer of GSN. Do you think this is feasible. Any thoughts on this will be helpful.

Thanks.

ebiiii commented 8 years ago

Hi varks,

If I understood correctly, you want to store the content of a stream element (the various fields attached with a timestamp) as a JSON object in a jsonb type column. Keeping the timestamp as another column in the row (or being the key for NoSQL) ? The biggest problem I can see so far is the fact that GSN allows to specify queries for the sliding windows directly in the configuration files and is based on SQL. So one very first step (that I had in mind for sometime now) would be to translate the query in some abstract representation and only translate it to actual SQL at the very last step before running a query on the DB. This would allows to generate other kind of queries and support NoSQL DB. The second thing to take care of would be to write a StorageManager implementing https://github.com/LSIR/gsn/blob/master/src/main/java/gsn/storage/StorageManager.java . But here the issue would be that there is no schema in the jsonb column telling GSN about the field types. So you would need some tricks to force GSN to accept it.

Another approach would be to add a new DataType to GSN that would act a bit like the binary type, for JSON. This would be the easiest way to do it. You could even include your enhanced SQL queries directly in the virtual-sensor config files and it should work out-of-the-box (to be tested).

The issue with this second approach is that from the point of view of GSN, there is a single field (of JSON data) that cannot be used for an averaging sliding window for example and the web interface will be displaying only the JSON text. But you can of course interpret the JSON content in you own virtual sensor processing class to perform your computations.

Let me know if this wasn't what you meant or if you have any further question.

varks commented 8 years ago

Hi ebii,

Thanks for the quick reply. I was thinking on the lines similar to second approach where I could add a new Datatype and insert the entire JSON object into the jsonb field. The enchanced SQL queries for postgres allows one to use special path operators in the SQL query to query certain fields inside the jsonb object. I was wondering whether this query would work out of the box if specified in the virtual -sensor config files. Are the SQL queries that are specified in the virtual-sensor config files executed verbatim ? I believe they should work and will test them out. Do you see any other red flags in this approach.

Thanks.

ebiiii commented 8 years ago

I think it is worth trying. The queries are just slightly rewritten (changing the table name here: https://github.com/LSIR/gsn/blob/master/src/main/java/gsn/storage/SQLUtils.java#L110) but it should be fine with the JSON part of the query. So you will have to define a new data type here: https://github.com/LSIR/gsn/blob/master/src/main/java/gsn/beans/DataTypes.java Then allow the PostgreSQL manager to use this type here: https://github.com/LSIR/gsn/blob/master/src/main/java/gsn/storage/db/PostgresStorageManager.java And have a wrapper that generate JSON data.

So far I don't see any major problem. I hope it will work ;-)

varks commented 8 years ago

Thanks much ebiii :) I will let you know if I hit a roadblock.

shebajijugeorge commented 8 years ago

i want to implement an algorithm using the sql database from gsn according to some user input but i dont know how to connect the sql database with the gsn???? can u suggest a method????

ebiiii commented 8 years ago

@shebajijugeorge please could you move your question to a new issue?

varks commented 8 years ago

Hi Ebii,

I have started working to add new datatype to Datatypes.java. But am not really sure how to add this new datatype. jsonb is neither a SQL type or GSN type. Any pointers or examples on this would be helpful. Thanks again.

I am trying to replicate the binary type.

` // NEXT FIELD public final static String JSONB_PATTERN_STRING = "\s_jsonb" + OPTIONAL_NUMBERPARAMETER + "(\s:.*)?";

public final static byte JSONB = 10;

public final static String JSONB_NAME = "Jsonb"; `

I am not sure what to fill in here: sample jsonb value : there is no corresponding SQL type.

public final static Object [ ] TYPE_SAMPLE_VALUES = { "A chain of chars" , 'c' , new Integer( 32 ) , new Integer( 66000 ) , new Byte( ( byte ) 12 ) , new Double( 3.141592 ) , new Date( ).getTime( ) , new Integer( 1 ) , new Integer( 9 ), new Float(2.02) };

public static byte SQLTypeToGSNTypeSimplified(int sqlType) { if (sqlType == Types.BIGINT || sqlType == Types.SMALLINT || sqlType == Types.DOUBLE || sqlType==Types.INTEGER || sqlType == Types.DECIMAL||sqlType == Types.REAL || sqlType == Types.FLOAT|| sqlType == Types.NUMERIC ) return DataTypes.DOUBLE; else if (sqlType == Types.VARCHAR || sqlType == Types.CHAR|| sqlType == Types.LONGNVARCHAR || sqlType == Types.LONGVARCHAR || sqlType== Types.NCHAR ) return DataTypes.VARCHAR; else if (sqlType == Types.BINARY || sqlType == Types.BLOB|| sqlType == Types.VARBINARY ) return DataTypes.BINARY; throw new RuntimeException("Can't convert SQL type id of: "+sqlType+ " to GSN type id."); }

I do not see the above function being called from anywhere.

Also, I cannot map it directly in convertLocalTypeToGSN(int jdbcType, int precision). It looks like the code is tightly coupled.

@ebiiii Any thoughts on this ? Varks

ebiiii commented 8 years ago

Hi Varks,

Sorry for the late reply, I was away last week. Did you succeed to do the mapping of GSN types?

varks commented 8 years ago

Hi Julien,

No problem. I got it working and hence closed the issue.

Thanks much for your help.

Thanks, Varun

On Mon, Mar 7, 2016 at 2:27 AM, Julien Eberle notifications@github.com wrote:

Hi Varks,

Sorry for the late reply, I was away last week. Did you succeed to do the mapping of GSN types?

— Reply to this email directly or view it on GitHub https://github.com/LSIR/gsn/issues/66#issuecomment-193196570.

Regards, Varun K S