google-code-export / openhab

Automatically exported from code.google.com/p/openhab
GNU General Public License v3.0
0 stars 0 forks source link

Make strings persistable with sql (again) #441

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
With the actual version of (My)sql.persistence it's not possible to persist 
strings or logical values like ON/OFF/OPEN/CLOSED and so on. Both capabilities 
would be very useful, for example for logging incoming calls (with name and 
number) or logging doorbell, window and gate stats and so on.

Another Request... At the moment, timestamp is long (ticks since 1970?), so it 
has to be recalculated if values are read from (My)sql-table. (My)sql has 
datetime-var capability, if long is necessary, there could even be a third item 
for this purpose in each table, so that the table itself is almost 
human-readable.

Original issue reported on code.google.com by udo1t...@gmail.com on 12 Sep 2013 at 2:51

GoogleCodeExporter commented 9 years ago
I'll take a look at this shortly.

Some explanation - the dates as milliseconds since 1970 is just the default 
format used in both Java, and Javascript. This means that it's quicker to 
convert within the software which can make a difference for large queries. This 
is especially important for a javascript client where this conversion can make 
a huge difference (ie graphing).
It may be possible to use the DATETIME in SQL, but have the JSBC connector 
return a long anyway - I'll take a look.

For the strings, I'll look to add a string version of the value. Personally, I 
think numeric values are better, and then use mapping to translate this in a 
client (for display/graphing). Using a numeric value allows statistics to be 
gathered easier (IMO).

Any comments appreciated - I'll try and look at this in the next few days 
(hopefully this weekend, but if not it may be next weekend).

Original comment by ch...@cd-jackson.com on 14 Sep 2013 at 1:45

GoogleCodeExporter commented 9 years ago
the problem of translation is, that, for instance, telephone numbers and the 
corresponding names are dynamic, and in most cases you do not have a static 
list to translate to. ;-) For static texts as open, closed, on, off and so on i 
agree with you.
About the timestamp, maybe a automatically generated third field with 
sql-timestamp could do the job. 
I manually altered the tables: "ALTER TABLE `Itemxx` ADD `timestamp` TIMESTAMP 
NOT NULL DEFAULT CURRENT_TIMESTAMP" so that each table automatically gets a 
timestamp from the mysql-Server. Enough for my needs...

Original comment by udo1t...@gmail.com on 14 Sep 2013 at 2:19

GoogleCodeExporter commented 9 years ago
Sorry, haven't seen this issue.
Opened a new Issue for boolean values with a simple Solution-Patch: 
http://code.google.com/p/openhab/issues/detail?id=451

Strings would be very useful, too!
- Formated Telephone Numbers/Strings...
- speaking Texts or GUI text output

But for Graphing always needed numeric values.

Original comment by Helmut.L...@gmail.com on 14 Sep 2013 at 3:46

GoogleCodeExporter commented 9 years ago
Just thinking aloud. I guess there are 3 options -:
1) Add both string and double columns into the table
2) Just use a CHAR (or VARCHAR) column and store all data as strings
3) Adjust the table schema depending on the item type

1 is wasteful, but _may_ help numeric query speed.
2 is possibly easiest and most flexible
3 is possibly the best, but seemingly more complex

I think 3 is the best approach. I think (with minimal research) that the only 
place we need to check the type is when the table is created. When writing 
data, we cast to a string and let JDBC sort out the casting to the correct data 
type in MySQL. When reading back data we should already convert to the correct 
item type, so this will again sort this out (again partly calling on the JDBC 
type functions).

I would suggest the following item mapping -:

Color             VARCHAR(50)
Contact           VARCHAR(50)
DateTime          DATETIME
Dimmer            DOUBLE
Group             VARCHAR(50)
Number            DOUBLE
Rollershutter     DOUBLE
String            VARCHAR(50)
Switch            DOUBLE

Does that sound ok - or is it too complex? I think due to the JDBC type 
mapping, the complexity is largely hidden, so it should be fine.

Original comment by ch...@cd-jackson.com on 15 Sep 2013 at 2:29

GoogleCodeExporter commented 9 years ago
Following on from this, I've done some more testing and the concept seems to 
work.

I propose the following to make it configurable. In the openhab.cfg file, we 
have the following config settings -:
sql.TYPE=mysql-type
where-:
TYPE is the item type (eg Color, Contact, Number, String...)
mysql-type is the type string that is used for the table of that item type. I 
would default it to the above (unless there's a better suggestion). This allows 
(hopefully) a reasonable set of defaults, but you can then override this easily 
in the config file.

In addition, the "Time" column will be changed to the mysql DATETIME type.

Original comment by ch...@cd-jackson.com on 15 Sep 2013 at 4:39

GoogleCodeExporter commented 9 years ago
Just to clarify, the config file setting is -:
   sql:sqltype.TYPE=MYSQLTYPE

e.g.
   sql:sqltype.string=char(20)
   sql:sqltype.number=int

Original comment by ch...@cd-jackson.com on 15 Sep 2013 at 5:11

GoogleCodeExporter commented 9 years ago
The changeset is here -:

https://code.google.com/r/chris-openhab/source/detail?r=e53130f3bdb3fbe3a1e7a6cc
29937b7aad10e1ab

Original comment by ch...@cd-jackson.com on 15 Sep 2013 at 10:16

GoogleCodeExporter commented 9 years ago
Thanks!  

Have tested it:
1. Why "Contact" as VARCHAR(50), isn't it a boolean?
2. Why booleans (Switch, Contact, Rollershutter) as DOUBLE and not as BIT?
3. Same as before, booleans (Switch, Contact, Rollershutter) can't be stored 
whiteout manual Type conversation: "Could not store item 'DI06_bool' in 
database with statement 'INSERT INTO Item23 (TIME, VALUE) VALUES(NOW(),'ON');': 
Data truncated for column 'Value' at row 1"
Mysql only can do automatic conversation with type java.lang.Boolean - 
http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-type-conversions.ht
ml

It's not optimal, but a Trick is needed to convert boolean String-States to BIT 
for storing and same back from BIT to boolean String-States.
Kai did the Trick in org.openhab.persistence.rrd4j.internal.RRD4jService with 
the Funktions getConsolidationFunction and mapToState.

The very simplified way for store Booleans:

        /**
         * Converts ItemState to a String
         * @param item
         * @return String
         */
        private String getStateAsString(Item item){
            if(     item instanceof SwitchItem || 
                    item instanceof RollershutterItem || 
                    item instanceof ContactItem ) {
                return item.getStateAs(DecimalType.class).toString();
            }
            return item.getState().toString();
        }

Original comment by Helmut.L...@gmail.com on 16 Sep 2013 at 5:22

GoogleCodeExporter commented 9 years ago
>1. Why "Contact" as VARCHAR(50), isn't it a boolean?
I don't currently have any "contact" types, so thought that these could be 
stored as "OPEN/CLOSED" etc. I'm happy to type convert this to BOOLEAN. 
Alternatively, they could stay as a char, and the graphing/analyses needs to be 
adjusted (?).

>2. Why booleans (Switch, Contact, Rollershutter) as DOUBLE and not as BIT?
I did look at other data types, and decided to keep them standard (ie the same 
as Number), but this isn't necessary. For Rollershutter (which I don't have), I 
assumed this was like dimmer and could be a number from 0 to 100%. This could 
still be an int, but I wasn't sure if someone would want decimal places.

I'm happy to put the type conversion in for booleans. Do I understand right 
that this same code is used in the RRD service? I do question the rollershutter 
as a boolean though - are you sure this can't be a value like a dimmer?

The alternative for boolean is to store as a string. As a matter of interest, 
did you try changing any data types in the config file?

Original comment by ch...@cd-jackson.com on 16 Sep 2013 at 7:47

GoogleCodeExporter commented 9 years ago
On the other hand... In the DB40 binding, all data is stored as strings - there 
is no conversion performed. So, I guess "Contact" will be stored as OPEN/CLOSED 
etc.

In the RRD service, while you are correct that the contact gets converted to a 
boolean, what you missed is that when the query is performed, the conversion 
back to the state (ON/OFF etc) is performed in reverse. I think this is done as 
RRD can only store numbers. With mySQL, we don't have that limitation.

In my opinion the strongest argument either way is that there should be 
consistency across persistence services at interface level - not at storage 
level. This way, a client can be written that will work with different 
persistence services without having to know the details of each service. At 
least in the above 2 (RRD and DB4O) this is consistent at the software 
interface. The fact that the 2 services store data differently is not relevant 
since this gets hidden by the persistence service.

So, I think that Contacts etc should be stored in their native format as 
strings. If we convert to a number, we'd then need to convert back to the state 
in the query - both needless operations that take time and will slow down 
queries where this may be done 1000s of times per query.

I'm not 100% sure which types need to be CHAR and which can be number though, 
but I think the above should be the concept. One option is to store all types 
that aren't obviously numbers (eg number/dimmer), as a CHAR. This would be 100% 
sure to maintain state, but isn't optimum from a storage point of view.

Thoughts? :)

Original comment by ch...@cd-jackson.com on 16 Sep 2013 at 10:56

GoogleCodeExporter commented 9 years ago
Updated changeset to fix a bug in the query function -:

https://code.google.com/r/chris-openhab/source/detail?r=141f1fe38d848c5adfe182bf
69dbfe49296c2ba5

Original comment by ch...@cd-jackson.com on 16 Sep 2013 at 9:17

GoogleCodeExporter commented 9 years ago
"should be consistency across persistence services at interface level" 
I agree 100%!!!!

Original comment by Helmut.L...@gmail.com on 17 Sep 2013 at 2:54

GoogleCodeExporter commented 9 years ago
I've tested your changeset. It is working fine. My booleans get stored in the 
database again. I also was able copy my old values to the new table structure.

Yet I have a (SQL) question. For a given item name, I'd like to access the 
matching table date. Thus I have to query the id and then use this id together 
with the string "Item" as a table name in my query. Can I do this in one query, 
or do I have to make two queries, one for the id, and another one with the 
combined string?

Original comment by airmax....@gmail.com on 20 Sep 2013 at 7:26

GoogleCodeExporter commented 9 years ago
I've tested with 1.3.0-498 and it works as expected for Contacts, but not for 
Switches, if the Switch Value is ON or OFF. Aparently the Values are not 
translated properly.

Original comment by udo1t...@gmail.com on 28 Sep 2013 at 9:39

GoogleCodeExporter commented 9 years ago
I suspect that "Switch" type probably ought to be saved as a CHAR.

I'm not sure at the moment if the only allowable states are ON/OFF - if so, 
CHAR(3) would do?

I'll try and test this later today.

Original comment by ch...@cd-jackson.com on 29 Sep 2013 at 9:28

GoogleCodeExporter commented 9 years ago
If I understand the source correctly, a switch can be one of two types - OnOff, 
or UnDef. I think therefore the possible values can be 
ON/OFF/Undefined/Uninitialised (assuming I understand what's happening under 
the hood correctly!).

So, I have defaulted Switch to CHAR(15) and tested it ok. I'll push this on 
GitHub.

Original comment by ch...@cd-jackson.com on 29 Sep 2013 at 2:04

GoogleCodeExporter commented 9 years ago
I've spent some more time looking more at this - this time from the query side, 
to make sure that data is stored in the best format to allow it to be extracted.

Below is a table showing the different conversions that are applied. Since most 
"Items" can return different "Types" of data, there is a conversion in some 
cases to ensure the correct (best?) format is persisted. The following are the 
relationships -:

 * Item-Type          Data-Type       MySQL-Type
 * =========          =========       ==========
 * ColorItem          HSBType         CHAR(25)
 * ContactItem        OnOffType       CHAR(3)
 * DateTimeItem       DateTimeType    DATETIME
 * DimmerItem         DecimalType     TINYINT
 * NumberItem         DecimalType     DOUBLE
 * RollershutterItem  DecimalType     TINYINT
 * StringItem         StringType      VARCHAR(50)
 * SwitchItem         OnOffType       CHAR(3)

In the store method, type conversion is performed where the default type for an 
item is not as above. For example, DimmerType can return OnOffType, so to keep 
the best resolution, it is stored as a number in SQL and convert to DecimalType 
before persisting to MySQL. 

The above SQL types should be consistent with the data types. Eg - Percent type 
is internally stored as an integer (limited between 0 and 100). ColorItem is a 
complex string made up of "decimal,percent,percent", so 25 characters should be 
fine. 

OnOff can only be ON or OFF - contrary to my earlier statement that they can be 
undefined or uninitialised, I don't think these states can ever be persisted.

If anyone has any queries, please let me know otherwise I will implement the 
above and test as much of it as I can.

Original comment by ch...@cd-jackson.com on 29 Sep 2013 at 4:20

GoogleCodeExporter commented 9 years ago
Attached snapshot build for testing.

Original comment by ch...@cd-jackson.com on 29 Sep 2013 at 6:48

Attachments:

GoogleCodeExporter commented 9 years ago

Do I understand right?
RollershutterItem  -> UpDownType     -> boolean: UP, DOWN;     in mySQL mapped 
to = 100, 0;
ContactItem        -> OpenClosedType -> boolean: OPEN, CLOSED; in mySQL mapped 
to = 1, 0;

In org.openhab.core.library.types.StringType String length is not limited, You 
does restrict Length to 50.
StringItem         StringType      VARCHAR(50)
Isn't it better to use TEXT without restricted Length here?!

I personally use StringItem sometimes to cache Data from Http-Return, so I 
would prefer TEXT.

Original comment by Helmut.L...@gmail.com on 30 Sep 2013 at 12:08

GoogleCodeExporter commented 9 years ago
Currently the up/down, and open/closed are stored as CHAR. They might be able 
to be stored as binary - I'd need to check the mapping in openHAB 
(items/types). I will do this tonight.

Regarding Strings - I would still keep the type as VARCHAR since it is 
apparently a lot faster to use than TEXT. Both types allow extended lengths - 
VARCHAR is up to 65535 characters. I think in order to get the full 65k 
characters, VARCHAR(256) is to be used, so we can change to this?

For reference on TEXT v VARCHAR, you can see the following link -:
http://www.pythian.com/blog/text-vs-varchar/

Original comment by ch...@cd-jackson.com on 30 Sep 2013 at 1:29

GoogleCodeExporter commented 9 years ago
Yes, VARCHAR uses only the length which is currently needed plus 1 byte for the 
length.
So, if StringType has no length restriction, you can use VARCHAR(65535) without 
any disadvantage.

In real live it often not needed to have more than 256 chars. But what is 256 
Chars when I will store the content of HTTP-Request.

Eventually, sql:maxlength can be optional configurable... ;-)

Original comment by Helmut.L...@gmail.com on 30 Sep 2013 at 4:17

GoogleCodeExporter commented 9 years ago
Agreed (nearly). I'll use VARCHAR 65500 - you can't have 65535 since the 
maximum size of the data is 65535 bytes, and 2 bytes are used for the length, 
and potentially other bytes for other things (eg null fields). So, I think it's 
safer to limit it slightly rather than hit an error.

Regarding the OPEN/CLOSED, ON/OFF, these could be converted to boolean, or 
another number. My preference is to stick with CHAR(x) for simplicity, but if 
others have a strong preference we can change to boolean or integer. Ultimately 
when the data is queried in openHAB, it needs to get converted back to 
OPEN/CLOSED, ON/OFF.

Also, note that the mysql binding will change (slightly). There's been 
discussion elsewhere relating to PostgresSql, and to make things easier I will 
remove the sql binding, and we will just have the mysql binding. This will 
likely mean some changes in the openhab.cfg, but not much (probably just the 
name).

For more info on this, see https://github.com/openhab/openhab/pull/3

Original comment by ch...@cd-jackson.com on 1 Oct 2013 at 4:59

GoogleCodeExporter commented 9 years ago
VARCHAR 65500 is a lot of wood, even in UTF-8. For me now, all Parameters are 
perfect!
Thanks.

Original comment by Helmut.L...@gmail.com on 1 Oct 2013 at 5:34

GoogleCodeExporter commented 9 years ago

Original comment by teichsta on 5 Nov 2013 at 10:53

GoogleCodeExporter commented 9 years ago
This issue has been migrated to Github. If this issue id is greater than103 its 
id has been preserved on Github. You can open your issue by calling the URL 
https://github.com/openhab/openhab/issues/<issueid>. Issues with ids less or 
equal 103 new ids were created.

Original comment by teichsta on 17 Nov 2013 at 8:08

GoogleCodeExporter commented 9 years ago
see above!

Issue has been migrated to Github and should be discussed there.

Original comment by teichsta on 21 Nov 2013 at 1:51