Store data locally / Influx DB support

johanmeijer commented 3 years ago

It would be nice to be able to store the Growatt data locally.

Adding and maintaining a file structure for this is complex. With the InfluxDB time series database (open source, also available on Docker: https://www.influxdata.com/products/influxdb/ ) it is very easy to store unstructured (JSON) data. With InfluxDB policies the data can easily be reduced (by continuous queries) or expired (by retention policies).

It also provides easy reporting with tools like Grafana (https://grafana.com/).

Can an interface to influxdb be added to Grott?

a-dekker commented 3 years ago

Writing to InfluxDB can be achieved in a very simple way. Just add a few config lines to the Telegraf (the server agent from InfluxData) config so that it subscribes to MQTT and writes the JSON-formatted messages to InfluxDB. That's the way I handle my Grott data at least. If you want more info on this I can provide you with my config.

johanmeijer commented 3 years ago

Hi @a-dekker

Yes please!

I started creating version 2.3.0 of Grott with a direct feed into InfluxDB (via the API), but this also sounds like a good solution.

a-dekker commented 3 years ago

Here is my telegraf.conf config (everything is running on localhost)

[[inputs.mqtt_consumer]]
  servers = ["tcp://localhost:1883"]
  topics = [
    "energy/growatt",
  ]
  data_format = "json"

[[outputs.influxdb]]
  database = "solar_metrics"
  urls = ["http://localhost:8086"]
  namepass = ["*_consumer"]

johanmeijer commented 3 years ago

Looks simple, I am going to try it. Thank you!

Oops, I first wrote that in Dutch (I do not know why).

johanmeijer commented 3 years ago

Hi @a-dekker, it works! But with some challenges. Some fields are missing: device, buffered and maybe time(?). That might not be a problem for you, but they can be necessary if you have multiple inverters. This can easily be solved by adding a tag_keys statement to your config:

[[inputs.mqtt_consumer]]
  servers = ["tcp://localhost:1883"]
  topics = [
    "energy/growatt_test",
  ]
  data_format = "json"
  tag_keys = ["buffered","device","time"]

[[outputs.influxdb]]
  database = "grottdb"
  urls = ["http://localhost:8086"]
  namepass = ["*_consumer"]

Time, and its relation to buffered, is a separate beast. With the above config the Influx timestamp is the time at which the record is received by Influx. That is OK for normal use. But most Growatt inverters (dataloggers like ShineWiFi) are capable of buffering records if there is no connection with the Growatt server. When the connection is back, the datalogger sends the buffered records (with a time indication). Grott uses this time information: it puts it in the "time" key and sets the "buffered" key to yes. The above config will set the Influx timestamp to the time of receipt, also for buffered records. So you have to use the "time" key (and the "buffered" key) if you want to have the complete picture.
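To make the buffered/time relation concrete, here is a minimal Python sketch of how a consumer could pick the timestamp for a record. It is only an illustration, not Grott or Telegraf code: the function name, the device value "INV1" and the assumption that "time" looks like 2021-01-05T09:30:00 are mine.

```python
from datetime import datetime


def pick_timestamp(record: dict, received_at: datetime) -> datetime:
    """Return the time a record should be stored under."""
    # A buffered record was measured earlier than it arrived,
    # so its own "time" value is the meaningful one.
    if record.get("buffered") == "yes" and record.get("time"):
        return datetime.fromisoformat(record["time"])
    # A live record can simply use the moment it was received.
    return received_at


# Hypothetical records; only the keys "device", "buffered" and "time"
# come from the discussion above.
now = datetime(2021, 1, 5, 12, 0, 0)
live = {"device": "INV1", "buffered": "no", "time": "2021-01-05T12:00:00"}
late = {"device": "INV1", "buffered": "yes", "time": "2021-01-05T09:30:00"}

print(pick_timestamp(live, now))
print(pick_timestamp(late, now))
```

The live record keeps the receive time, while the buffered one falls back to the time reported by the datalogger.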

You can also use the value in the time field as the Influx time by adding the following statements to the input part:

json_time_key = "time"
json_time_format = "2006-01-02T15:04:05"

Now Telegraf will pass the Grott time to Influx. But... Influx treats it as UTC time and will add (or subtract) the time zone offset, and that makes this actually unusable. I think this can only work if everything is running in UTC (inverter, Grott and Influx/Telegraf). I have not been able to find a solution for this yet. I could not find a setting for it in Influx / Telegraf, so I think this needs to be solved at the Grott level. Even without solving it, your configuration (with tag_keys added to it) will work for most people.
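One way this could be handled on the sending side is to convert the local inverter time to UTC before it is written. The sketch below is an assumption about how that might look, not what Grott does: the function name and the fixed utc_offset_hours parameter are hypothetical, and a fixed offset ignores daylight saving time (a real implementation would probably use zoneinfo for that).

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_time: str, utc_offset_hours: float) -> str:
    """Convert a naive local timestamp string into a UTC ISO 8601 string."""
    # Hypothetical fixed offset; it does not follow daylight saving changes.
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    naive = datetime.strptime(local_time, "%Y-%m-%dT%H:%M:%S")
    # Attach the local zone, then shift the value to UTC.
    aware = naive.replace(tzinfo=local_zone)
    return aware.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 11:33 local time at UTC+1 is 10:33 UTC.
print(local_to_utc("2021-01-05T11:33:00", 1))  # 2021-01-05T10:33:00Z
```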

johanmeijer commented 3 years ago

I am still thinking about a direct feed from Grott into Influx. The time issues mentioned above make me less enthusiastic about this, but maybe we can find a solution.

For people who are using Grott with MQTT, the Influx / Telegraf solution will be sufficient most of the time. For more information on Telegraf see:

https://www.influxdata.com/time-series-platform/telegraf/

a-dekker commented 3 years ago

Thanks for your research and resulting findings! Indeed time series databases normally expect their input data to be related to the actual time, so that can be an issue when data is delivered later. I would say go for your own interface if you can find the time for that, I won't mind testing it myself.

johanmeijer commented 3 years ago

Version 2.3.0 (now in beta test as a separate branch) supports InfluxDB.

Please test it!

@a-dekker , @harryverkooijen

a-dekker commented 3 years ago

I decided to continue using my existing Influx database/user/password and I got it working, but I had to remove some code, otherwise InfluxDB use got disabled.

My issue is that I can't do a "show databases". It's not an explicit permission error, but it does not work. I recently migrated from Influxdb 1 to Influxdb 2, no idea what version you are using and if it makes a difference.

This is how I can reproduce it:

11:33 $ python3
Python 3.9.1 (default, Dec  8 2020, 07:51:42)
[GCC 10.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from influxdb import InfluxDBClient
>>> client = InfluxDBClient(host='127.0.0.1', port=8086, username='mygrottuser', password='mygrottpwd')
>>> client.get_list_database()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb/client.py", line 704, in get_list_database
    return list(self.query("SHOW DATABASES").get_points())
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb/client.py", line 534, in query
    data = response.json()
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/requests/models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

johanmeijer commented 3 years ago

Thank you for testing.

I use show databases to see if the database exists and, if not, to create it (but I think you already saw that in the code).
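The check-then-create step can be sketched roughly as below. This is not the actual Grott code: ensure_database and StandInClient are hypothetical names, and the stand-in only mimics the get_list_database() and create_database() calls of the InfluxDB 1.x Python client so the example runs without a server.

```python
def ensure_database(client, name: str) -> bool:
    """Create the database if it is missing; return True when it was created."""
    # get_list_database() returns a list of dicts such as [{"name": "grottdb"}].
    existing = {entry["name"] for entry in client.get_list_database()}
    if name in existing:
        return False
    client.create_database(name)
    return True


class StandInClient:
    """Hypothetical in-memory replacement for influxdb.InfluxDBClient."""

    def __init__(self, names):
        self.names = list(names)

    def get_list_database(self):
        return [{"name": n} for n in self.names]

    def create_database(self, name):
        self.names.append(name)


client = StandInClient(["_internal"])
print(ensure_database(client, "grottdb"))  # created on the first call
print(ensure_database(client, "grottdb"))  # already present on the second
```

With the real library the same function would be called with an InfluxDBClient instance, and a user without sufficient rights would fail at the get_list_database() step.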

I cannot reproduce your problem at the moment. I do indeed use an old version of InfluxDB (running on a Raspberry Pi). I will try it with a newer version.

Another thought: can it be security related? My user has "god" rights within Influx. Maybe that is not the best idea. I will try that also.

a-dekker commented 3 years ago

Security rules are stricter in InfluxDB 2.0. I used to have "auth-enabled = false" so I could write anonymously via HTTP, but that's no longer allowed. And I also explicitly allow users for their specific databases only.

johanmeijer commented 3 years ago

Influxdb 2.0 needs another python library:

https://github.com/influxdata/influxdb-client-python

I have to see how I can support both Influx versions (or only the 1.8+ versions, but then I have to update my Influx database or, better, my whole RPi).

InfluxDB >1.8 can work with both python libraries I think.

johanmeijer commented 3 years ago

Hi @a-dekker, which release of InfluxDB are you running? I understand that version 2.0 is not finished yet and still under development.

I have installed version 2.0.3. I understand this is currently the latest release candidate (not an official release yet) and that its API should now be more compatible with the 1.x versions. I have not got it fully working yet, but I cannot reproduce your problem. The message I get seems more logical to me:

 influxdb.exceptions.InfluxDBClientError: 401: {"code":"unauthorized","message":"Unauthorized"}

Now I only have to find out how to get the right authorizations. I am not sure that user / password will work. I might need a token.

For those who want to use / test with InfluxDB: Grott is tested and working with InfluxDB 1.0.5 and 1.8.3 (the latest release).

a-dekker commented 3 years ago

InfluxDB 2.0 is no longer in alpha/beta state. It is "Generally Available" since the 10th of November 2020. I avoided the earlier 2.x alpha releases. At the moment I have version 2.0.2 up and running.

I do get your permission error message when I connect with other users btw. Tokens are more commonly used in 2.0 instead of passwords as far as I can tell. There should be a "general" token in the ~/.influxdbv2/configs file.

InfluxDB 2 will become more mainstream in due time, as will the new Flux query language. Grott is working fine here with InfluxDB 2.0, apart from the show/create database part, which I don't need since I already had a user/database present. You could delay InfluxDB 2.0 support for now, or skip the show/create database step for InfluxDB 2.0 as a workaround (including a warning perhaps).

johanmeijer commented 3 years ago

@a-dekker, I have implemented InfluxDB V2 support in Grott 2.3.1. You can find this version in the grott_beta branch.

New parameters in the [influx] section of the .ini file

influx = True
influx2 = True
token  = "influx_token"
org  = "your org"
bucket = "your bucket"

You have to enable both influx and influx2 (influx = True and influx2 = False will enable influxdb v1 processing)

A new influx python library needs to be installed:

 [sudo] pip3 install influxdb-client

For influxdb v2 the bucket (database) needs to be available (grott will not create one).
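As an illustration of how these flags combine, here is a small sketch that reads an [influx] section and reports which mode would be active. It is not Grott's own parser: influx_mode and v2_settings are hypothetical helpers, and treating influx = False as "off" regardless of influx2 is my reading of the rule above.

```python
import configparser

SAMPLE_INI = """
[influx]
influx = True
influx2 = True
token = "influx_token"
org = "your org"
bucket = "your bucket"
"""


def influx_mode(ini_text: str) -> str:
    """Return "off", "v1" or "v2" for the given .ini text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Assumption: without influx = True nothing is written at all.
    if not parser.getboolean("influx", "influx", fallback=False):
        return "off"
    return "v2" if parser.getboolean("influx", "influx2", fallback=False) else "v1"


def v2_settings(ini_text: str) -> dict:
    """Return token, org and bucket with surrounding quotes removed."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {key: parser.get("influx", key).strip('"')
            for key in ("token", "org", "bucket")}


print(influx_mode(SAMPLE_INI))
print(v2_settings(SAMPLE_INI))
```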

a-dekker commented 3 years ago

Will test it! But it seems you did not push branch 2.3.1 yet?

johanmeijer commented 3 years ago

Sorry for not being clear: it is branch grott_beta. (I wanted to get rid of the version number in the beta.)

a-dekker commented 3 years ago

I can't get it to work for now. At the urllib3 level it starts complaining that a hostname is missing. This is how I can reproduce it:

Python 3.9.1 (default, Dec  8 2020, 07:51:42) 
[GCC 10.2.0] on linux
>>> from influxdb_client import InfluxDBClient
>>> from influxdb_client.client.write_api import SYNCHRONOUS
>>> influxclient = InfluxDBClient(url="localhost:8086",org="", token="mytoken")         
>>> ifbucket_api = influxclient.buckets_api()     
>>> iforganization_api = influxclient.organizations_api()     
>>> ifwrite_api = influxclient.write_api(write_options=SYNCHRONOUS)     
>>> buckets = ifbucket_api.find_bucket_by_name("solar_metrics")   
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/client/bucket_api.py", line 83, in find_bucket_by_name
    buckets = self._buckets_service.get_buckets(name=bucket_name)
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/service/buckets_service.py", line 501, in get_buckets
    (data) = self.get_buckets_with_http_info(**kwargs)  # noqa: E501
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/service/buckets_service.py", line 586, in get_buckets_with_http_info
    return self.api_client.call_api(
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/api_client.py", line 340, in call_api
    return self.__call_api(resource_path, method,
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/api_client.py", line 170, in __call_api
    response_data = self.request(
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/api_client.py", line 362, in request
    return self.rest_client.GET(url,
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/rest.py", line 259, in GET
    return self.request("GET", url,
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/influxdb_client/rest.py", line 230, in request
    r = self.pool_manager.request(method, url,
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/urllib3/request.py", line 74, in request
    return self.request_encode_url(
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/urllib3/poolmanager.py", line 364, in urlopen
    conn = self.connection_from_host(u.host, port=u.port, scheme=u.scheme)
  File "/home/user/.local/share/virtualenvs/grott/lib/python3.9/site-packages/urllib3/poolmanager.py", line 236, in connection_from_host
    raise LocationValueError("No host specified.")
urllib3.exceptions.LocationValueError: No host specified.

Not really Grott related it seems.

Additional question: do I have to use a token, or can I still use username/password? The latter is still supported in InfluxDBv2.

Additional remark: I am not really happy with the fact that Grott disables InfluxDB on error and then continues. The small log message can easily be missed, and later on you notice your data is missing in Grafana. I prefer to get the actual error, and if I can't fix it I can always disable InfluxDB in the settings myself. Or add a parameter like disable_influxdb_on_error = True/False perhaps.

johanmeijer commented 3 years ago

OK. I cannot reproduce this error at the moment (with my test configuration). The problem seems to be that the Python InfluxDB library does not like localhost as the IP address. It will probably work if you specify the IP address of the server InfluxDB is running on.

I will see whether I can replace localhost with a valid IP address in the code (replacing it with 0.0.0.0 will probably work).
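A side note, offered as an assumption that is not verified in this thread: the url in the reproduction ("localhost:8086") has no scheme, and a URL parser may then read "localhost:" as the scheme and find no host, whereas "0.0.0.0:8086" cannot be mistaken for a scheme because a scheme cannot start with a digit. The sketch below uses only the Python standard library to show the missing host, and a hypothetical with_scheme helper that adds "http://" when no scheme is given.

```python
from urllib.parse import urlsplit


def with_scheme(url: str, default_scheme: str = "http") -> str:
    """Prefix the url with a scheme when it does not have one."""
    return url if "://" in url else f"{default_scheme}://{url}"


bare = "localhost:8086"
fixed = with_scheme(bare)

# Without a scheme the standard library parser finds no host.
print(urlsplit(bare).hostname)
# With the scheme added, host and port are recognised.
print(urlsplit(fixed).hostname, urlsplit(fixed).port)
```

If this explanation holds, passing url="http://localhost:8086" to the client would be an alternative to rewriting localhost into an IP address.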

As far as I can see from my tests, the new Python InfluxDB client needs the following to communicate with InfluxDB V2: IP address with port, token, bucket and organization.
With influx2 = true specified, Grott will no longer use user / password.

I think it is a valid remark that it is better to stop Grott on an InfluxDB error than to proceed. In my situation the most important interface is MQTT and InfluxDB is a nice-to-have. But I agree: if you specify influx = true, I should assume you really want Grott to run with InfluxDB processing enabled.

Thank you for testing and the remarks!

johanmeijer commented 3 years ago

Tested it, and specifying ip = 0.0.0.0 (instead of localhost) seems to solve the problem. I will put that in the code.

johanmeijer commented 3 years ago

@a-dekker I uploaded 2.3.1a to the grott_beta branch.

This will set the InfluxDB IP address to 0.0.0.0 if localhost is specified (= default). Grott processing will be stopped if InfluxDB initialization fails.

a-dekker commented 3 years ago

The localhost issue indeed is solved!

The next thing I stumbled upon was the fact that I did not define an organization when I converted my data from InfluxDB 1 to InfluxDB 2. Grott did not start but gave me an error (so that also works now :-)). Removing the check for an organization led to a missing org/orgid when inserting, so I added an organization name (actually renamed it from None to a real name). The insert now works, but find_organizations() does not show an organization. I do see the new organization name when I use my admin token. Not sure if this is a permission issue or due to the fact that the bucket was created earlier than the name of the organization. I can't find any option for creating additional associations between the two.

P.s. the token only has read/write permissions on the related bucket.

johanmeijer commented 3 years ago

I am going to test it. It must be an authorization / permission issue. Strange that you can write to / read from an organization but not find it.

I do indeed perform a find organization, because the organization is needed for the write. Maybe it is not necessary to check it at initialization time (with the risk that the write will give an error).

johanmeijer commented 3 years ago

@a-dekker you are right, it is a permission thing. Tokens with only read/write access are not allowed to find organizations.

I made some small changes and uploaded them to the grott_beta branch (version 2.3.1b).

During initialization the organization test is now informational only (it will not give an error). If an error occurs during the write to the database (e.g. because the organization is not found), Grott will give an error and stop processing.
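Roughly, an informational check of this kind could look like the sketch below. It is not the code from the beta branch: check_organization is a hypothetical helper that takes any callable returning objects with a name attribute (with the influxdb-client library that could be client.organizations_api().find_organizations), and the stand-ins only simulate an allowed and a denied token.

```python
import logging
from types import SimpleNamespace


def check_organization(find_organizations, org_name: str) -> bool:
    """Return True when org_name is visible; warn and return False otherwise."""
    try:
        names = [org.name for org in find_organizations()]
    except Exception as err:
        # A read/write-only token may not be allowed to list organizations,
        # so this is reported but does not stop processing.
        logging.warning("cannot check organization %r: %s", org_name, err)
        return False
    if org_name not in names:
        logging.warning("organization %r not found", org_name)
        return False
    return True


def allowed():
    # Stand-in for a token that may list organizations.
    return [SimpleNamespace(name="home")]


def denied():
    # Stand-in for a token without that permission.
    raise PermissionError("unauthorized")


print(check_organization(allowed, "home"))
print(check_organization(denied, "home"))
```

The actual write would stay outside such a helper, so a failing write can still raise and stop processing as described above.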

a-dekker commented 3 years ago

Can't test real-time data processing at this time of the day, but it starts without error and only shows a warning it can't check the organization (the message "not authorisation" should be "no authorization" btw). Looks like we are close to full InfluxDB 2.0 support!

johanmeijer commented 3 years ago

Let's see if it works tomorrow (I also have it active here).

Aahh not authorized or no authorization. I will change it with the next upload.

a-dekker commented 3 years ago

Just to be sure I also updated from InfluxDB v2.0.2 to the latest v2.0.3 (which was another challenge as they changed the packagename and some paths). But Grott seems to run okay and I get my data in InfluxDB as expected.

johanmeijer commented 3 years ago

I am glad it works :)

InfluxDB V2 is still under development. I think each new release will bring new features and challenges (just like Grott). For now it works!

Thanks again for testing and helping. If it runs okay for a couple of days I will promote the beta version to production.

johanmeijer commented 3 years ago

InfluxDB V1 and V1 support is now generic available (in 2.4.0 Master Branche). This issue can be closed.

a-dekker commented 3 years ago

I assume you mean V1 and V2? And 2.4.0 Master Branch is 2.4 and master?

johanmeijer commented 3 years ago

Yes V1 and V2 sorry. Version 2.4.0 is promoted to the master. The 2.4 and grott_beta (2.3.1b) branches will be decommissioned.

a-dekker commented 3 years ago

Thanks, then I will switch back to master again.

johanmeijer / grott

Store data locally / Influx DB support #29