orchestracities / ngsi-timeseries-api

QuantumLeap: a FIWARE Generic Enabler to support the usage of NGSIv2 (and NGSI-LD experimentally) data in time-series databases
https://quantumleap.rtfd.io/
MIT License
38 stars 49 forks source link

QL 0.8.3 can't insert values into existing table. #638

Open StWiemann opened 2 years ago

StWiemann commented 2 years ago

``I just tried to upgrade to QL 0.8.3 because of the new authentication-feature. Despite turning off caching and authentication I do get an error which is not present with QL 0.8.2 and the same config despite the version.

Orion:3.5.1 QL:0.8.3 CrateDB:4.6.7

Everything running in Kubernetes.

Config QL:

USE_GEOCODING | False
USE_FLASK | False
REDIS_PORT | 6379
REDIS_HOST | redis-master.redis.svc.cluster.local
QL_DEFAULT_DB | crate
LOGLEVEL | DEBUG
DEFAULT_CACHE_TTL | 120
CRATE_HOST | crate-external-service.crate.svc.cluster.local
CACHE_QUERIES | False

I have to admit that this is not much to go by, but maybe you guys recognize the fault. I didn't test it with a "fresh table", since I'd really would like to keep the existing ones. But if it helps you I could try to simulate some fresh data on monday.

I didn't really find anything helpful on these errors:

crate.client.exceptions.ProgrammingError: ColumnUnknownException[Column instanceid unknown] crate.client.exceptions.ProgrammingError: MapperParsingException[Mapping definition for [type] has unsupported parameters: [fields : {keyword={type=keyword, ignore_above=256}}]]

Very verbose log-message starting from message before the ERROR-message:

time=2022-02-25 05:08:44.907 | level=DEBUG | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=_make_request | comp=urllib3.connectionpool | msg=http://crate-external-service.crate.svc.cluster.local:4200 "POST /_sql?error_trace=true HTTP/1.1" 404 5422 | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
time=2022-02-25 05:08:44.908 | level=ERROR | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=_insert_entity_rows | comp=translators.crate | msg=Failed to insert entities because of below error; translator will still try saving original JSON in "mtsome-service"."etheatpump".__original_ngsi_entity__ | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
Traceback (most recent call last): 
  File "/src/ngsi-timeseries-api/src/translators/sql_translator.py", line 371, in _insert_entity_rows 
    res = self.cursor.executemany(stmt, batch) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/cursor.py", line 67, in executemany 
    self.execute(sql, bulk_parameters=seq_of_parameters) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/cursor.py", line 53, in execute 
    self._result = self.connection.client.sql(sql, parameters, 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 396, in sql 
    content = self._json_request('POST', self.path, data=data) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 523, in _json_request 
    _raise_for_status(response) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 206, in _raise_for_status 
    raise ProgrammingError(error.get('message', ''), 
crate.client.exceptions.ProgrammingError: ColumnUnknownException[Column instanceid unknown] 
time=2022-02-25 05:08:44.909 | level=DEBUG | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=sql | comp=crate.client.http | msg=Sending request to /_sql?error_trace=true with payload: {"stmt": "insert into \"mtsome-service\".\"etheatpump\" (entity_id, entity_type, time_index, __original_ngsi_entity__) values (?, ?, ?, ?)", "bulk_args": [["b827eb312eec", "heatpump", "2022-02-25T17:08:44.909", {"data": "{\"id\": \"b827eb312eec\", \"type\": \"heatpump\", \"heat_ex_evaporator_temp_out\": {\"type\": \"Number\", \"value\": \"0.5\", \"metadata\": {\"dateCreated\": {\"type\": \"DateTime\", \"value\": \"2021-08-05T10:45:38.295Z\"}, \"dateModified\": {\"type\": \"DateTime\", \"value\": \"2022-02-25T17:08:44.875Z\"}}}, \"time_index\": \"2022-02-25T17:08:44.875000+00:00\"}", "failedBatchID": "9f715940befe4ef9a2a3dda388132ff5", "error": "ProgrammingError('ColumnUnknownException[Column instanceid unknown]')"}]]} | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
time=2022-02-25 05:08:44.917 | level=DEBUG | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=_make_request | comp=urllib3.connectionpool | msg=http://crate-external-service.crate.svc.cluster.local:4200 "POST /_sql?error_trace=true HTTP/1.1" 400 4058 | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
time=2022-02-25 05:08:44.918 | level=DEBUG | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=dispose | comp=translators.crate | msg=Translation completed | time=31.979 msec | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
time=2022-02-25 05:08:44.918 | level=DEBUG | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=read | comp=utils.cfgreader | msg=Env variable WQ_OFFLOAD_WORK not set, using default value of: False | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
time=2022-02-25 05:08:44.918 | level=ERROR | corr=96d94b80-965d-11ec-b802-4e21219882ac; cbnotif=1 | from=10.42.0.67 | srv=some-service | subserv=/ | op=notify | comp=reporter.reporter | msg=Notification not processed or not updated: MapperParsingException[Mapping definition for [type] has unsupported parameters:  [fields : {keyword={type=keyword, ignore_above=256}}]] | payload=[{'id': 'b827eb312eec', 'type': 'heatpump', 'heat_ex_evaporator_temp_out': {'type': 'Number', 'value': '0.5', 'metadata': {'dateCreated': {'type': 'DateTime', 'value': '2021-08-05T10:45:38.295Z'}, 'dateModified': {'type': 'DateTime', 'value': '2022-02-25T17:08:44.875Z'}}}, 'time_index': '2022-02-25T17:08:44.875000+00:00'}] | thread=140476855417632  | process=9 
Traceback (most recent call last): 
  File "/src/ngsi-timeseries-api/src/translators/sql_translator.py", line 371, in _insert_entity_rows 
    res = self.cursor.executemany(stmt, batch) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/cursor.py", line 67, in executemany 
    self.execute(sql, bulk_parameters=seq_of_parameters) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/cursor.py", line 53, in execute 
    self._result = self.connection.client.sql(sql, parameters, 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 396, in sql 
    content = self._json_request('POST', self.path, data=data) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 523, in _json_request 
    _raise_for_status(response) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 206, in _raise_for_status 
    raise ProgrammingError(error.get('message', ''), 
crate.client.exceptions.ProgrammingError: ColumnUnknownException[Column instanceid unknown] 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
  File "/src/ngsi-timeseries-api/src/reporter/reporter.py", line 193, in notify 
    InsertAction(fiware_s(), fiware_sp(), fiware_correlator(), payload) \ 
  File "/src/ngsi-timeseries-api/src/wq/core/task.py", line 253, in enqueue 
    run_action(self) 
  File "/src/ngsi-timeseries-api/src/wq/core/task.py", line 305, in run_action 
    target.run() 
  File "/src/ngsi-timeseries-api/src/wq/ql/notify.py", line 87, in run 
    self._handle_exception(svc, e) 
  File "/src/ngsi-timeseries-api/src/wq/ql/notify.py", line 92, in _handle_exception 
    raise e 
  File "/src/ngsi-timeseries-api/src/wq/ql/notify.py", line 85, in run 
    trans.insert(data.payload, svc, svc_path) 
  File "/src/ngsi-timeseries-api/src/translators/sql_translator.py", line 224, in insert 
    res = self._insert_entities_of_type(et, 
  File "/src/ngsi-timeseries-api/src/translators/sql_translator.py", line 358, in _insert_entities_of_type 
    self._insert_entity_rows(table_name, col_names, entries, entities) 
  File "/src/ngsi-timeseries-api/src/translators/sql_translator.py", line 398, in _insert_entity_rows 
    self._insert_original_entities_in_failed_batch( 
  File "/src/ngsi-timeseries-api/src/translators/sql_translator.py", line 454, in _insert_original_entities_in_failed_batch 
    self.cursor.executemany(stmt, rows) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/cursor.py", line 67, in executemany 
    self.execute(sql, bulk_parameters=seq_of_parameters) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/cursor.py", line 53, in execute 
    self._result = self.connection.client.sql(sql, parameters, 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 396, in sql 
    content = self._json_request('POST', self.path, data=data) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 523, in _json_request 
    _raise_for_status(response) 
  File "/usr/local/lib/python3.8/site-packages/crate/client/http.py", line 206, in _raise_for_status 
    raise ProgrammingError(error.get('message', ''), 
crate.client.exceptions.ProgrammingError: MapperParsingException[Mapping definition for [type] has unsupported parameters:  [fields : {keyword={type=keyword, ignore_above=256}}]] 
c0c0n3 commented 2 years ago

@StWiemann thanks for reporting this! It looks like it could be a backward compat issue introduced by #565. Can you confirm your table (mtsome-service.etheatpump?) does not have a column called instanceid?

chicco785 commented 2 years ago

did you get the error while querying or doing inserts?

in principle doing an insert should automatically add the column. if that's not the case, that's the problem here, and worst of all, this is not captured by the backward compatibility tests :/

chicco785 commented 2 years ago

the issue is during inserts. most probably in the crate version you created the table, the default policy for columns is "strict", and this makes it bomb

can you try as temp fix to alter the table from cratedb? something like. ALTER TABLE mtsome-service.etheatpump SET ( column_policy = 'dynamic' );

StWiemann commented 2 years ago

@StWiemann thanks for reporting this! It looks like it could be a backward compat issue introduced by #565. Can you confirm your table (mtsome-service.etheatpump?) does not have a column called instanceid?

You are absolutely right. I was wondering what instanceid was refering to, but that is the problem. It is not there.

the issue is during inserts. most probably in the crate version you created the table, the default policy for columns is "strict", and this makes it bomb

can you try as temp fix to alter the table from cratedb? something like. ALTER TABLE mtsome-service.etheatpump SET ( column_policy = 'dynamic' );

Yes, while isnerts. Thank you! I will give your fix a try right now and let you know.

StWiemann commented 2 years ago

The ALTER TABLE command gives me a "ALTER OK, -1 rows affected (0.055 seconds)" response.

QL still throwing the same error. And when I try to insert some dummy column through the CLI I also get the above ERROR:

insert into "mtsome-service"."etheatpump" (some_thing) values('some-string');

`Error!

MapperParsingException[Mapping definition for [some_thing] has unsupported parameters: [fields : {keyword={type=keyword, ignore_above=256}}]]`

Querying the column_policy like this: SELECT table_name, column_policy FROM information_schema.tables WHERE table_name = 'etheatpump'; Shows that the column_policy was successfully changed to dynamic though.

EDIT: When I tried an insert with an integer on the other hand, it successfully did the insert: insert into "mtsome-service"."etheatpump" (some_thing) values(1);

Might this be a CrateDB-Bug?

chicco785 commented 2 years ago

probably the table column_policy can't be changed :/ easy quick fix, manually add the column:

alter table "mtsome-service"."etheatpump" add column instanceid text;

we will find a way to fix this issue, but meanwhile, i guess that's fastest patch possible

StWiemann commented 2 years ago

probably the table column_policy can't be changed :/ easy quick fix, manually add the column:

alter table "mtsome-service"."etheatpump" add column instanceid text;

we will find a way to fix this issue, but meanwhile, i guess that's fastest patch possible

This worked! Thank you! Still strange behaviour that the insert with Int worked, but doesn't with text 🤷

Siedlerchr commented 2 years ago

I had the same problem and the solution with alter table "mtsome-service"."etheatpump" add column instanceid text; worked for me as well. (Running with docker)

tstorek commented 2 years ago

Hi, is there any progress on this? I have a larger setup multi-tenant setup where I have lots of existing tables I am not very eager to do the fix manually. I initially thought that my newly introduced WQ would cause the error because it gives very little debug info about itself.

c0c0n3 commented 2 years ago

hi @tstorek, we were thinking of implementing some sort of data migration tool, but haven't got around to doing it yet. @chicco785 opened #639 so we're able to at least reproduce, but as it turns out automated data migration will take some time---probably something we'll tackle in 2023.

For the time being, the only solution I can offer you is to write yourself a SQL data migration script with one alter table statement for each table you have as suggested in this thread...so sorry for the inconvenience!!

Greenstreet123 commented 1 year ago

Hi, we saw this bug using Timescale DB as well. Are there any updates regarding solving this issue?

c0c0n3 commented 1 year ago

Hi @Greenstreet123 :-)

No unfortunately we haven't worked on this yet. The best solution I can offer at the moment is the one suggested above--i.e. manually run an alter table command to add the missing column.

Greenstreet123 commented 1 year ago

Hi, thanks for quick reply. Do you have a proposed solution already just not had time to implement?

c0c0n3 commented 1 year ago

@Greenstreet123

Do you have a proposed solution already just not had time to implement?

We sketched out a solution involving a data migration tool that could be used in the future too for similar scenarios. It could be similar to what we already do with Timescale:

where the init container lets you run SQL scripts on start-up. But yes, we've had no time to go past the idea phase and come up with an implementation plan.