Open Afrojam opened 5 years ago
At least two issues might exlpain this behavior:
You can try SHOW FIELD KEYS
and SHOW TAG KEYS
to investigate.
If there is no bug in your code, can you build a minimal working example starting from a fresh database exhibiting the behavior?
You are right, I have noticed that some "columns" appear on both as a FIELD and as a TAG, and this is what causes to have the duplicates. I'm going to drop the series and create them again, taking care of that.
Well, I'm not getting if I have a bug when transfering data into influxdb. I call write_points() from InfluxDBClient() using a list where each item is like: {'fields': {'occupancy_lengths_square_sum': x, 'occupancy_lengths_sum': x, 'std_occupancy_length_min': x, 'total_disconnected_hour': x, 'total_free_events': x, 'total_free_hour': x, 'total_full_events': x, 'total_full_hour': x, 'total_occupancy_events': x, 'total_occupancy_hour': x}, 'measurement': 'city', 'tags': {'Day': x, 'DayOfWeek': x, 'Hour': x, 'Month': x, 'Season': x, 'Year': x, 'latitude': x, 'longitude': x, 'occupancy_length_0_5_min': x, 'occupancy_length_10_30_min': x, 'occupancy_length_1_2_hour': x, 'occupancy_length_1_7_day': x, 'occupancy_length_4_8_hour': x, 'occupancy_length_5_10_min': x, 'occupancy_length_8_24_hour': x, 'sector_id': 'x', 'sector_name': 'x', 'sectortype_id': 'x', 'sectortype_name': 'x', 'total_occupancy_intervals': x}, 'time': datetime.datetime(xxxx, x, x, x, x)} 'Written 40080 points into influxdb'
So I don't get why I got then all the data in influx as fieldKey: From show field keys
name: city fieldKey fieldType
Day string DayOfWeek string Hour string Month string Season string Year string latitude string longitude string occupancy_length_0_5_min string occupancy_length_10_30_min string occupancy_length_1_2_hour string occupancy_length_1_7_day string occupancy_length_4_8_hour string occupancy_length_5_10_min string occupancy_length_8_24_hour string occupancy_lengths_square_sum float occupancy_lengths_sum float sector_id string sector_name string sectortype_id string sectortype_name string std_occupancy_length_min float total_disconnected_hour float total_free_events float total_free_hour float total_full_events float total_full_hour float total_occupancy_events float total_occupancy_hour float total_occupancy_intervals string
From show tag keys
Day DayOfWeek Hour Month Season Year latitude longitude occupancy_length_0_5_min occupancy_length_10_30_min occupancy_length_1_2_hour occupancy_length_1_7_day occupancy_length_4_8_hour occupancy_length_5_10_min occupancy_length_8_24_hour sector_id sector_name sectortype_id sectortype_name total_occupancy_intervals
Tags as fields might have been inserted with your DataFrameClient code. Try to reproduce from a fresh db with only one field and one tag.
@clslgrnc not using DataFrameClient here, using the InfluxDBClient and converting the points from the result set to a DataFrame. I have tried the following: Same data base, new measurement, 1 tag and 1 field. {'measurement': 'test', 'tags': {'Hour': 19}, 'time': datetime.datetime(2018, 8, 31, 19, 0), 'fields': {'total_occupancy_hour': 0}}
This worked perfect: name: test fieldKey fieldType
total_occupancy_hour integer
Hour
Also, I have tried in a new database the original insert (so only change is the database)
show field keys name: fp_kingdom_tower_historic fieldKey fieldType
occupancy_lengths_square_sum float occupancy_lengths_sum float std_occupancy_length_min float total_disconnected_hour float total_free_events float total_free_hour float total_full_events float total_full_hour float total_occupancy_events float total_occupancy_hour float show tag keys name: fp_kingdom_tower_historic tagKey
Day DayOfWeek Hour Month Season Year latitude longitude occupancy_length_0_5_min occupancy_length_10_30_min occupancy_length_1_2_hour occupancy_length_1_7_day occupancy_length_4_8_hour occupancy_length_5_10_min occupancy_length_8_24_hour sector_id sector_name sectortype_id sectortype_name total_occupancy_intervals
This worked PERFECT
Finally, using another measurement name like "city2", I have no problems with the tags and the fields...
This is strange, because in the original database I have other "cities" (measurements) and I don't have the strange behaviour that I have reported. Indeed, this behaviour only happens in the last city that I have created in the data base. The measurement is created when inserting the data. Any idea?
I have tried the following approach: 1) Drop the conflictive measurement DROP MEASUREMENT city 2) Insert all the data into a new measurement called placeholder. Here we can see that all is fine: name: placeholder fieldKey fieldType
occupancy_lengths_square_sum float occupancy_lengths_sum float std_occupancy_length_min float total_disconnected_hour float total_free_events float total_free_hour float total_full_events float total_full_hour float total_occupancy_events float total_occupancy_hour float
3) Select into city from placeholder group by But the result is not fine: name: city fieldKey fieldType
Day string DayOfWeek string Hour string Month string Season string Year string latitude string longitude string occupancy_length_0_5_min string occupancy_length_10_30_min string occupancy_length_1_2_hour string occupancy_length_1_7_day string occupancy_length_4_8_hour string occupancy_length_5_10_min string occupancy_length_8_24_hour string occupancy_lengths_square_sum float occupancy_lengths_sum float sector_id string sector_name string sectortype_id string sectortype_name string std_occupancy_length_min float total_disconnected_hour float total_free_events float total_free_hour float total_full_events float total_full_hour float total_occupancy_events float total_occupancy_hour float total_occupancy_intervals string
So, the problem is the measurement name and some type of cache...
Hello, because the error about Already tz-aware, use tz_convert to convert. when using DataFrameClient, I have changed my code to use InfluxDBClient, what I have found is that tags yet duplicated.
Code Sample, a copy-pastable example if possible
Problem description
[this should explain why the current behaviour is a problem and why the expected output is a better solution.]
The results from above query are: Index(['Day', 'DayOfWeek', 'DayOfWeek_1', 'Day_1', 'Hour', 'Hour_1', 'Month', 'Month_1', 'Season', 'Season_1', 'Year', 'Year_1', 'latitude', 'latitude_1', 'longitude', 'longitude_1', 'occupancy_length_0_5_min', 'occupancy_length_0_5_min_1', 'occupancy_length_10_30_min', 'occupancy_length_10_30_min_1', 'occupancy_length_1_2_hour', 'occupancy_length_1_2_hour_1', 'occupancy_length_1_7_day', 'occupancy_length_1_7_day_1', 'occupancy_length_4_8_hour'...
The columns got duplicated, and those columns are not in the influx measurement (the _1 ones). Also, it is strange because the columns with the suffix _1 have "good" values while the ones without _1 all values are None. Another thing to notice is that this happens when I'm using the whole measurement data, while when using only data from past 1 day or 2 days this does not happens.
Output
Expected Output
Index(['Day', 'DayOfWeek', 'Hour', 'Month', 'Season', 'Year', 'latitude', 'longitude', 'occupancy_length_0_5_min', 'occupancy_length_10_30_min', 'occupancy_length_1_2_hour', 'occupancy_length_1_7_day', 'occupancy_length_4_8_hour',
Output of influxdb.version``
'5.2.1'
Showing my tag keys in influx