Open walkingmug opened 8 months ago
Description: Appending a pandas DataFrame whose column is mapped as `dense_vector` to an existing Elasticsearch index with the same field type raises a `TypeError`.
Reproduction:
```shell
pip install elasticsearch eland pandas numpy
```

```python
from elasticsearch import Elasticsearch
import eland as ed
import pandas as pd
import numpy as np
```

```python
client = Elasticsearch(HOST, timeout=120)
```

```python
vector1 = np.random.rand(512)
vector2 = np.random.rand(512)
df_1 = pd.DataFrame({
    'vector_column': [vector1, vector2]
})
```

```python
vector3 = np.random.rand(512)
vector4 = np.random.rand(512)
df_2 = pd.DataFrame({
    'vector_column': [vector3, vector4]
})
```
5. ✅ Upload first dataframe:
```python
# upload df_1 to elasticsearch
ed.pandas_to_eland(
    pd_df=df_1,
    es_client=client,
    es_dest_index='test-upload',
    es_if_exists="append",
    es_refresh=True,
    es_type_overrides={
        "vector_column": {
            "type": "dense_vector",
            "dims": 512,
            "index": True,
            "similarity": "cosine"
        },
    },
    chunksize=100
)
```
6. ❌ Append second dataframe to first dataframe:
```python
# upload df_2 to elasticsearch
ed.pandas_to_eland(
    pd_df=df_2,
    es_client=client,
    es_dest_index='test-upload',
    es_if_exists="append",
    es_refresh=True,
    es_type_overrides={
        "vector_column": {
            "type": "dense_vector",
            "dims": 512,
            "index": True,
            "similarity": "cosine"
        },
    },
    chunksize=100
)
```
Error:
```
TypeError                                 Traceback (most recent call last)
in <cell line: 2>()
      1 # upload df_2 to elasticsearch
----> 2 ed.pandas_to_eland(
      3     pd_df=df_2,
      4     es_client=client,
      5     es_dest_index='test-upload',

1 frames
/usr/local/lib/python3.10/dist-packages/eland/field_mappings.py in verify_mapping_compatibility(ed_mapping, es_mapping, es_type_overrides)
    919         key_type = es_type_overrides.get(key, key_def["type"])
    920         es_key_type = es_props[key]["type"]
--> 921         if key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
    922             key_type, ()
    923         ):

TypeError: unhashable type: 'dict'
```
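From the traceback, the failure appears to come from the override itself: `es_type_overrides.get(key, ...)` returns the whole override dict, which is then used as a lookup key in `ES_COMPATIBLE_TYPES.get(...)`, and dicts are not hashable. The sketch below reproduces that comparison without an Elasticsearch cluster. The contents of `ES_COMPATIBLE_TYPES`, the `is_compatible` wrapper, and the `override_type_name` helper are all assumptions made for illustration; they are not eland's actual code or its fix.

```python
# Stand-in for eland's ES_COMPATIBLE_TYPES table. Its real contents are not
# shown in the traceback, so this single entry is purely illustrative.
ES_COMPATIBLE_TYPES = {"double": ("float", "half_float")}

# The override from the reproduction is a dict, not a type-name string.
es_type_overrides = {
    "vector_column": {
        "type": "dense_vector",
        "dims": 512,
        "index": True,
        "similarity": "cosine",
    }
}
es_key_type = "dense_vector"  # type reported by the existing index mapping

# Mirrors line 919 of the traceback; "object" is a placeholder default.
key_type = es_type_overrides.get("vector_column", "object")


def is_compatible(key_type, es_key_type):
    """Hypothetical wrapper around the condition on lines 921-923."""
    mismatch = key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
        key_type, ()
    )
    return not mismatch


# A dict never equals the string "dense_vector", so the second operand runs
# and the dict is used as a dictionary key, which raises TypeError.
try:
    is_compatible(key_type, es_key_type)
    error = None
except TypeError as exc:
    error = exc


def override_type_name(override):
    """Hypothetical helper: reduce a dict-style override to its type name."""
    if isinstance(override, dict):
        return override.get("type")
    return override


print(type(error).__name__)
print(is_compatible(override_type_name(key_type), es_key_type))
```

With the override reduced to its `"type"` string before the comparison, the check passes for matching `dense_vector` fields. Whether that is the right place to normalize inside eland is an open question for the maintainers.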