Closed frg01 closed 10 months ago
Hi @frg01. Your question is a bit too open for a specific answer, but I suggest that you take a look at the code for elasticsearch (https://github.com/erikbern/ann-benchmarks/blob/main/ann_benchmarks/algorithms/elasticsearch/module.py). You will have to write all the code to set up the connection in your own `module.py` file: set up the connection in `__init__`, build the index in `fit`, and search for vectors in `query`. It's usually easiest to write the code in local mode, without the Docker interface. Use `python run.py --local ...` instead of `python run.py ...`.
Hope that helps.
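To make the three-method layout concrete, here is a minimal sketch. The class name `MyVectorDB` is hypothetical, and `sqlite3` with a brute-force Euclidean scan is only a stand-in so the example runs without a server. In a real `module.py` the class would subclass ann-benchmarks' `BaseANN`, open a connection to your own database, and let the database perform the search.

```python
import json
import math
import sqlite3


class MyVectorDB:
    """Hypothetical wrapper; in ann-benchmarks this would subclass BaseANN."""

    def __init__(self, metric):
        # Open the connection once, when the algorithm object is created.
        self._metric = metric  # stored only; this sketch always uses Euclidean distance
        self._conn = sqlite3.connect(":memory:")  # stand-in for your own database
        self._conn.execute(
            "CREATE TABLE items (id INTEGER PRIMARY KEY, embedding TEXT)"
        )

    def fit(self, X):
        # X holds one vector per row (a numpy.ndarray in ann-benchmarks).
        rows = [
            (i, json.dumps(x.tolist() if hasattr(x, "tolist") else list(x)))
            for i, x in enumerate(X)
        ]
        self._conn.executemany(
            "INSERT INTO items (id, embedding) VALUES (?, ?)", rows
        )
        self._conn.commit()

    def query(self, v, n):
        # Return the ids of the n stored vectors closest to v.
        # A real vector database would run this search itself; the scan
        # below only keeps the sketch self-contained.
        q = v.tolist() if hasattr(v, "tolist") else list(v)
        scored = []
        for item_id, text in self._conn.execute("SELECT id, embedding FROM items"):
            scored.append((math.dist(q, json.loads(text)), item_id))
        scored.sort()
        return [item_id for _, item_id in scored[:n]]
```

Once the three methods work against your database, the same class can be exercised through `python run.py --local ...`.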
Thank you very much for your guidance.
I have another question. I deployed the ann-benchmarks project on an Ubuntu system. In the fit() function, I don't know how to handle the X parameter (a numpy.ndarray) that is passed in. When I connect to my database and insert rows one at a time with cur.execute(), the project runs successfully, but when I insert rows in batches with cur.executemany(), errors always occur. I'm really about to collapse. I really don't know how to use the numpy package.
import json
import psycopg2

# Version 1: insert one row at a time with cur.execute() (this works)
text_data = []
for i, x in enumerate(X):
    c = x.tolist()
    em = json.dumps(c)
    ems = "\'" + em + "\'"
    id = json.dumps(i)
    try:
        cur.execute(f"insert into items (id,embedding) values ({id},{ems})")
        conn.commit()
    except Exception as e:
        print("Insert failed: ", e)

# Version 2: insert in batches with cur.executemany() (this fails)
res = []
sql = "insert into items (id,embedding) values ( %s , %s )"
for i, x in enumerate(X):
    id = json.dumps(i)
    c = x.tolist()
    em = json.dumps(c)
    ems = "\'" + em + "\'"
    temp = (id,x)
    res.append(temp)
    if int(i) % 4 == 0 and int(i) >= 3:
        try:
            cur.executemany(sql,res)
        except Exception as e:
            print(e, "------")
        finally:
            conn.commit()
            res = []
cur.executemany(sql,res)
conn.commit()
I have tried many functions but don't know how to solve it, for example numpy.ndarray's tostring() and tolist(), and json.dumps(). How do I do this? Please help me.
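One likely cause of the executemany() failure is that the batch tuple `(id, x)` holds the raw numpy row `x` instead of the converted value, and psycopg2 has no built-in adapter for numpy.ndarray. The sketch below converts each row to plain Python types before batching. The names `INSERT_SQL`, `vector_rows`, and `insert_vectors` are hypothetical, and it assumes the embedding column accepts a text literal such as `[1.0, 2.0]` (as pgvector's vector type does). The manual quote characters are dropped because the driver quotes `%s` parameters itself.

```python
import json
from itertools import islice

INSERT_SQL = "INSERT INTO items (id, embedding) VALUES (%s, %s)"


def vector_rows(X):
    """Yield (id, embedding_text) tuples built only from plain Python types."""
    for i, x in enumerate(X):
        # numpy rows expose tolist(); plain sequences are copied with list().
        values = x.tolist() if hasattr(x, "tolist") else list(x)
        # No manual quoting here: the driver quotes %s parameters itself.
        yield int(i), json.dumps(values)


def insert_vectors(cur, X, batch_size=1000):
    """Insert all rows of X through cur.executemany() in fixed-size batches.

    cur is any DB-API cursor using the %s parameter style (psycopg2 does).
    Returns the number of rows sent.
    """
    rows = vector_rows(X)
    total = 0
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            break  # nothing left, so no empty executemany() call is made
        cur.executemany(INSERT_SQL, batch)
        total += len(batch)
    return total
```

Inside fit(), this could be called as `insert_vectors(cur, X)` followed by a single `conn.commit()`.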
I have a question, and I hope someone can answer it. I want to deploy my database locally on Linux and use ann-benchmarks to test my self-developed vector database. But I don't know how to connect ann-benchmarks to my relational vector database. Where should I configure it in the Python code? Maybe I should write some code to connect, right? I'm new to databases, so my question may seem a bit silly, but I'm hoping someone can shed some light on my confusion. Thank you!