Closed wldbest closed 1 month ago
Hi, you may check the issue here: https://discord.com/channels/933071162680958986/1270632504180867144
And here is the query in Turso.
CREATE TABLE movies (title TEXT, year INT, embedding F32_BLOB (7));
INSERT INTO movies (title, year, embedding) VALUES ('Napoleon', 2023, vector ('[-0.007120034, -0.01406258, -0.007229258, -0.006000489, -0.02479383, -0.021790171, 0.035770833]')), ('Black Hawk Down', 2001, vector ('[-0.028583912, -0.040733904, -0.010124992, 0.007880679, -0.020103775, -0.034768563, 0.014942587]')), ('Gladiator', 2000, vector ('[0.017125275, 0.008342252, -0.0076681306, -0.019173568, 0.034639467, -0.022855308, 0.028261242]')), ('Blade Runner', 1982, vector ('[-0.007572184, 0.006349286, -0.003955667, 0.0017495593, 0.014570421, 0.0131681645, -0.0057883835]'));
-- Query with Gladiator vector
SELECT title, year, vector_extract(embedding), vector_distance_cos(embedding, '[0.017125275, 0.008342252, -0.0076681306, -0.019173568, 0.034639467, -0.022855308, 0.028261242]') AS cos_distance FROM movies ORDER BY cos_distance ASC;
Hi @wldbest, I believe Dataflare displays the values correctly and keeps them intact; this has also been verified in other clients that support libSQL.
[!NOTE] Dataflare currently loses some precision when converting float64 values; this will be fixed in the next version.
Do you want to display 1.753244305291446e-8 as 0.00000001753244305291446?
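For reference, the two renderings in the question above denote the same number. A minimal Python sketch of the conversion, assuming the value arrives as a float64; the helper name `to_fixed_notation` is hypothetical and not part of Dataflare:

```python
from decimal import Decimal


def to_fixed_notation(value: float) -> str:
    """Expand a float into plain decimal digits without an exponent."""
    # repr() yields the shortest string that round-trips the float64,
    # so no digits are invented beyond what the value carries.
    return format(Decimal(repr(value)), "f")


print(repr(1.753244305291446e-8))               # scientific notation
print(to_fixed_notation(1.753244305291446e-8))  # same value, fixed notation
```

Both strings parse back to the same float, so the choice between them is purely one of display.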
Oh, I see. I think I just forgot to check the entire value.
The problem was probably that the window cropped the value. It would help if values displayed in scientific notation had background highlighting.
Thanks for your quick reply.
Describe the bug
When performing operations with vectors, sometimes very small values are represented in exponential notation. Currently, Dataflare does not recognize these values and outputs them as large values.
For the vector cos_distance, the distance of a vector to itself is either zero or a very small value.
Gladiator|2000|[0.0171253,0.00834225,-0.00766813,-0.0191736,0.0346395,-0.0228553,0.0282612]|1.75324430529145e-08
Napoleon|2023|[-0.00712003,-0.0140626,-0.00722926,-0.00600049,-0.0247938,-0.0217902,0.0357708]|0.805753350257874
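As a rough illustration of why the self-distance comes out as "zero or very small", here is a plain-Python sketch of cosine distance applied to the Gladiator vector. It runs in double precision and is not libSQL's implementation, so the exact residue will differ from the 1.75324430529145e-08 shown above; the `1e-6` tolerance is an arbitrary assumption.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


gladiator = [0.017125275, 0.008342252, -0.0076681306, -0.019173568,
             0.034639467, -0.022855308, 0.028261242]

self_distance = cosine_distance(gladiator, gladiator)

# Rounding can leave a tiny non-zero residue, so compare with a tolerance
# instead of testing for exact equality with 0.
is_effectively_zero = math.isclose(self_distance, 0.0, abs_tol=1e-6)
print(self_distance, is_effectively_zero)
```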
When testing on Turso I get 1.75324430529145e-08, which Dataflare interprets and displays as 1.75.
This is why I can't trust the results, so please fix it. It would be nice if it handles scientific notation properly or just outputs the result string as it is. (libsql/Turso)
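To make the difference concrete, a short Python sketch: parsing the full string keeps the exponent, while a view that shows only its first few characters reads as 1.75. The four-character width is an assumption for illustration, not a measurement of Dataflare's UI.

```python
raw = "1.75324430529145e-08"  # the cos_distance string reported above

parsed = float(raw)   # standard parsing honours the exponent
visible = raw[:4]     # hypothetical view that shows only 4 characters

print(parsed)               # a value on the order of 1e-8
print(visible)              # 1.75
print("e" in raw.lower())   # True: a cue that the text is in scientific notation
```

Checking for the exponent marker is one simple way a client could decide when to highlight or expand such a value.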
Platform and Database