Open johnqiuwan opened 1 month ago
@johnqiuwan Thanks for reaching out! Could you provide more details on the problem and a reproducible code example?
Thank you for the quick reply!
Sample code to process the semantic search
<?php

namespace App\Services;

use RedisVentures\RedisVl\Vectorizer\Factory;
use RedisVentures\RedisVl\VectorHelper;
use RedisVentures\RedisVl\Query\VectorQuery;
use RedisVentures\RedisVl\Index\SearchIndex;
use Predis\Client;

class VectorQueryService
{
    protected $factory;
    protected $vectorProvider;
    protected $vectorHelper;
    protected $index;

    public function __construct()
    {
        $this->factory = new Factory();
        $this->vectorProvider = $this->factory->createVectorizer('openai', env('TEXT_EMBEDDING_MODEL'));
        $this->vectorHelper = new VectorHelper();
        $this->index = new SearchIndex(new Client(), $this->schema());
        $this->index->create();
    }

    private function schema()
    {
        $schema = [
            'index' => [
                'name' => 'idx:product',
                'prefix' => 'laravel_hemes_database_product_by_id:',
                'storage_type' => 'json',
            ],
            'fields' => [
                'id' => [
                    // 'path' => '$.id',
                    'type' => 'numeric',
                ],
                'description' => [
                    // 'path' => '$.description',
                    'type' => 'text',
                ],
                'vector' => [
                    // 'path' => '$.description_embeddings',
                    'type' => 'vector',
                    'dims' => 1536,
                    'datatype' => 'float32',
                    'algorithm' => 'flat',
                    'distance_metric' => 'cosine'
                ],
                'image' => [
                    'type' => 'tag'
                ],
                'slug' => [
                    // 'path' => '$.slug',
                    'type' => 'tag',
                ],
                'product_name_text' => [
                    // 'path' => '$.product_name',
                    'type' => 'text',
                ],
                'price' => [
                    // 'path' => '$.price',
                    'type' => 'numeric',
                    // 'sortable' => true,
                ],
                'current_price' => [
                    // 'path' => '$.price',
                    'type' => 'numeric',
                    // 'sortable' => true,
                ],
                'created' => [
                    // 'path' => '$.created_at',
                    'type' => 'numeric',
                    // 'sortable' => true,
                ],
                'variant_options' => [
                    // 'path' => '$.variant_options',
                    'type' => 'tag',
                ],
                'model' => [
                    // 'path' => '$.product_specifications.model',
                    'type' => 'text',
                ],
                'category' => [
                    // 'path' => '$.product_specifications.category',
                    'type' => 'tag',
                ],
                'manufactory' => [
                    // 'path' => '$.product_specifications.manufactory[*]',
                    'type' => 'tag',
                ],
            ],
        ];

        return $schema;
    }

    public function embed($text)
    {
        $embedding = $this->vectorProvider->embed($text);
        $embedding = $embedding['data'][0]['embedding'];
        if (!is_array($embedding)) {
            $embedding = [$embedding];
        }

        return $embedding;
    }

    public function query($embedding)
    {
        // $embedding = [VectorHelper::toBytes($embedding)];
        $query = new VectorQuery(
            $embedding,
            'vector',
            ['id', 'description', 'product_name_text', 'variant_options', 'model', 'category', 'manufactory', 'price', 'current_price', 'slug', 'image'],
            10,
            true,
            3
        );

        return $this->index->query($query);
    }

    public function processResult($result)
    {
        return collect($result)->map(function ($product, $key) {
            return collect($product)->transform(function ($value) {
                return json_decode($value, true);
            });
        })->values()->toArray();
    }

    public function resultDto($result)
    {
        return collect($result)->map(function ($product, $key) {
            $product['id'] = $product['id'][0];
            $product['description'] = $product['description'][0];
            $product['product_name'] = $product['product_name_text'][0];
            $product['slug'] = $product['slug'][0];

            return $product;
        })->toArray();
    }
}
Context:
Already checked:
Problem: All returned items have a score of 0.
Expected behavior: The scores should not all be 0.
Versions:
Additional context: If the vector value is updated in RedisJSON, the search results update accordingly. The search itself seems to work; it is only the scores that are all 0.
Are there any updates on this, @vladvildanov? Thank you.
@johnqiuwan By default, Redis calculates scores based on term frequency and its occurrences in the document. Could you try the other scorers that are available by default in Redis? It looks like something on the server side.
https://redis.io/docs/latest/develop/interact/search-and-query/advanced-concepts/scoring/
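One way to check whether this is server-side is to bypass the library and send a raw `FT.SEARCH` with a KNN clause. In a KNN query the vector distance is returned as its own field (named `__<field>_score` unless aliased with `AS`), separately from the text relevance score. The sketch below only builds the argument list; `packFloat32`, `buildKnnSearchArgs` and the `vector_score` alias are hypothetical names chosen for illustration, while the index and field names are taken from the schema above.

```php
<?php

// Pack a list of floats into the little-endian float32 blob that
// Redis expects for a vector query parameter.
function packFloat32(array $vector): string
{
    return pack('g*', ...$vector);
}

// Build the raw FT.SEARCH arguments for a KNN query.
// The score alias is an assumption made for this sketch.
function buildKnnSearchArgs(
    string $index,
    string $field,
    array $vector,
    int $k,
    string $scoreAlias = 'vector_score'
): array {
    return [
        'FT.SEARCH', $index,
        "*=>[KNN {$k} @{$field} \$vec AS {$scoreAlias}]",
        'PARAMS', '2', 'vec', packFloat32($vector),
        'SORTBY', $scoreAlias,
        'RETURN', '2', 'id', $scoreAlias,
        'DIALECT', '2',
    ];
}

$args = buildKnnSearchArgs('idx:product', 'vector', [0.1, 0.2, 0.3], 10);

// With a connected Predis client this could be sent as:
// $reply = (new Predis\Client())->executeRaw($args);
```

If the aliased distance field in the raw reply is non-zero while the library still reports 0, the problem is more likely in how the reply is parsed than in Redis itself.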
Thank you for the update! I have read the doc you linked, but I am still not sure why all returned items have a score of 0. It looks like a bug to me, yet the returned items do change when the embedding is updated. This is kind of strange to me, lol
I appreciate your time and your amazing work.
Thank you so much! Let me know if you find something or feel free to contribute 👌
I followed the doc to set up the real-time search query. The setup went smoothly, and the query runs without errors.
However, I noticed that every item returned by the semantic search query has a score of 0.
Is that normal?
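A quick way to tell a genuine zero from a missing value is to compute the expected distance locally and compare it with what the query returns. The schema above uses the `cosine` metric, and cosine distance is 1 minus cosine similarity, so only vectors pointing in the same direction should score 0. This is a minimal sketch; `cosineDistance` is a hypothetical helper, not part of the library.

```php
<?php

// Cosine distance = 1 - cosine similarity.
// 0 means same direction, 1 means orthogonal, 2 means opposite.
function cosineDistance(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);

    if (count($a) === 0 || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $x) {
        $y = $b[$i];
        $dot += $x * $y;
        $normA += $x * $x;
        $normB += $y * $y;
    }

    if ($normA == 0.0 || $normB == 0.0) {
        throw new InvalidArgumentException('Cosine distance is undefined for a zero vector.');
    }

    return 1.0 - $dot / (sqrt($normA) * sqrt($normB));
}
```

Comparing the query embedding against one stored embedding this way gives a reference value: if it is clearly above 0 while the search reports 0 for the same document, the score is being lost somewhere between the query and the result.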