With HNSWVectorStorage, the default example works with global search, but local search, which is where the vector DB is actually queried, had not been tested. Local search did not work with HNSWVectorStorage because meta_fields were not stored correctly, so query results came back without entity_name.
Updates
[x] Fixed the upsert() function to store meta_fields as part of the metadata, which is now tracked per label and returned together with the labels and distances from query().
[x] Small performance improvements by using generators in upsert() and query().
[x] Minor updates to unit tests.
[x] Updated example to use deepseek-chat model.
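The metadata fix can be illustrated with a minimal sketch. This is not the code from this PR: `TinyVectorStore`, `toy_embed`, and the brute-force cosine search are hypothetical stand-ins for `HNSWVectorStorage`, the real embedding function, and the HNSW index. The only point it shows is that upsert() keeps the configured meta_fields (here `entity_name`) keyed by the integer label, so query() can join them back onto the labels and distances.

```python
import math


def _cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def toy_embed(text):
    """Hypothetical embedding: vowel counts plus a bias term (never a zero vector)."""
    lowered = text.lower()
    return [float(lowered.count(c)) for c in "aeiou"] + [1.0]


class TinyVectorStore:
    """Illustrative stand-in only; brute-force search replaces the HNSW index."""

    def __init__(self, embed, meta_fields):
        self._embed = embed
        self._meta_fields = set(meta_fields)
        self._label_of = {}   # record id -> integer label
        self._vectors = {}    # integer label -> vector
        self._metadata = {}   # integer label -> {"id": ..., <meta fields>}

    def upsert(self, data):
        for key, record in data.items():
            # Reuse the label on update, allocate a new one on insert.
            label = self._label_of.setdefault(key, len(self._label_of))
            self._vectors[label] = self._embed(record["content"])
            # The fix: keep meta_fields next to the id, keyed by label.
            self._metadata[label] = {
                "id": key,
                **{k: v for k, v in record.items() if k in self._meta_fields},
            }

    def query(self, text, top_k=5):
        q = self._embed(text)
        # A generator feeds sorted(), so no intermediate list is built.
        scored = sorted(
            (_cosine_distance(q, vec), label)
            for label, vec in self._vectors.items()
        )[:top_k]
        # Join stored metadata back onto each (distance, label) pair.
        return [{**self._metadata[label], "distance": dist} for dist, label in scored]


store = TinyVectorStore(toy_embed, meta_fields={"entity_name"})
store.upsert({
    "ent-1": {"content": "scrooge", "entity_name": "SCROOGE"},
    "ent-2": {"content": "tiny tim", "entity_name": "TINY TIM"},
})
results = store.query("scrooge", top_k=2)
for row in results:
    print(row["entity_name"], round(row["distance"], 3))
```

Without the metadata join, each result would carry only a label and a distance, which is why local search could not resolve entity_name.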
Benchmark HNSWVectorStorage vs NanoVectorDBStorage:
n=100_000, d=1024
> python benchmarks/hnsw_vs_nano_vector_storage.py
Running NanoVectorDB benchmark...
INFO:nano-graphrag:Creating working directory ./nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 0 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-vectordb:Init {'embedding_dim': 1024, 'metric': 'cosine', 'storage_file': './nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage/vdb_entities.json'} 0 data
INFO:nano-vectordb:Init {'embedding_dim': 1024, 'metric': 'cosine', 'storage_file': './nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage/vdb_benchmark_nano.json'} 0 data
Benchmarking nano...
nano Benchmark: 0%| | 0/100000 [00:00<?, ?it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 100101it [00:04, 22875.64it/s]
nano - Insert: 1.41s, Save: 1.71s, Avg Query: 0.0125s
Running HNSWVectorStorage benchmark...
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 0 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-vectordb:Init {'embedding_dim': 1024, 'metric': 'cosine', 'storage_file': './nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage/vdb_entities.json'} 0 data
INFO:nano-graphrag:Created new index for benchmark_hnsw
Benchmarking hnsw...
hnsw Benchmark: 0%| | 0/100000 [00:00<?, ?it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 100101it [01:23, 1197.02it/s]
hnsw - Insert: 82.94s, Save: 0.50s, Avg Query: 0.0018s
Benchmark Results:
NanoVectorDB - Insert: 1.41s, Save: 1.71s, Avg Query: 0.0125s
HNSWVectorStorage - Insert: 82.94s, Save: 0.50s, Avg Query: 0.0018s
n=1_000_000, d=1024
> python benchmarks/hnsw_vs_nano_vector_storage.py
Running NanoVectorDB benchmark...
INFO:nano-graphrag:Creating working directory ./nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 0 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-vectordb:Init {'embedding_dim': 1024, 'metric': 'cosine', 'storage_file': './nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage/vdb_entities.json'} 0 data
INFO:nano-vectordb:Init {'embedding_dim': 1024, 'metric': 'cosine', 'storage_file': './nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage/vdb_benchmark_nano.json'} 0 data
Benchmarking nano...
nano Benchmark: 0%| | 0/1000000 [00:00<?, ?it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 10%|█████████▍ | 100000/1000000 [00:01<00:13, 65004.60it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 20%|██████████████████▊ | 200000/1000000 [00:02<00:11, 67094.28it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 30%|████████████████████████████▏ | 300000/1000000 [00:04<00:10, 66806.48it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 40%|█████████████████████████████████████▌ | 400000/1000000 [00:06<00:09, 64556.41it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 50%|███████████████████████████████████████████████ | 500000/1000000 [00:07<00:08, 61842.55it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 60%|████████████████████████████████████████████████████████▍ | 600000/1000000 [00:09<00:06, 59906.38it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 70%|█████████████████████████████████████████████████████████████████▊ | 700000/1000000 [00:11<00:05, 58056.89it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 80%|███████████████████████████████████████████████████████████████████████████▏ | 800000/1000000 [00:13<00:03, 55523.96it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 90%|████████████████████████████████████████████████████████████████████████████████████▌ | 900000/1000000 [00:15<00:01, 52749.15it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_nano
nano Benchmark: 1000101it [01:04, 15412.94it/s]
nano - Insert: 18.19s, Save: 33.64s, Avg Query: 0.1306s
Running HNSWVectorStorage benchmark...
INFO:nano-graphrag:Load KV full_docs with 0 data
INFO:nano-graphrag:Load KV text_chunks with 0 data
INFO:nano-graphrag:Load KV llm_response_cache with 0 data
INFO:nano-graphrag:Load KV community_reports with 0 data
INFO:nano-vectordb:Init {'embedding_dim': 1024, 'metric': 'cosine', 'storage_file': './nano_graphrag_cache_benchmark_hnsw_vs_nano_vector_storage/vdb_entities.json'} 0 data
INFO:nano-graphrag:Created new index for benchmark_hnsw
Benchmarking hnsw...
hnsw Benchmark: 0%| | 0/1000000 [00:00<?, ?it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 10%|█████████▌ | 100000/1000000 [01:23<12:33, 1194.52it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 20%|███████████████████ | 200000/1000000 [02:55<11:45, 1133.51it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 30%|████████████████████████████▌ | 300000/1000000 [04:28<10:33, 1104.56it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 40%|██████████████████████████████████████ | 400000/1000000 [06:01<09:08, 1094.29it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 50%|███████████████████████████████████████████████▌ | 500000/1000000 [07:34<07:40, 1086.34it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 60%|█████████████████████████████████████████████████████████ | 600000/1000000 [09:09<06:12, 1074.65it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 70%|██████████████████████████████████████████████████████████████████▌ | 700000/1000000 [10:46<04:43, 1058.71it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 80%|████████████████████████████████████████████████████████████████████████████ | 800000/1000000 [12:24<03:11, 1045.96it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 90%|█████████████████████████████████████████████████████████████████████████████████████▌ | 900000/1000000 [14:08<01:38, 1017.91it/s]INFO:nano-graphrag:Inserting 100000 vectors to benchmark_hnsw
hnsw Benchmark: 1000101it [15:54, 1047.85it/s]
hnsw - Insert: 950.02s, Save: 4.21s, Avg Query: 0.0020s
Benchmark Results:
NanoVectorDB - Insert: 18.19s, Save: 33.64s, Avg Query: 0.1306s
HNSWVectorStorage - Insert: 950.02s, Save: 4.21s, Avg Query: 0.0020s
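To make the trade-off in these numbers easier to read, the following sketch computes the ratios between the two backends from the timings reported above. The dictionary layout and the `ratios` helper are illustrative assumptions; only the timing values come from the benchmark output.

```python
# Timings in seconds, copied from the "Benchmark Results" lines above.
timings = {
    100_000: {
        "nano": {"insert": 1.41, "save": 1.71, "query": 0.0125},
        "hnsw": {"insert": 82.94, "save": 0.50, "query": 0.0018},
    },
    1_000_000: {
        "nano": {"insert": 18.19, "save": 33.64, "query": 0.1306},
        "hnsw": {"insert": 950.02, "save": 4.21, "query": 0.0020},
    },
}


def ratios(run):
    """Return how much slower HNSW inserts are and how much faster it saves and queries."""
    return {
        "insert_slowdown": run["hnsw"]["insert"] / run["nano"]["insert"],
        "save_speedup": run["nano"]["save"] / run["hnsw"]["save"],
        "query_speedup": run["nano"]["query"] / run["hnsw"]["query"],
    }


summary = {n: ratios(run) for n, run in timings.items()}
for n, r in summary.items():
    print(
        f"n={n:_}: insert {r['insert_slowdown']:.1f}x slower, "
        f"save {r['save_speedup']:.1f}x faster, "
        f"query {r['query_speedup']:.1f}x faster"
    )
```

In these two runs, HNSWVectorStorage inserts are roughly 50 to 60 times slower, while the average query goes from about 7 times faster at n=100_000 to about 65 times faster at n=1_000_000.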
DEBUG:nano-graphrag:GraphRAG init with param:
working_dir = ./nano_graphrag_cache_using_hnsw_as_vectorDB,
enable_local = True,
chunk_token_size = 1200,
chunk_overlap_token_size = 100,
tiktoken_model_name = gpt-4o,
entity_extract_max_gleaning = 1,
entity_summary_to_max_tokens = 500,
graph_cluster_algorithm = leiden,
max_graph_cluster_size = 10,
graph_cluster_seed = 3735928559,
node_embedding_algorithm = node2vec,
node2vec_params = {'dimensions': 1536, 'num_walks': 10, 'walk_length': 40, 'window_size': 2, 'iterations': 3, 'random_seed': 3},
special_community_report_llm_kwargs = {'response_format': {'type': 'json_object'}},
embedding_func = {'embedding_dim': 1536, 'max_token_size': 8192, 'func': <function openai_embedding at 0x114465e10>},
embedding_batch_num = 32,
embedding_func_max_async = 16,
best_model_func = <function gpt_4o_mini_complete at 0x114465a20>,
best_model_max_token_size = 8196,
best_model_max_async = 4,
cheap_model_func = <function gpt_4o_mini_complete at 0x114465a20>,
cheap_model_max_token_size = 8196,
cheap_model_max_async = 4,
key_string_value_json_storage_cls = <class 'nano_graphrag._storage.JsonKVStorage'>,
vector_db_storage_cls = <class 'nano_graphrag._storage.HNSWVectorStorage'>,
vector_db_storage_cls_kwargs = {'max_elements': 1000000, 'ef_search': 200, 'M': 50},
graph_storage_cls = <class 'nano_graphrag._storage.NetworkXStorage'>,
enable_llm_cache = True,
addon_params = {},
convert_response_to_json_func = <function convert_response_to_json at 0x113e89fc0>
INFO:nano-graphrag:Load KV full_docs with 1 data
INFO:nano-graphrag:Load KV text_chunks with 42 data
INFO:nano-graphrag:Load KV llm_response_cache with 127 data
INFO:nano-graphrag:Load KV community_reports with 42 data
INFO:nano-graphrag:Loaded graph from ./nano_graphrag_cache_using_hnsw_as_vectorDB/graph_chunk_entity_relation.graphml with 376 nodes, 365 edges
INFO:nano-graphrag:Loaded existing index for entities with 375 elements
INFO:nano-graphrag:Revtrieved 42 communities
Global Search
INFO:nano-graphrag:Grouping to 2 groups for global search
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
# Key Themes in the Story
The narrative encapsulates several profound themes, primarily revolving around transformation, redemption, and social consciousness. The evolution of Ebenezer Scrooge serves as a core element that brings these themes to light. Below is an analysis of the foremost themes highlighted in the reports from multiple analysts.
## Transformation and Redemption
The most significant theme of the story is that of transformation, prominently represented through Scrooge's journey from a miserly figure to one imbued with generosity, kindness, and the true spirit of Christmas. This transformation may be perceived as a powerful depiction of personal growth and moral awakening. Scrooge’s reflective journey allows him to grasp the consequences of his past behaviors and inspires a commitment to change, demonstrating that it is never too late to seek redemption and amend one’s ways.
## Human Connection and Social Responsibility
Another crucial theme is the importance of human connection and community. Scrooge's relationships with his family, employees, and the vulnerable—particularly the Cratchit family—underscore the necessity of empathy and support. Initially indifferent towards the plight of the poor, Scrooge's character shift emphasizes the significance of social responsibility and charity, especially during the festive season. His interactions with those in need illustrate his journey toward recognizing the value of community support.
## Family and Togetherness
The narrative further explores the theme of family and unity. The dynamics within the Cratchit family, particularly through characters like Bob and Tiny Tim, embody the essence of togetherness and love, contrasting sharply with Scrooge's solitary existence. This theme reinforces the idea that family interactions and shared celebrations are integral to the joyful spirit of Christmas.
## Wealth, Poverty, and Social Commentary
The story also dives into the contrasting realities of wealth and poverty. Scrooge’s initial indifference to the struggles of the less fortunate alerts the reader to prevalent social inequalities and disconnections during the festive season. The disparities depicted, especially through the lives of the Cratchit family compared to wealthier characters, serve as a social commentary that challenges readers to reflect on their own societal responsibilities.
## The Christmas Spirit
Lastly, the theme of festive spirit and joy permeates the narrative, illustrated through diverse characters' involvement in holiday activities. This element not only amplifies the celebratory atmosphere of the story but also serves as a backdrop against which personal transformations occur, emphasizing the potential impact of collective joy on individual change.
## Conclusion
Overall, the themes of transformation, redemption, human connection, social responsibility, family, and the festive spirit coalesce to create a rich tapestry that prompts reflection on personal morals and societal obligations. This story serves as a reminder that genuine change is attainable, and through empathy and community support, individuals may foster a nurturing environment for themselves and others.
Local Search
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:nano-graphrag:Using 20 entites, 4 communities, 138 relations, 3 text units
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 20.000000 seconds
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
# Top Themes in "A Christmas Carol"
Charles Dickens' "A Christmas Carol" is a timeless tale that explores several profound themes, intertwining morality with the spirit of Christmas. Below, we delve into some of the story's central themes, illustrating how they shape the characters and the moral message of the narrative.
## Transformation and Redemption
One of the most prominent themes in "A Christmas Carol" is transformation, specifically that of Ebenezer Scrooge. At the beginning of the story, Scrooge is portrayed as a miserly, cold-hearted individual who despises Christmas and shows little compassion for others, particularly the poor. However, through a series of supernatural encounters with the Ghosts of Christmas Past, Present, and Yet to Come, Scrooge undergoes a profound transformation. These experiences prompt him to reflect on his life, allowing him to recognize the importance of kindness, generosity, and personal connection.
This theme of redemption is significant as it emphasizes the possibility of change at any point in life, regardless of one’s past actions. Scrooge’s eventual embrace of generosity and compassion, especially towards his employee Bob Cratchit and the sickly Tiny Tim, illustrates the story's underlying message that it is never too late to change one’s ways.
## The Spirit of Christmas
Another essential theme central to the narrative is the spirit of Christmas, representing joy, togetherness, and generosity. Throughout the story, Scrooge's disdain for the holiday season starkly contrasts with the vivacity and warmth exhibited by characters like his nephew Fred and the Cratchit family. Their celebrations highlight the joys of family, love, and community, all values that Scrooge initially overlooks.
The supernatural visits Scrooge receives serve as instruments that teach him about the significance of Christmas, not just as a holiday but as a time for reflection and reaching out to others. By embracing the festive spirit, Scrooge learns to appreciate familial bonds, community, and the act of giving, which collectively reinforce the importance of compassion and generosity during the holiday season.
## Social Responsibility and Compassion
The story also addresses themes of social responsibility and compassion, particularly concerning the plight of the poor during the Victorian era. Scrooge initially exhibits a cold indifference towards the poor, viewing their suffering as a consequence of their choices. However, through his interactions with the Cratchit family and the insights from the spirits, Scrooge comes to understand the socio-economic challenges they face.
The character of Tiny Tim embodies this theme, representing innocence and vulnerability. Scrooge's concern for Tiny Tim's future evokes a transformation in his attitude towards social responsibility. This shift not only improves his personal relationships but also reflects a broader message about the necessity of compassion and charity in addressing societal issues. Dickens employs Scrooge's journey to highlight the consequences of neglecting the disenfranchised and the importance of community support.
## The Consequences of Isolation
Isolation is another vital theme in "A Christmas Carol." Scrooge's initial isolation—both self-imposed and circumstantial—leads to a life devoid of warmth, joy, and authentic relationships. His disdain for social interactions and rejection of familial ties isolate him from the very essence of human connection.
The story illustrates how this isolation has severe emotional and mental effects, exemplified by Scrooge's loneliness and regret as he reflects on his life choices. Ultimately, as Scrooge reconnects with his family and the community, the narrative underscores the importance of human connection and the dangers of leading a solitary, emotionally barren life.
## Conclusion
In summary, "A Christmas Carol" masterfully weaves together themes of transformation, the spirit of Christmas, social responsibility, and the consequences of isolation. Through Scrooge's journey from an unfeeling miser to a compassionate figure, Dickens imparts crucial lessons about kindness, the importance of family, and the moral duty to care for one another. These enduring themes continue to resonate with readers today, cementing "A Christmas Carol" as a cherished holiday classic.