khoj-ai / khoj

Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
https://khoj.dev
GNU Affero General Public License v3.0

Can't index file because .jsonl.gz file is missing #488

Closed CrackheadoooObsidianUserooo closed 10 months ago

CrackheadoooObsidianUserooo commented 11 months ago


Now I get this error when I type khoj:

FileNotFoundError: [Errno 2] No such file or directory:
                    'C:\\Users\\Pascal\\.khoj\\content\\markdown\\C__Users_Pascal_Documents_Vaults_Upgra
                    de_Laptop_Upgradeo.jsonl.gz'
[14:50:26] ERROR    🚨 Failed to configure server on app load: [Errno 2] No such file or directory:      configure.py:44
                    'C:\\Users\\Pascal\\.khoj\\content\\markdown\\C__Users_Pascal_Documents_Vaults_Upgra
                    de_Laptop_Upgradeo.jsonl.gz'
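
The error above means the compressed JSONL index file for the markdown notes does not exist on disk, and the `load_jsonl` frame in the traceback opens it unconditionally. As a rough illustration only (this is not khoj's actual code; the function name `load_jsonl_if_present` and the choice to return an empty list are assumptions), a loader that tolerates a missing file might look like this:

```python
import gzip
import json
from pathlib import Path
from typing import List


def load_jsonl_if_present(path: Path) -> List[dict]:
    """Load entries from a plain or gzip-compressed JSONL file.

    Hypothetical helper: returns an empty list when the file is missing
    instead of raising FileNotFoundError.
    """
    path = Path(path).expanduser()
    if not path.exists():
        # Missing index file: treat it as "nothing indexed yet".
        return []

    # Pick the opener from the file suffix, as the traceback's load_jsonl does.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as jsonl_file:
        # One JSON object per non-empty line.
        return [json.loads(line) for line in jsonl_file if line.strip()]
```

With a guard like this, a caller could decide to regenerate the index when it gets back an empty list rather than failing at startup.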

`llama_model_load_internal: ftype      = 14 (mostly Q4_K - Small)
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 3647,96 MB
llama_model_load_internal: mem required  = 4045,96 MB (+ 1024,00 MB per state)
llama_new_context_with_model: kv self size  = 1024,00 MB
[14:50:20] INFO     🔍 📜 Setting up text search model                                                    indexer.py:167
[14:50:22] INFO     🔍 🌄 Setting up image search model                                                   indexer.py:171
[14:50:25] INFO     📬 Initializing content index...                                                     configure.py:78
           INFO     Loading content from existing embeddings...                                           indexer.py:403
           INFO     💎 Loading markdown notes                                                             indexer.py:414
           ERROR    🚨 Failed to index content                                                           configure.py:92
                    ╭─────────────────────── Traceback (most recent call last) ────────────────────────╮
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:79 in initialize_content                                              │
                    │                                                                                  │
                    │    76 │   │   try:                                                               │
                    │    77 │   │   │   if init:                                                       │
                    │    78 │   │   │   │   logger.info("📬 Initializing content index...")            │
                    │ ❱  79 │   │   │   │   state.content_index = load_content(state.config.content_ty │
                    │       state.content_index, state.search_models)                                  │
                    │    80 │   │   │   else:                                                          │
                    │    81 │   │   │   │   logger.info("📬 Updating content index...")                │
                    │    82 │   │   │   │   all_files = collect_files(state.config.content_type)       │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\ro │
                    │ uters\indexer.py:415 in load_content                                             │
                    │                                                                                  │
                    │   412 │   │   content_index.org = text_search.load(content_config.org, filters=[ │
                    │       WordFilter(), FileFilter()])                                               │
                    │   413 │   if content_config.markdown:                                            │
                    │   414 │   │   logger.info("💎 Loading markdown notes")                           │
                    │ ❱ 415 │   │   content_index.markdown = text_search.load(                         │
                    │   416 │   │   │   content_config.markdown, filters=[DateFilter(), WordFilter(),  │
                    │   417 │   │   )                                                                  │
                    │   418 │   if content_config.pdf:                                                 │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\se │
                    │ arch_type\text_search.py:240 in load                                             │
                    │                                                                                  │
                    │   237 ) -> TextContent:                                                          │
                    │   238 │   # Map notes in text files to (compressed) JSONL formatted file         │
                    │   239 │   config.compressed_jsonl = resolve_absolute_path(config.compressed_json │
                    │ ❱ 240 │   entries = extract_entries(config.compressed_jsonl)                     │
                    │   241 │                                                                          │
                    │   242 │   # Compute or Load Embeddings                                           │
                    │   243 │   config.embeddings_file = resolve_absolute_path(config.embeddings_file) │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\se │
                    │ arch_type\text_search.py:57 in extract_entries                                   │
                    │                                                                                  │
                    │    54                                                                            │
                    │    55 def extract_entries(jsonl_file) -> List[Entry]:                            │
                    │    56 │   "Load entries from compressed jsonl"                                   │
                    │ ❱  57 │   return list(map(Entry.from_dict, load_jsonl(jsonl_file)))              │
                    │    58                                                                            │
                    │    59                                                                            │
                    │    60 def compute_embeddings(                                                    │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\ut │
                    │ ils\jsonl.py:22 in load_jsonl                                                    │
                    │                                                                                  │
                    │   19 │                                                                           │
                    │   20 │   # Open JSONL file                                                       │
                    │   21 │   if input_path.suffix == ".gz":                                          │
                    │ ❱ 22 │   │   jsonl_file = gzip.open(get_absolute_path(input_path), "rt", encodin │
                    │   23 │   else:                                                                   │
                    │   24 │   │   jsonl_file = open(get_absolute_path(input_path), "r", encoding="utf │
                    │   25                                                                             │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\gzip.py:58 in open    │
                    │                                                                                  │
                    │    55 │                                                                          │
                    │    56 │   gz_mode = mode.replace("t", "")                                        │
                    │    57 │   if isinstance(filename, (str, bytes, os.PathLike)):                    │
                    │ ❱  58 │   │   binary_file = GzipFile(filename, gz_mode, compresslevel)           │
                    │    59 │   elif hasattr(filename, "read") or hasattr(filename, "write"):          │
                    │    60 │   │   binary_file = GzipFile(None, gz_mode, compresslevel, filename)     │
                    │    61 │   else:                                                                  │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\gzip.py:173 in        │
                    │ __init__                                                                         │
                    │                                                                                  │
                    │   170 │   │   if mode and 'b' not in mode:                                       │
                    │   171 │   │   │   mode += 'b'                                                    │
                    │   172 │   │   if fileobj is None:                                                │
                    │ ❱ 173 │   │   │   fileobj = self.myfileobj = builtins.open(filename, mode or 'rb │
                    │   174 │   │   if filename is None:                                               │
                    │   175 │   │   │   filename = getattr(fileobj, 'name', '')                        │
                    │   176 │   │   │   if not isinstance(filename, (str, bytes)):                     │
                    ╰──────────────────────────────────────────────────────────────────────────────────╯
                    FileNotFoundError: [Errno 2] No such file or directory:
                    'C:\\Users\\Pascal\\.khoj\\content\\markdown\\C__Users_Pascal_Documents_Vaults_Upgra
                    de_Laptop_Upgradeo.jsonl.gz'
           ERROR    🚨 Failed to configure search models                                                 configure.py:67
                    ╭─────────────────────── Traceback (most recent call last) ────────────────────────╮
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:65 in configure_server                                                │
                    │                                                                                  │
                    │    62 │   │   state.config_lock.acquire()                                        │
                    │    63 │   │   state.SearchType = configure_search_types(state.config)            │
                    │    64 │   │   state.search_models = configure_search(state.search_models,        │
                    │       state.config.search_type)                                                  │
                    │ ❱  65 │   │   initialize_content(regenerate, search_type, init)                  │
                    │    66 │   except Exception as e:                                                 │
                    │    67 │   │   logger.error(f"🚨 Failed to configure search models", exc_info=Tru │
                    │    68 │   │   raise e                                                            │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:93 in initialize_content                                              │
                    │                                                                                  │
                    │    90 │   │   │   │   )                                                          │
                    │    91 │   │   except Exception as e:                                             │
                    │    92 │   │   │   logger.error(f"🚨 Failed to index content", exc_info=True)     │
                    │ ❱  93 │   │   │   raise e                                                        │
                    │    94                                                                            │
                    │    95                                                                            │
                    │    96 def configure_routes(app):                                                 │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:79 in initialize_content                                              │
                    │                                                                                  │
                    │    76 │   │   try:                                                               │
                    │    77 │   │   │   if init:                                                       │
                    │    78 │   │   │   │   logger.info("📬 Initializing content index...")            │
                    │ ❱  79 │   │   │   │   state.content_index = load_content(state.config.content_ty │
                    │       state.content_index, state.search_models)                                  │
                    │    80 │   │   │   else:                                                          │
                    │    81 │   │   │   │   logger.info("📬 Updating content index...")                │
                    │    82 │   │   │   │   all_files = collect_files(state.config.content_type)       │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\ro │
                    │ uters\indexer.py:415 in load_content                                             │
                    │                                                                                  │
                    │   412 │   │   content_index.org = text_search.load(content_config.org, filters=[ │
                    │       WordFilter(), FileFilter()])                                               │
                    │   413 │   if content_config.markdown:                                            │
                    │   414 │   │   logger.info("💎 Loading markdown notes")                           │
                    │ ❱ 415 │   │   content_index.markdown = text_search.load(                         │
                    │   416 │   │   │   content_config.markdown, filters=[DateFilter(), WordFilter(),  │
                    │   417 │   │   )                                                                  │
                    │   418 │   if content_config.pdf:                                                 │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\se │
                    │ arch_type\text_search.py:240 in load                                             │
                    │                                                                                  │
                    │   237 ) -> TextContent:                                                          │
                    │   238 │   # Map notes in text files to (compressed) JSONL formatted file         │
                    │   239 │   config.compressed_jsonl = resolve_absolute_path(config.compressed_json │
                    │ ❱ 240 │   entries = extract_entries(config.compressed_jsonl)                     │
                    │   241 │                                                                          │
                    │   242 │   # Compute or Load Embeddings                                           │
                    │   243 │   config.embeddings_file = resolve_absolute_path(config.embeddings_file) │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\se │
                    │ arch_type\text_search.py:57 in extract_entries                                   │
                    │                                                                                  │
                    │    54                                                                            │
                    │    55 def extract_entries(jsonl_file) -> List[Entry]:                            │
                    │    56 │   "Load entries from compressed jsonl"                                   │
                    │ ❱  57 │   return list(map(Entry.from_dict, load_jsonl(jsonl_file)))              │
                    │    58                                                                            │
                    │    59                                                                            │
                    │    60 def compute_embeddings(                                                    │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\ut │
                    │ ils\jsonl.py:22 in load_jsonl                                                    │
                    │                                                                                  │
                    │   19 │                                                                           │
                    │   20 │   # Open JSONL file                                                       │
                    │   21 │   if input_path.suffix == ".gz":                                          │
                    │ ❱ 22 │   │   jsonl_file = gzip.open(get_absolute_path(input_path), "rt", encodin │
                    │   23 │   else:                                                                   │
                    │   24 │   │   jsonl_file = open(get_absolute_path(input_path), "r", encoding="utf │
                    │   25                                                                             │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\gzip.py:58 in open    │
                    │                                                                                  │
                    │    55 │                                                                          │
                    │    56 │   gz_mode = mode.replace("t", "")                                        │
                    │    57 │   if isinstance(filename, (str, bytes, os.PathLike)):                    │
                    │ ❱  58 │   │   binary_file = GzipFile(filename, gz_mode, compresslevel)           │
                    │    59 │   elif hasattr(filename, "read") or hasattr(filename, "write"):          │
                    │    60 │   │   binary_file = GzipFile(None, gz_mode, compresslevel, filename)     │
                    │    61 │   else:                                                                  │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\gzip.py:173 in        │
                    │ __init__                                                                         │
                    │                                                                                  │
                    │   170 │   │   if mode and 'b' not in mode:                                       │
                    │   171 │   │   │   mode += 'b'                                                    │
                    │   172 │   │   if fileobj is None:                                                │
                    │ ❱ 173 │   │   │   fileobj = self.myfileobj = builtins.open(filename, mode or 'rb │
                    │   174 │   │   if filename is None:                                               │
                    │   175 │   │   │   filename = getattr(fileobj, 'name', '')                        │
                    │   176 │   │   │   if not isinstance(filename, (str, bytes)):                     │
                    ╰──────────────────────────────────────────────────────────────────────────────────╯
                    FileNotFoundError: [Errno 2] No such file or directory:
                    'C:\\Users\\Pascal\\.khoj\\content\\markdown\\C__Users_Pascal_Documents_Vaults_Upgra
                    de_Laptop_Upgradeo.jsonl.gz'
[14:50:26] ERROR    🚨 Failed to configure server on app load: [Errno 2] No such file or directory:      configure.py:44
                    'C:\\Users\\Pascal\\.khoj\\content\\markdown\\C__Users_Pascal_Documents_Vaults_Upgra
                    de_Laptop_Upgradeo.jsonl.gz'
                    ╭─────────────────────── Traceback (most recent call last) ────────────────────────╮
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:42 in initialize_server                                               │
                    │                                                                                  │
                    │    39 │   │   return None                                                        │
                    │    40 │                                                                          │
                    │    41 │   try:                                                                   │
                    │ ❱  42 │   │   configure_server(config, init=True)                                │
                    │    43 │   except Exception as e:                                                 │
                    │    44 │   │   logger.error(f"🚨 Failed to configure server on app load: {e}", ex │
                    │    45                                                                            │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:68 in configure_server                                                │
                    │                                                                                  │
                    │    65 │   │   initialize_content(regenerate, search_type, init)                  │
                    │    66 │   except Exception as e:                                                 │
                    │    67 │   │   logger.error(f"🚨 Failed to configure search models", exc_info=Tru │
                    │ ❱  68 │   │   raise e                                                            │
                    │    69 │   finally:                                                               │
                    │    70 │   │   state.config_lock.release()                                        │
                    │    71                                                                            │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:65 in configure_server                                                │
                    │                                                                                  │
                    │    62 │   │   state.config_lock.acquire()                                        │
                    │    63 │   │   state.SearchType = configure_search_types(state.config)            │
                    │    64 │   │   state.search_models = configure_search(state.search_models,        │
                    │       state.config.search_type)                                                  │
                    │ ❱  65 │   │   initialize_content(regenerate, search_type, init)                  │
                    │    66 │   except Exception as e:                                                 │
                    │    67 │   │   logger.error(f"🚨 Failed to configure search models", exc_info=Tru │
                    │    68 │   │   raise e                                                            │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:93 in initialize_content                                              │
                    │                                                                                  │
                    │    90 │   │   │   │   )                                                          │
                    │    91 │   │   except Exception as e:                                             │
                    │    92 │   │   │   logger.error(f"🚨 Failed to index content", exc_info=True)     │
                    │ ❱  93 │   │   │   raise e                                                        │
                    │    94                                                                            │
                    │    95                                                                            │
                    │    96 def configure_routes(app):                                                 │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\co │
                    │ nfigure.py:79 in initialize_content                                              │
                    │                                                                                  │
                    │    76 │   │   try:                                                               │
                    │    77 │   │   │   if init:                                                       │
                    │    78 │   │   │   │   logger.info("📬 Initializing content index...")            │
                    │ ❱  79 │   │   │   │   state.content_index = load_content(state.config.content_ty │
                    │       state.content_index, state.search_models)                                  │
                    │    80 │   │   │   else:                                                          │
                    │    81 │   │   │   │   logger.info("📬 Updating content index...")                │
                    │    82 │   │   │   │   all_files = collect_files(state.config.content_type)       │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\ro │
                    │ uters\indexer.py:415 in load_content                                             │
                    │                                                                                  │
                    │   412 │   │   content_index.org = text_search.load(content_config.org, filters=[ │
                    │       WordFilter(), FileFilter()])                                               │
                    │   413 │   if content_config.markdown:                                            │
                    │   414 │   │   logger.info("💎 Loading markdown notes")                           │
                    │ ❱ 415 │   │   content_index.markdown = text_search.load(                         │
                    │   416 │   │   │   content_config.markdown, filters=[DateFilter(), WordFilter(),  │
                    │   417 │   │   )                                                                  │
                    │   418 │   if content_config.pdf:                                                 │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\se │
                    │ arch_type\text_search.py:240 in load                                             │
                    │                                                                                  │
                    │   237 ) -> TextContent:                                                          │
                    │   238 │   # Map notes in text files to (compressed) JSONL formatted file         │
                    │   239 │   config.compressed_jsonl = resolve_absolute_path(config.compressed_json │
                    │ ❱ 240 │   entries = extract_entries(config.compressed_jsonl)                     │
                    │   241 │                                                                          │
                    │   242 │   # Compute or Load Embeddings                                           │
                    │   243 │   config.embeddings_file = resolve_absolute_path(config.embeddings_file) │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\se │
                    │ arch_type\text_search.py:57 in extract_entries                                   │
                    │                                                                                  │
                    │    54                                                                            │
                    │    55 def extract_entries(jsonl_file) -> List[Entry]:                            │
                    │    56 │   "Load entries from compressed jsonl"                                   │
                    │ ❱  57 │   return list(map(Entry.from_dict, load_jsonl(jsonl_file)))              │
                    │    58                                                                            │
                    │    59                                                                            │
                    │    60 def compute_embeddings(                                                    │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\site-packages\khoj\ut │
                    │ ils\jsonl.py:22 in load_jsonl                                                    │
                    │                                                                                  │
                    │   19 │                                                                           │
                    │   20 │   # Open JSONL file                                                       │
                    │   21 │   if input_path.suffix == ".gz":                                          │
                    │ ❱ 22 │   │   jsonl_file = gzip.open(get_absolute_path(input_path), "rt", encodin │
                    │   23 │   else:                                                                   │
                    │   24 │   │   jsonl_file = open(get_absolute_path(input_path), "r", encoding="utf │
                    │   25                                                                             │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\gzip.py:58 in open    │
                    │                                                                                  │
                    │    55 │                                                                          │
                    │    56 │   gz_mode = mode.replace("t", "")                                        │
                    │    57 │   if isinstance(filename, (str, bytes, os.PathLike)):                    │
                    │ ❱  58 │   │   binary_file = GzipFile(filename, gz_mode, compresslevel)           │
                    │    59 │   elif hasattr(filename, "read") or hasattr(filename, "write"):          │
                    │    60 │   │   binary_file = GzipFile(None, gz_mode, compresslevel, filename)     │
                    │    61 │   else:                                                                  │
                    │                                                                                  │
                    │ C:\Users\Pascal\AppData\Local\Programs\Python\Python39\lib\gzip.py:173 in        │
                    │ __init__                                                                         │
                    │                                                                                  │
                    │   170 │   │   if mode and 'b' not in mode:                                       │
                    │   171 │   │   │   mode += 'b'                                                    │
                    │   172 │   │   if fileobj is None:                                                │
                    │ ❱ 173 │   │   │   fileobj = self.myfileobj = builtins.open(filename, mode or 'rb │
                    │   174 │   │   if filename is None:                                               │
                    │   175 │   │   │   filename = getattr(fileobj, 'name', '')                        │
                    │   176 │   │   │   if not isinstance(filename, (str, bytes)):                     │
                    ╰──────────────────────────────────────────────────────────────────────────────────╯
                    FileNotFoundError: [Errno 2] No such file or directory:
                    'C:\\Users\\Pascal\\.khoj\\content\\markdown\\C__Users_Pascal_Documents_Vaults_Upgra
                    de_Laptop_Upgradeo.jsonl.gz'
           INFO     🌖 Khoj is ready to use                                                                   main.py:90
           INFO     Started server process [9432]                                                           server.py:75
           INFO     Waiting for application startup.                                                            on.py:45
           INFO     Application startup complete.                                                               on.py:59
           INFO     Uvicorn running on http://127.0.0.1:42110 (Press CTRL+C to quit)                       server.py:206

`

sabaimran commented 11 months ago

What happens when you start Obsidian? Were you previously able to set it up and run it?

Khoj isn't able to find the file where your embeddings are supposed to be stored for your vault. This can happen perhaps if they weren't generated correctly the first time.

difhel commented 11 months ago

The same issue.

FileNotFoundError: [Errno 2] No such file or directory:
'/home/mf/.khoj/content/markdown/markdown.jsonl.gz'

But I've never been able to get khoj to index the markdown files. This error is displayed from the first configuration.

difhel commented 11 months ago

I checked whether the "markdown.jsonl.gz" file exists, but it seems the "content" folder was never created, which is odd because I had configured files via "localhost/config" several times.

sabaimran commented 11 months ago

Ah, did you click on 'configure' in between? Is your config managed by obsidian?

I think this may be because Khoj attempts to read the content before it's configured. That would be a bug on our end, but let me see if I can reproduce it.
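
If the content is indeed read before it is configured, one possible guard is to check that the index file exists before opening it. The following is a minimal sketch under assumptions, not Khoj's actual code: `load_jsonl_if_present` is a hypothetical helper name, and returning an empty list is an assumed convention to signal that the content still needs indexing. Only the `.gz` suffix check and the `gzip.open`/`open` calls mirror the traceback in the report.

```python
import gzip
import json
import logging
from pathlib import Path
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def load_jsonl_if_present(input_path: Path) -> List[Dict[str, Any]]:
    """Load entries from a (possibly gzipped) JSONL file, or [] if it is missing.

    Hypothetical helper for illustration; not Khoj's implementation.
    """
    path = Path(input_path).expanduser()
    if not path.exists():
        # The index was never generated (first run, or indexing crashed earlier).
        logger.warning("Index file %s not found; content needs to be indexed", path)
        return []

    # Same suffix check as in the traceback: gzip for ".gz", plain text otherwise.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as jsonl_file:
        # Skip blank lines so a trailing newline does not break parsing.
        return [json.loads(line) for line in jsonl_file if line.strip()]
```

With a guard like this, a caller could treat an empty result as a cue to regenerate the index instead of failing on startup with `FileNotFoundError`.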

lrq3000 commented 11 months ago

> Khoj isn't able to find the file where your embeddings are supposed to be stored for your vault. This can happen perhaps if they weren't generated correctly the first time.

This fixed the issue for me. The first time around I had activated the offline chat feature, and khoj crashed with an OutOfMemory error when I clicked the Configure button. The next time I started khoj, I got OP's error about a missing markdown index file. All I had to do was let this broken khoj instance keep running, open the control panel again, disable the offline chat model, and click Configure again; the index file was then built correctly.

I suggest khoj first check the available memory, or at least handle this exception more gracefully, so that it doesn't crash outright the first time a user clicks Configure with offline chat enabled but without enough RAM. This assumes OP's issue is the same as mine.

/EDIT: I had to click the Reinitialize button; with just the Configure button the interface would run, but searches returned no relevant results. I can confirm that on my side, after reinitializing, all features including search and offline chat work as expected.

sabaimran commented 10 months ago

Thanks for the details, @lrq3000. I do think a lot of issues occur specifically when loading the offline chat model into application memory. It's a good idea to add an error/warning if we detect there's not enough available RAM to load it.
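
A rough sketch of what such a check could look like, under stated assumptions: `psutil` is a third-party package that is assumed to be installed (it is not confirmed here as a Khoj dependency), and `can_load_model` and `load_offline_chat_model` are hypothetical names. An out-of-memory failure inside a native model library may also not surface as a Python `MemoryError`, so the `except` clause is only a partial safety net.

```python
from typing import Any, Callable, Optional

try:
    import psutil  # third-party; assumed available, not confirmed as a Khoj dependency
except ImportError:
    psutil = None


def available_memory_bytes() -> Optional[int]:
    """Return available RAM in bytes, or None if it cannot be determined."""
    if psutil is None:
        return None
    return psutil.virtual_memory().available


def can_load_model(required_bytes: int, available_bytes: Optional[int]) -> bool:
    """Return False only when we know there is not enough RAM for the model."""
    if available_bytes is None:
        # Unknown memory: do not block loading on a missing measurement.
        return True
    return available_bytes >= required_bytes


def load_offline_chat_model(required_bytes: int, loader: Callable[[], Any]) -> Any:
    """Call loader() if memory allows; otherwise warn and return None."""
    if not can_load_model(required_bytes, available_memory_bytes()):
        print("Not enough available RAM for the offline chat model; skipping it")
        return None
    try:
        return loader()
    except MemoryError:
        print("Ran out of memory while loading the offline chat model; skipping it")
        return None
```

For the model in the original report, the llama log states roughly 4046 MB required plus 1024 MB per state, so a caller might pass about 5 GB as `required_bytes`. Returning `None` instead of raising lets the rest of setup, including content indexing, continue.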

There may also be a UX issue in how the configure and reinitialize functionality differ. Generally, reinitialize is the best bet if you suspect a misconfigured or corrupted index.