khoj-ai / khoj

Your AI second brain. Get answers to your questions, whether they be online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our cloud instance. Access from Obsidian, Emacs, Desktop app, Web or WhatsApp.
https://khoj.dev
GNU Affero General Public License v3.0

Directories suffixed with .org cause a loud error #448

Closed nickanderson closed 11 months ago

nickanderson commented 1 year ago

I have some directories among my org files whose names end in .org. Khoj logs loud errors for them, which I think should be quieter.

Here are two offenders; both are indeed directories. Rather than ERROR, I think these should at most be logged at something like INFO: skipped parsing because it is a directory, not a file.

[04:06:18 PM] ERROR    Error processing file: /home/nickanderson/Syncthing/Orgzly/pages/data:image/svg+xml,<svg xmlns='http:/www.w3.org with error: [Errno 21] Is a directory: "/home/nickanderson/Syncthing/Orgzly/pages/data:image/svg+xml,<svg                       org_to_jsonl.py:103
                       xmlns='http:/www.w3.org"                                                                                                                                                                                                                                            
                       ╭───────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────────────────────────────────╮                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/org_to_jsonl.py:99 in extract_org_entries                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    96 │   │   entry_to_file_map = []                                                                                                                                                                                                         │                    
                       │    97 │   │   for org_file in org_files:                                                                                                                                                                                                     │                    
                       │    98 │   │   │   try:                                                                                                                                                                                                                       │                    
                       │ ❱  99 │   │   │   │   org_file_entries = orgnode.makelist_with_filepath(str(org_file))                                                                                                                                                       │                    
                       │   100 │   │   │   │   entry_to_file_map += zip(org_file_entries, [org_file] *                                                                                                                                                                │                    
                       │       len(org_file_entries))                                                                                                                                                                                                                 │                    
                       │   101 │   │   │   │   entries.extend(org_file_entries)                                                                                                                                                                                       │                    
                       │   102 │   │   │   except Exception as e:                                                                                                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/orgnode.py:57 in makelist_with_filepath                                                                                                                       │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    54                                                                                                                                                                                                                                        │                    
                       │    55                                                                                                                                                                                                                                        │                    
                       │    56 def makelist_with_filepath(filename):                                                                                                                                                                                                  │                    
                       │ ❱  57 │   f = open(filename, "r")                                                                                                                                                                                                            │                    
                       │    58 │   return makelist(f, filename)                                                                                                                                                                                                       │                    
                       │    59                                                                                                                                                                                                                                        │                    
                       │    60                                                                                                                                                                                                                                        │                    
                       ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯                    
                       IsADirectoryError: [Errno 21] Is a directory: "/home/nickanderson/Syncthing/Orgzly/pages/data:image/svg+xml,<svg xmlns='http:/www.w3.org"                                                                                                                           
[04:06:19 PM] ERROR    Error processing file: /home/nickanderson/org/roam/cmdln.org with error: [Errno 21] Is a directory: '/home/nickanderson/org/roam/cmdln.org'                                                                                                      org_to_jsonl.py:103
                       ╭───────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────────────────────────────────╮                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/org_to_jsonl.py:99 in extract_org_entries                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    96 │   │   entry_to_file_map = []                                                                                                                                                                                                         │                    
                       │    97 │   │   for org_file in org_files:                                                                                                                                                                                                     │                    
                       │    98 │   │   │   try:                                                                                                                                                                                                                       │                    
                       │ ❱  99 │   │   │   │   org_file_entries = orgnode.makelist_with_filepath(str(org_file))                                                                                                                                                       │                    
                       │   100 │   │   │   │   entry_to_file_map += zip(org_file_entries, [org_file] *                                                                                                                                                                │                    
                       │       len(org_file_entries))                                                                                                                                                                                                                 │                    
                       │   101 │   │   │   │   entries.extend(org_file_entries)                                                                                                                                                                                       │                    
                       │   102 │   │   │   except Exception as e:                                                                                                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/orgnode.py:57 in makelist_with_filepath                                                                                                                       │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    54                                                                                                                                                                                                                                        │                    
                       │    55                                                                                                                                                                                                                                        │                    
                       │    56 def makelist_with_filepath(filename):                                                                                                                                                                                                  │                    
                       │ ❱  57 │   f = open(filename, "r")                                                                                                                                                                                                            │                    
                       │    58 │   return makelist(f, filename)                                                                                                                                                                                                       │                    
                       │    59                                                                                                                                                                                                                                        │                    
                       │    60                                                                                                                                                                                                                                        │                    
                       ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯                    
                       IsADirectoryError: [Errno 21] Is a directory: '/home/nickanderson/org/roam/cmdln.org'                                                       
debanjum commented 11 months ago

Thanks for creating an issue and sharing the stack trace! I was able to investigate and fix the bug with commit https://github.com/khoj-ai/khoj/commit/e3cd8b415061c5167861c7ca8435b4eb521a712a. Now directories suffixed with .org etc. will just be ignored while indexing instead of throwing an error. Any files under such directories can still be indexed with the appropriate glob (e.g. /path/to/notes/**/*.org).
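
For anyone curious how this kind of filtering can work, here is a minimal sketch. It is not the code from the commit above: the function name `get_org_files` and the INFO log message are made up for illustration, and it only assumes the standard library `glob`, `os` and `logging` modules.

```python
import glob
import logging
import os

logger = logging.getLogger(__name__)


def get_org_files(pattern: str) -> list[str]:
    """Expand a glob pattern and keep only regular files (hypothetical helper)."""
    files = []
    # recursive=True lets "**" match any number of nested directories
    for path in sorted(glob.glob(pattern, recursive=True)):
        if os.path.isfile(path):
            files.append(path)
        else:
            # A directory named e.g. "cmdln.org" matches "*.org" but cannot be opened as a file
            logger.info("Skipped %s because it is a directory, not a file", path)
    return files
```

With a pattern like /path/to/notes/**/*.org, a directory such as cmdln.org would be skipped with an INFO message, while any .org files inside it would still be returned.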

Feel free to reopen this issue if the problem persists on your end with the latest khoj server.