khoj-ai / khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (e.g. gpt, claude, gemini, llama, qwen, mistral).
https://khoj.dev
GNU Affero General Public License v3.0

Directories suffixed with .org cause a loud error #448

Closed nickanderson closed 1 year ago

nickanderson commented 1 year ago

Some of the directories among my org files have names suffixed with .org. Each one triggers a full ERROR with a traceback, which I think is louder than it should be.

Here are two offenders; both are indeed directories. Rather than ERROR, I think these should be logged at most at INFO, with a message like "skipped parsing because it is a directory, not a file".

[04:06:18 PM] ERROR    Error processing file: /home/nickanderson/Syncthing/Orgzly/pages/data:image/svg+xml,<svg xmlns='http:/www.w3.org with error: [Errno 21] Is a directory: "/home/nickanderson/Syncthing/Orgzly/pages/data:image/svg+xml,<svg                       org_to_jsonl.py:103
                       xmlns='http:/www.w3.org"                                                                                                                                                                                                                                            
                       ╭───────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────────────────────────────────╮                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/org_to_jsonl.py:99 in extract_org_entries                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    96 │   │   entry_to_file_map = []                                                                                                                                                                                                         │                    
                       │    97 │   │   for org_file in org_files:                                                                                                                                                                                                     │                    
                       │    98 │   │   │   try:                                                                                                                                                                                                                       │                    
                       │ ❱  99 │   │   │   │   org_file_entries = orgnode.makelist_with_filepath(str(org_file))                                                                                                                                                       │                    
                       │   100 │   │   │   │   entry_to_file_map += zip(org_file_entries, [org_file] *                                                                                                                                                                │                    
                       │       len(org_file_entries))                                                                                                                                                                                                                 │                    
                       │   101 │   │   │   │   entries.extend(org_file_entries)                                                                                                                                                                                       │                    
                       │   102 │   │   │   except Exception as e:                                                                                                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/orgnode.py:57 in makelist_with_filepath                                                                                                                       │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    54                                                                                                                                                                                                                                        │                    
                       │    55                                                                                                                                                                                                                                        │                    
                       │    56 def makelist_with_filepath(filename):                                                                                                                                                                                                  │                    
                       │ ❱  57 │   f = open(filename, "r")                                                                                                                                                                                                            │                    
                       │    58 │   return makelist(f, filename)                                                                                                                                                                                                       │                    
                       │    59                                                                                                                                                                                                                                        │                    
                       │    60                                                                                                                                                                                                                                        │                    
                       ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯                    
                       IsADirectoryError: [Errno 21] Is a directory: "/home/nickanderson/Syncthing/Orgzly/pages/data:image/svg+xml,<svg xmlns='http:/www.w3.org"                                                                                                                           
[04:06:19 PM] ERROR    Error processing file: /home/nickanderson/org/roam/cmdln.org with error: [Errno 21] Is a directory: '/home/nickanderson/org/roam/cmdln.org'                                                                                                      org_to_jsonl.py:103
                       ╭───────────────────────────────────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ──────────────────────────────────────────────────────────────────────────────────────────────────────╮                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/org_to_jsonl.py:99 in extract_org_entries                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    96 │   │   entry_to_file_map = []                                                                                                                                                                                                         │                    
                       │    97 │   │   for org_file in org_files:                                                                                                                                                                                                     │                    
                       │    98 │   │   │   try:                                                                                                                                                                                                                       │                    
                       │ ❱  99 │   │   │   │   org_file_entries = orgnode.makelist_with_filepath(str(org_file))                                                                                                                                                       │                    
                       │   100 │   │   │   │   entry_to_file_map += zip(org_file_entries, [org_file] *                                                                                                                                                                │                    
                       │       len(org_file_entries))                                                                                                                                                                                                                 │                    
                       │   101 │   │   │   │   entries.extend(org_file_entries)                                                                                                                                                                                       │                    
                       │   102 │   │   │   except Exception as e:                                                                                                                                                                                                     │                    
                       │                                                                                                                                                                                                                                              │                    
                       │ /home/nickanderson/.local/lib/python3.10/site-packages/khoj/processor/org_mode/orgnode.py:57 in makelist_with_filepath                                                                                                                       │                    
                       │                                                                                                                                                                                                                                              │                    
                       │    54                                                                                                                                                                                                                                        │                    
                       │    55                                                                                                                                                                                                                                        │                    
                       │    56 def makelist_with_filepath(filename):                                                                                                                                                                                                  │                    
                       │ ❱  57 │   f = open(filename, "r")                                                                                                                                                                                                            │                    
                       │    58 │   return makelist(f, filename)                                                                                                                                                                                                       │                    
                       │    59                                                                                                                                                                                                                                        │                    
                       │    60                                                                                                                                                                                                                                        │                    
                       ╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯                    
                       IsADirectoryError: [Errno 21] Is a directory: '/home/nickanderson/org/roam/cmdln.org'                                                       
debanjum commented 1 year ago

Thanks for creating the issue and sharing the stack trace! I was able to investigate and fix the bug in commit https://github.com/khoj-ai/khoj/commit/e3cd8b415061c5167861c7ca8435b4eb521a712a. Directories suffixed with .org, etc. are now simply ignored while indexing instead of raising an error. Any files under such directories can still be indexed with the appropriate glob (e.g. /path/to/notes/**/*.org).

Feel free to reopen this issue if the problem persists on your end with the latest khoj server.
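
For readers who want to see the idea in code, here is a minimal sketch of the behavior described above: expand a glob, skip anything that is a directory (even if its name ends in .org), and report the skip at INFO rather than ERROR. This is not the code from the linked commit; the function name `collect_org_files` and the log message are hypothetical and chosen only for illustration.

```python
import glob
import logging
import os

logger = logging.getLogger(__name__)


def collect_org_files(pattern: str) -> list[str]:
    """Expand a glob pattern and return only non-directory matches.

    Hypothetical helper for illustration; not part of khoj's API.
    """
    files = []
    # recursive=True lets "**" match any number of nested directories
    for path in sorted(glob.glob(pattern, recursive=True)):
        if os.path.isdir(path):
            # A directory named like "cmdln.org" matches "*.org" too; skip it quietly
            logger.info("Skipped %s because it is a directory, not a file", path)
            continue
        files.append(path)
    return files
```

With a pattern such as /path/to/notes/**/*.org, a directory named cmdln.org is skipped, while any .org files inside it still match the pattern and are returned.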