transformerlab / transformerlab-app

Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
https://transformerlab.ai/
GNU Affero General Public License v3.0

Error with Llama_Index RAG plugin - NLTK related (WSL) #165

Open mferris77 opened 2 days ago

mferris77 commented 2 days ago

Hi there - first off, I wanted to commend you on this application. It's clear you've put a LOT of work into this and I'm quite surprised it hasn't received more attention - please keep up the fantastic work.

I'm running on Windows via WSL, and generally everything works well. After I installed the RAG plugin, however, Query Docs fails: the output shows an NLTK error about a missing file path, specifically:

  File "/home/user/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/nltk/data.py", line 312, in __init__
    raise OSError("No such file or directory: %r" % _path)
OSError: No such file or directory: '/home/user/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/_static/nltk_cache/tokenizers/punkt/PY3_tab'
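
One way to see what is actually present in that cache directory is to list the entries under its `tokenizers` folder. The following is a minimal sketch using only the standard library; the helper name `list_tokenizer_resources` is hypothetical, and the path is copied from the traceback, so it is an assumption about this particular environment.

```python
from pathlib import Path


def list_tokenizer_resources(nltk_data_dir):
    """Return the sorted names of entries under <nltk_data_dir>/tokenizers."""
    tokenizers = Path(nltk_data_dir) / "tokenizers"
    if not tokenizers.is_dir():
        # Nothing has been downloaded here (or the directory does not exist).
        return []
    return sorted(entry.name for entry in tokenizers.iterdir())


if __name__ == "__main__":
    # Assumption: cache directory taken from the traceback; adjust as needed.
    cache = (
        "/home/user/.transformerlab/envs/transformerlab/lib/python3.11/"
        "site-packages/llama_index/core/_static/nltk_cache"
    )
    print(list_tokenizer_resources(cache))
```

An empty list would suggest that nothing was downloaded to the location the traceback names, which would be consistent with the download going to a different NLTK data directory.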

I tried to figure this out myself but I'm not sure how best to fix it without breaking something. I did try the "import nltk / nltk.download('punkt')" trick, but it doesn't download the files to that location even though I have the conda env activated. Running 'pip list' shows I have the following versions installed:

nltk 3.8.1
llama-index 0.11.18

I did find this discussion on the NLTK GH page, which says NLTK v3.8.1 doesn't support punkt_tab, and some suggest upgrading NLTK to 3.9.1. I tried to figure out which version it should be, but I couldn't find where the required version of either llama_index or NLTK is specified.
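
The version constraints a package declares can usually be read from its installed metadata. The sketch below uses the standard library's `importlib.metadata.requires` for that; the helper names are hypothetical, and it assumes the distribution that provides `llama_index/core` is named `llama-index-core`, which may not match every install.

```python
import re
from importlib import metadata


def _normalize(name):
    """Normalize a package name so 'NLTK', 'nltk' and 'n_l.tk' style variants compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirements_on(dependency, requirement_strings):
    """Keep only the requirement strings whose package name is `dependency`."""
    wanted = _normalize(dependency)
    matches = []
    for req in requirement_strings:
        # The package name is the leading run of name characters.
        name = re.match(r"[A-Za-z0-9._-]+", req.strip())
        if name and _normalize(name.group(0)) == wanted:
            matches.append(req)
    return matches


def declared_requirements(package, dependency):
    """Return what `package` declares about `dependency`, or None if `package` is not installed."""
    try:
        reqs = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return None
    return requirements_on(dependency, reqs)


if __name__ == "__main__":
    # Assumption: "llama-index-core" is the distribution that depends on nltk.
    print(declared_requirements("llama-index-core", "nltk"))
```

Run inside the activated conda env, this prints whatever NLTK requirement the installed package declares, or `None` if no distribution with that name is installed.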

I suppose the simplest thing to try is upgrading NLTK to 3.9.1, but I wanted to ask here first because 1) this may indicate that a dependency needs updating somewhere in the installation and 2) I may be better off trying something else (a different llama-index version, etc.).
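
Before upgrading, it may help to confirm which NLTK version the active environment actually resolves. This is a rough sketch with hypothetical helper names; the 3.9.1 threshold comes from the suggestion in the NLTK discussion mentioned above, not from any documented requirement, and the comparison only handles plain numeric versions such as `3.8.1`.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '3.8.1' into (3, 8, 1); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older_than(installed, minimum):
    """True if `installed` sorts strictly before `minimum`."""
    return version_tuple(installed) < version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version("nltk")
    if current is None:
        print("nltk is not installed in this environment")
    elif is_older_than(current, "3.9.1"):  # threshold is an assumption
        print(f"nltk {current} is older than 3.9.1; "
              "`pip install --upgrade nltk` is one option")
    else:
        print(f"nltk {current} is at least 3.9.1")
```

For pre-release or otherwise non-numeric version strings, the `packaging.version` module would be a more robust choice than this tuple comparison.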

Thanks and keep up the good work!

aliasaria commented 2 days ago

Thank you for your kind comment. I think we haven't tested this plugin on WSL and we're noticing that many things that interact with the filesystem need special work. We'll take a look at this asap.