danswer-ai / danswer

Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
https://danswer.ai

Integration Issue: Zulip Connector Fails to Determine Tempfile Directory #1452

Open sebastianelsner opened 6 months ago

sebastianelsner commented 6 months ago

I was trying to integrate danswer with Zulip but am running into an issue:

danswer-stack-background-1              | 05/13/2024 02:30:31 PM      run_indexing.py  71 : [Attempt ID: 2] Unable to instantiate connector due to Could not determine tempfile directory
danswer-stack-background-1              | Traceback (most recent call last):
danswer-stack-background-1              |   File "/app/danswer/background/indexing/run_indexing.py", line 60, in _get_document_generator
danswer-stack-background-1              |     runnable_connector, new_credential_json = instantiate_connector(
danswer-stack-background-1              |                                               ^^^^^^^^^^^^^^^^^^^^^^
danswer-stack-background-1              |   File "/app/danswer/connectors/factory.py", line 112, in instantiate_connector
danswer-stack-background-1              |     new_credentials = connector.load_credentials(credentials)
danswer-stack-background-1              |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
danswer-stack-background-1              |   File "/app/danswer/connectors/zulip/connector.py", line 50, in load_credentials
danswer-stack-background-1              |     raise Exception("Could not determine tempfile directory")
danswer-stack-background-1              | Exception: Could not determine tempfile directory
danswer-stack-background-1              | 05/13/2024 02:30:31 PM      run_indexing.py 388 : [Attempt ID: 2] Indexing job with ID '2' failed due to Could not determine tempfile directory
danswer-stack-background-1              | Traceback (most recent call last):
danswer-stack-background-1              |   File "/app/danswer/background/indexing/run_indexing.py", line 380, in run_indexing_entrypoint
danswer-stack-background-1              |     _run_indexing(db_session, attempt)
danswer-stack-background-1              |   File "/app/danswer/background/indexing/run_indexing.py", line 168, in _run_indexing
danswer-stack-background-1              |     doc_batch_generator, is_listing_complete = _get_document_generator(
danswer-stack-background-1              |                                                ^^^^^^^^^^^^^^^^^^^^^^^^
danswer-stack-background-1              |   File "/app/danswer/background/indexing/run_indexing.py", line 73, in _get_document_generator
danswer-stack-background-1              |     raise e
danswer-stack-background-1              |   File "/app/danswer/background/indexing/run_indexing.py", line 60, in _get_document_generator
danswer-stack-background-1              |     runnable_connector, new_credential_json = instantiate_connector(
danswer-stack-background-1              |                                               ^^^^^^^^^^^^^^^^^^^^^^
danswer-stack-background-1              |   File "/app/danswer/connectors/factory.py", line 112, in instantiate_connector
danswer-stack-background-1              |     new_credentials = connector.load_credentials(credentials)
danswer-stack-background-1              |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
danswer-stack-background-1              |   File "/app/danswer/connectors/zulip/connector.py", line 50, in load_credentials
danswer-stack-background-1              |     raise Exception("Could not determine tempfile directory")
danswer-stack-background-1              | Exception: Could not determine tempfile directory

As far as I can see this place is the culprit.

https://github.com/danswer-ai/danswer/blame/546815dc8cf462a8b8aedf729fbd2897804ea5e0/backend/danswer/connectors/zulip/connector.py#L50

Looking at the Python docs, tempfile.tempdir will always be None at this point:

If tempdir is None (the default) at any call to any of the above functions except gettempprefix() it is initialized following the algorithm described in gettempdir().

So the tempdir value is not initialized anywhere.
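To illustrate the behaviour described in the docs, here is a minimal sketch (not code from the Danswer repository). The helper name demo_lazy_tempdir is hypothetical. It resets tempfile.tempdir to its documented default of None, then shows that calling tempfile.gettempdir() is what fills it in.

```python
import tempfile


def demo_lazy_tempdir():
    """Show that tempfile.tempdir stays None until a tempfile function runs.

    Hypothetical helper for illustration only; it saves and restores the
    module-level value so it does not disturb the rest of the process.
    """
    saved = tempfile.tempdir
    tempfile.tempdir = None  # documented default before any tempfile call
    try:
        before = tempfile.tempdir          # what the connector's check sees
        resolved = tempfile.gettempdir()   # triggers the lazy initialisation
        after = tempfile.tempdir           # now populated as a side effect
    finally:
        tempfile.tempdir = saved
    return before, resolved, after


before, resolved, after = demo_lazy_tempdir()
print("before gettempdir():", before)
print("gettempdir() returned:", resolved)
print("tempfile.tempdir afterwards:", after)
```

Reading the module attribute directly, as the connector does, only works if something else in the process has already called into tempfile.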

NexZhu commented 2 months ago

Same here

ATSiem commented 1 month ago

Same issue.

Unable to instantiate connector due to Could not determine tempfile directory

Marking in-progress attempt 'connector: 1, credential: 1' as failed due to Stopped mid run, likely due to the background process being killed

FYI @elo-siema

ATSiem commented 1 month ago

@sebastianelsner I was able to resolve this issue, and get my Zulip Connector running!

Danswer has successfully fetched, processed, and inserted messages from Zulip into its database.

Here are the details:

Issue:

The Zulip Connector in Danswer was raising the following error during indexing:

Exception: Could not determine tempfile directory

Followed by a subsequent error:

TypeError: expected str, bytes or os.PathLike object, not NoneType

  1. I updated the Docker Compose configuration to explicitly set the TMPDIR environment variable to /tmp for the background service to ensure a writable temp directory was used within the container.

    background:
      environment:
        - TMPDIR=/tmp

  2. I verified that the /tmp directory existed within the container, was writable, and that the TMPDIR variable was set correctly using:

    echo $TMPDIR
    ls -ld /tmp
    touch /tmp/testfile.txt

  3. I modified the Zulip Connector's Python code in zulip/connector.py to explicitly use the tempfile.gettempdir() function instead of relying on a manually set or inferred tempdir. This method always returns the correct temporary directory determined by the Python environment.

Before:

if tempfile.tempdir is None:
    raise Exception("Could not determine tempfile directory")
config_file = os.path.join(tempdir, f"zuliprc-{self.realm_name}")

After:

config_file = os.path.join(tempfile.gettempdir(), f"zuliprc-{self.realm_name}")
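For anyone who wants to try the "After" line outside the connector, here is a self-contained sketch. It is an assumption about how the surrounding code could look, not the actual load_credentials implementation: the function write_zuliprc, its parameters, and the sample zuliprc contents are all hypothetical placeholders.

```python
import os
import tempfile


def write_zuliprc(realm_name: str, contents: str) -> str:
    """Write zuliprc contents into the system temp directory.

    Hypothetical stand-in for the file-writing part of the connector.
    tempfile.gettempdir() resolves the directory (honouring TMPDIR where
    set), so there is no None value to guard against.
    """
    config_file = os.path.join(tempfile.gettempdir(), f"zuliprc-{realm_name}")
    with open(config_file, "w") as f:
        f.write(contents)
    return config_file


# Placeholder values for illustration only.
sample = "[api]\nemail=bot@example.com\nkey=placeholder\n"
path = write_zuliprc("example-realm", sample)
print("wrote", path)
os.remove(path)  # clean up the demo file
```

The file name is still predictable, as in the original code. tempfile.mkstemp() would be an alternative if a unique, privately created file is preferred.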