langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

GitHub toolkit create_file always returns 404 when the file does not exist #11055

Closed qsbao closed 9 months ago

qsbao commented 1 year ago

System Info

Python 3.11.4, langchain 0.0.301, PyGithub 1.59.1

Reproduction

  1. Create a GitHub repository with no README file, then create an issue titled "Create README file" with the content "Find what the most popular frontend framework is right now and add a short blurb to the readme.md file about how this website will take advantage of it."

  2. Run the official Github Simple Agent example:

    agent.run(
    "You have the software engineering capabilities of a Google Principle engineer. You are tasked with completing issues on a github repository. Please look at the existing issues and complete them."
    )
  3. Get run logs:

    > Entering new AgentExecutor chain...
    I need to figure out what issues need to be completed.
    Action: Get Issues
    Action Input: N/A
    Observation: Found 1 issues:
    [{'title': 'Create README file', 'number': 1}]
    Thought: I need to create the README file
    Action: Create File
    Action Input: README.md
    Observation: Unable to make file due to error:
    404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/repos/contents#get-repository-content"}
    Thought: I need to create the file in the repository
    Action: Create File
    Action Input: /test/README.md
    Observation: Unable to make file due to error:
    404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/repos/contents#get-repository-content"}
    Thought: I need to create the file in the correct repository
    Action: Create File
    Action Input: <repository name>/test/README.md
    Observation: Unable to make file due to error:
    404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/repos/contents#get-repository-content"}
    Thought: I need to make sure I am in the correct repository
    Action: Get Issues
    Action Input: N/A
    Observation: Found 1 issues:
    [{'title': 'Create README file', 'number': 1}]
    Thought: I now know the correct repository and can create the file
    Action: Create File
    Action Input: <correct repository name>/test/README.md
    Observation: Unable to make file due to error:
    404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/repos/contents#get-repository-content"}
    Thought:
    ...

Expected behavior

The file README.md is created successfully in the repository.

dosubot[bot] commented 1 year ago

🤖

Hello,

Thank you for bringing this issue to our attention. The problem comes from how the create_file function in the GitHubAPIWrapper class is currently implemented. Before creating a new file, it calls get_contents to check whether a file already exists at that path. If the file does not exist, get_contents does not return None; it raises a GithubException with a status of 404. The branch that creates the file sits inside the same try block, so the exception skips it and lands in the generic except Exception handler, which returns the "Unable to make file due to error" message seen in the logs.

Here is the relevant code:

def create_file(self, file_query: str) -> str:
    file_path = file_query.split("\n")[0]
    file_contents = file_query[len(file_path) + 2 :]
    try:
        exists = self.github_repo_instance.get_contents(file_path)
        if exists is None:
            self.github_repo_instance.create_file(
                path=file_path,
                message="Create " + file_path,
                content=file_contents,
                branch=self.github_branch,
            )
            return "Created file " + file_path
        else:
            return f"File already exists at {file_path}. Use update_file instead"
    except Exception as e:
        return "Unable to make file due to error:\n" + str(e)
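
The failure can be reproduced without network access. The sketch below is only an illustration: FakeGithubException and FakeRepo are hypothetical stand-ins written for this example (they are not PyGithub classes), and create_file_original copies the control flow quoted above into a free function. It assumes, as the logs suggest, that a lookup for a missing path raises a 404 error rather than returning None.

```python
# Offline sketch. FakeGithubException and FakeRepo are hypothetical stand-ins,
# not part of PyGithub; they only mimic the behaviour relevant to this issue.
class FakeGithubException(Exception):
    """Carries a `status` attribute, like PyGithub's GithubException."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status} {message}")
        self.status = status


class FakeRepo:
    """In-memory repository whose get_contents raises 404 for missing paths."""

    def __init__(self) -> None:
        self.files: dict[str, str] = {}

    def get_contents(self, path: str) -> str:
        if path not in self.files:
            raise FakeGithubException(404, "Not Found")
        return self.files[path]

    def create_file(self, path: str, message: str, content: str, branch: str) -> None:
        self.files[path] = content


def create_file_original(repo: FakeRepo, file_query: str, branch: str = "main") -> str:
    """Same control flow as the wrapper method quoted above."""
    file_path = file_query.split("\n")[0]
    file_contents = file_query[len(file_path) + 2 :]
    try:
        exists = repo.get_contents(file_path)  # raises for a missing file
        if exists is None:  # never true here: a missing file raises instead
            repo.create_file(
                path=file_path,
                message="Create " + file_path,
                content=file_contents,
                branch=branch,
            )
            return "Created file " + file_path
        else:
            return f"File already exists at {file_path}. Use update_file instead"
    except Exception as e:
        return "Unable to make file due to error:\n" + str(e)


repo = FakeRepo()
result = create_file_original(repo, "README.md\n\n# Hello")
print(result)
```

With the stub, the call returns the "Unable to make file" message and the repository stays empty, which matches the behaviour in the run logs.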

To fix this issue, you need to handle the GithubException with a status of 404 separately. Here is an example of how you can do this:

from github import GithubException

def create_file(self, file_query: str) -> str:
    file_path = file_query.split("\n")[0]
    file_contents = file_query[len(file_path) + 2 :]
    try:
        exists = self.github_repo_instance.get_contents(file_path)
    except GithubException as e:
        if e.status == 404:  # File does not exist
            exists = None
        else:
            return "Unable to check if file exists due to error:\n" + str(e)
    except Exception as e:
        return "Unable to check if file exists due to error:\n" + str(e)

    if exists is None:
        try:
            self.github_repo_instance.create_file(
                path=file_path,
                message="Create " + file_path,
                content=file_contents,
                branch=self.github_branch,
            )
            return "Created file " + file_path
        except Exception as e:
            return "Unable to create file due to error:\n" + str(e)
    else:
        return f"File already exists at {file_path}. Use update_file instead"

This code will catch the GithubException with a status of 404 and set exists to None, allowing the function to proceed with creating the new file. If the GithubException has a status other than 404, or if another type of exception is raised, the function will return an error message.
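
To check the three outcomes described above without calling the GitHub API, the same logic can be run against a stub. This is a rough sketch under stated assumptions: FakeGithubException and FakeRepo are hypothetical test doubles (not PyGithub classes), fail_status is an invented knob for simulating non-404 errors, and create_file_fixed is a free-function copy of the suggested fix.

```python
from typing import Optional


# Hypothetical test doubles; these are not PyGithub classes.
class FakeGithubException(Exception):
    """Carries a `status` attribute, like PyGithub's GithubException."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status} {message}")
        self.status = status


class FakeRepo:
    """In-memory repository; fail_status (invented) forces a non-404 error."""

    def __init__(self, fail_status: Optional[int] = None) -> None:
        self.files: dict[str, str] = {}
        self.fail_status = fail_status

    def get_contents(self, path: str) -> str:
        if self.fail_status is not None:
            raise FakeGithubException(self.fail_status, "Simulated failure")
        if path not in self.files:
            raise FakeGithubException(404, "Not Found")
        return self.files[path]

    def create_file(self, path: str, message: str, content: str, branch: str) -> None:
        self.files[path] = content


def create_file_fixed(repo: FakeRepo, file_query: str, branch: str = "main") -> str:
    """Treat a 404 from the existence check as 'file is missing'."""
    file_path = file_query.split("\n")[0]
    file_contents = file_query[len(file_path) + 2 :]
    try:
        exists = repo.get_contents(file_path)
    except FakeGithubException as e:
        if e.status != 404:  # any other status is a real error
            return "Unable to check if file exists due to error:\n" + str(e)
        exists = None  # 404 means the path is free

    if exists is not None:
        return f"File already exists at {file_path}. Use update_file instead"
    try:
        repo.create_file(
            path=file_path,
            message="Create " + file_path,
            content=file_contents,
            branch=branch,
        )
        return "Created file " + file_path
    except Exception as e:
        return "Unable to create file due to error:\n" + str(e)


repo = FakeRepo()
first = create_file_fixed(repo, "README.md\n\n# Hello")   # missing file
second = create_file_fixed(repo, "README.md\n\n# Again")  # file now exists
denied = create_file_fixed(FakeRepo(fail_status=403), "README.md\n\n# Hello")
```

Keeping the existence check and the create call in separate try blocks is what lets the 404 be interpreted as "missing" while other failures are still reported.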

I hope this helps! If you have any other questions, feel free to ask.

Best, Dosu

dosubot[bot] commented 10 months ago

Hi, @qsbao,

I'm helping the LangChain team manage their backlog and am marking this issue as stale. From what I understand, you encountered a 404 error when using the create_file function in the GitHub toolkit to create a file that does not exist in the repository. I provided a detailed response, explaining the issue and suggesting a code modification to handle the GithubException with a status of 404. I also included code examples and a link to the relevant file in the repository.

Could you please confirm if this issue is still relevant to the latest version of the LangChain repository? If it is, please let the LangChain team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!