langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

GitLoader Not working #21331

Closed PrashantDixit0 closed 3 weeks ago

PrashantDixit0 commented 1 month ago


Example Code

from langchain_community.document_loaders import GitLoader

docs = GitLoader(
    clone_url=query_path,
    repo_path=temp_repo_dir,
    file_filter=lambda file_path: file_path.endswith(".py")
    or file_path.endswith(".md")
    or file_path.endswith(".js"),
)
docs = docs.load()

Error Message and Stack Trace (if applicable)

GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git clone -v -- https://github.com/antar-ai/yolo-examples.git ./example_data/test_repo1/
  stderr: 'Cloning into './example_data/test_repo1'...
POST git-upload-pack (175 bytes)
POST git-upload-pack (217 bytes)
error: RPC failed; curl 92 HTTP/2 stream 0 was not closed cleanly: CANCEL (err 8)
error: 3507 bytes of body are still expected
fetch-pack: unexpected disconnect while reading sideband packet
fatal: early EOF
fatal: fetch-pack: invalid index-pack output
'

Description

I am using GitLoader to load all the Python, JavaScript, and Markdown files in a repository, but loading fails with the error above, which seems to come from the package.

System Info

langchain==0.1.2
langchain-community==0.0.14
langchain-core==0.1.14

Platform: Linux
Python 3.11.4

jonppe commented 3 weeks ago

I decided to try your code, and it works after adding the argument branch="master" to GitLoader(). Perhaps you could double-check that it wasn't just a network issue, a full disk, or something similar.

Just in case you still think it's a bug, here are the versions I used:

Python 3.10.12
gitdb 4.0.11
GitPython 3.1.43
langchain 0.1.20
langchain-community 0.0.38
langchain-core 0.1.52
langchain-text-splitters 0.0.1
langsmith 0.1.57

PrashantDixit0 commented 3 weeks ago

@jonppe Thank you, I have solved this issue. It's just that GitLoader uses the main branch by default; if the repository has no main branch, the branch has to be specified manually.
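For anyone landing here later, here is a minimal sketch of one way to avoid hard-coding the branch. It assumes git is on the PATH and the remote is reachable. The helper names (parse_default_branch, detect_default_branch, wanted_file, load_repo) are made up for this example and are not part of LangChain; only GitLoader and its clone_url, repo_path, branch, and file_filter arguments come from the library. The idea is to ask the remote which branch HEAD points to with `git ls-remote --symref` and pass that to GitLoader, falling back to "main" if nothing is found.

```python
import subprocess
from typing import Optional


def parse_default_branch(ls_remote_output: str) -> Optional[str]:
    """Extract the branch name from `git ls-remote --symref <url> HEAD` output."""
    prefix = "ref: refs/heads/"
    for line in ls_remote_output.splitlines():
        if line.startswith(prefix):
            # The line looks like "ref: refs/heads/master\tHEAD".
            return line[len(prefix):].split("\t")[0].strip()
    return None


def detect_default_branch(clone_url: str) -> Optional[str]:
    """Ask the remote which branch HEAD points to (needs git and network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--symref", clone_url, "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_default_branch(result.stdout)


def wanted_file(file_path: str) -> bool:
    """Same filter as in the report: keep Python, Markdown, and JavaScript files."""
    return file_path.endswith((".py", ".md", ".js"))


def load_repo(clone_url: str, repo_path: str):
    """Clone the repository on its default branch and load the filtered files."""
    # Imported lazily so the helpers above work without langchain installed.
    from langchain_community.document_loaders import GitLoader

    # Fall back to GitLoader's own default if the remote reports nothing.
    branch = detect_default_branch(clone_url) or "main"
    loader = GitLoader(
        clone_url=clone_url,
        repo_path=repo_path,
        branch=branch,
        file_filter=wanted_file,
    )
    return loader.load()
```

Calling load_repo(query_path, temp_repo_dir) should then behave like the original snippet, with the branch filled in automatically. This does not help with a genuine network failure like the "RPC failed" message in the trace; that case still needs a retry or a check of the connection.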