Closed john-b-yang closed 2 months ago
Weird. I tried to reproduce this but was able to generate embeddings without issue. Can you enable logging with `logging.basicConfig(level=logging.INFO)` and provide the logs?
Oh cool, ok, that produced some warnings:

```
INFO:moatless.index.code_index:Initiated CodeIndex None with:
 * 0 classes
 * 0 functions
 * 0 vectors
INFO:moatless.index.code_index:Read 82 documents
WARNING:llama_index.core.node_parser.node_utils:Failed to use epic splitter to split docs/conf.py. Fallback to treesitter_split(). Error: too many values to unpack (expected 2)
WARNING:llama_index.core.node_parser.node_utils:Failed to use epic splitter to split examples/celery/make_celery.py. Fallback to treesitter_split(). Error: too many values to unpack (expected 2)
```

(And then many more repetitions of this error message.)
Perhaps I didn't install something correctly?
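The warnings suggest the splitter is wrapped in a try/except that falls back to tree-sitter when the epic splitter throws. A rough sketch of that pattern, with `split_with_epic` and `treesitter_split` as hypothetical stand-ins for the real moatless/llama_index internals:

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("llama_index.core.node_parser.node_utils")

# Hypothetical stand-ins for the real splitters (not the actual
# moatless/llama_index implementations).
def split_with_epic(text: str) -> list[str]:
    raise ValueError("too many values to unpack (expected 2)")

def treesitter_split(text: str) -> list[str]:
    return text.split("\n\n")

def split_document(path: str, text: str) -> list[str]:
    try:
        return split_with_epic(text)  # structure-aware splitter, tried first
    except Exception as exc:
        # Mirrors the WARNING lines seen in the logs
        logger.warning(
            "Failed to use epic splitter to split %s. "
            "Fallback to treesitter_split(). Error: %s", path, exc,
        )
        return treesitter_split(text)

chunks = split_document("docs/conf.py", "block one\n\nblock two")
```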
Ah ok so the error is coming from here. Will play around with it a bit more.
Update: I realized the main problem is just that I was developing on mac haha, I switched to a linux machine and it's all good!
I think this line was throwing the error. `captures` is a dictionary, not a list of tuples(?). I tried changing it to `captures.items()` but was still unable to produce the result, and I didn't look further. It might've been because I was using `tree-sitter-python==0.23.2` (`0.21.0`, which is required by this repo, is not supported on ARM, discussed here).
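For what it's worth, the unpack error is consistent with the return shape of `Query.captures` changing between py-tree-sitter releases (a list of `(node, name)` tuples in older versions, a dict of `name -> [nodes]` in newer ones). A minimal sketch of the mismatch, using plain strings as stand-ins for real tree-sitter `Node` objects:

```python
# Old-style captures: list of (node, capture_name) tuples.
old_style_captures = [("<node A>", "function.name"), ("<node B>", "class.name")]

# New-style captures: dict mapping capture_name -> list of nodes.
new_style_captures = {"function.name": ["<node A>"], "class.name": ["<node B>"]}

# Code written for the old shape iterates tuples and works fine:
for node, name in old_style_captures:
    pass

# The same loop over the new dict iterates its *keys* (strings), so each
# capture name gets unpacked character-by-character and raises ValueError.
try:
    for node, name in new_style_captures:
        pass
except ValueError as exc:
    error_message = str(exc)  # "too many values to unpack (expected 2)"

# Adapting to the new shape:
for name, nodes in new_style_captures.items():
    for node in nodes:
        pass  # equivalent traversal
```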
Aha, I got the same error when I tried to upgrade tree-sitter.
Thanks for all the really inspiring work on SWE-bench + programming agents 😄
I had a quick question. I'm trying to run the `00_index_and_run.ipynb` notebook against the flask repository. I've done the following steps:

1. Cloned `pallets/flask` locally
2. Put my `OPENAI_API_KEY` in a `.env` file located within `notebooks/`
An OPENAI_API_KEY is required to use the OpenAI Models
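If the key isn't being picked up, it may help to confirm the `.env` file is actually loaded into the environment. A minimal sketch of what such a loader does (the notebook presumably uses python-dotenv's `load_dotenv()`, which is more robust; `demo.env` below is a throwaway file for illustration):

```python
import os
from pathlib import Path

def load_env_file(path: str) -> None:
    # Parse simple KEY=VALUE lines, skipping blanks and comments.
    # (Unlike this sketch, python-dotenv does not override existing
    # variables by default.)
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()

# Demonstration with a throwaway file:
Path("demo.env").write_text("OPENAI_API_KEY=sk-demo\n")
load_env_file("demo.env")
print(os.environ["OPENAI_API_KEY"])  # sk-demo
```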
```python
model = "gpt-4o-2024-05-13"
index_settings = IndexSettings(embed_model="text-embedding-3-small")

repo_dir = "/absolute/path/to/flask"
file_repo = FileRepository(repo_path=repo_dir)

code_index = CodeIndex(file_repo=file_repo, settings=index_settings)
nodes, tokens = code_index.run_ingestion()
print(f"Indexed {nodes} nodes and {tokens} tokens")
```