modelscope / modelscope-agent

ModelScope-Agent: An agent framework connecting models in ModelScope with the world
https://modelscope-agent.readthedocs.io/en/latest/
Apache License 2.0

CodexGraph - building is done but nothing was added to Neo4J #571

Open lambdaofgod opened 1 month ago

lambdaofgod commented 1 month ago

What happened + What you expected to happen

I am trying to index a repository with CodexGraph. Everything appears to be working: I tested the connection and it works fine, and the logs do not show any errors. However, after I run "build", I can't see any records added to Neo4j.

(Screenshot: 2024-08-11 14 44 52 localhost 8515bac636ce)

Part of the logs (note that I added the progress bars myself to check whether extraction was failing; apparently it was not):

Successfully processed /home/kuba/Projects/github_search/org/llama_prompting.py
Successfully processed /home/kuba/Projects/github_search/org/target_vocab.py
Successfully processed /home/kuba/Projects/github_search/org/json_utils.py
Successfully processed /home/kuba/Projects/github_search/org/tmp/f1.py
Successfully processed /home/kuba/Projects/github_search/org/.ipynb_checkpoints/promptify_runner-checkpoint.py
Successfully processed /home/kuba/Projects/github_search/org/pd_processing.py
Successfully processed /home/kuba/Projects/github_search/org/tmp/nbow_tasks.py
Successfully processed /home/kuba/Projects/github_search/org/.ipynb_checkpoints/promptify_utils-checkpoint.py
Building modules and classes: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 306/306 [00:07<00:00, 40.53it/s]
Building classes and methods: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 306/306 [00:00<00:00, 447.86it/s]
Building inherited methods: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 110/110 [00:00<00:00, 834.14it/s]
✍️ Shallow indexing (18 s)
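
One way to confirm whether the build wrote anything is to count the nodes per label directly in the database, independently of the Streamlit app. The following is a minimal sketch, not part of CodexGraph: it assumes py2neo (which appears in the requirements listed in this report), and the helper names `connect`, `count_nodes_by_label`, and `total_nodes` are hypothetical. The connection URL, credentials, and database name are placeholders that would need to match the values given to CodexGraph.

```python
from typing import Any, Dict, List

# Cypher query that counts nodes grouped by their label set.
COUNT_BY_LABEL = (
    "MATCH (n) "
    "RETURN labels(n) AS labels, count(*) AS count "
    "ORDER BY count DESC"
)


def connect(url: str, user: str, password: str, db_name: str) -> Any:
    """Open a py2neo connection to the given database.

    py2neo is imported lazily so the other helpers can be used
    (and tested) without a running Neo4j instance.
    """
    from py2neo import Graph

    return Graph(url, auth=(user, password), name=db_name)


def count_nodes_by_label(graph: Any) -> List[Dict[str, Any]]:
    """Return one row per label set, e.g. {"labels": [...], "count": 12}."""
    return graph.run(COUNT_BY_LABEL).data()


def total_nodes(rows: List[Dict[str, Any]]) -> int:
    """Sum the per-label counts; 0 means no nodes were found."""
    return sum(row["count"] for row in rows)
```

For example, `total_nodes(count_nodes_by_label(connect("bolt://localhost:7687", "<USERNAME>", "<PASSWORD>", "<DATABASE_NAME>")))` returning 0 would suggest that nothing was written to that particular database. It may also be worth checking that the database name used here is the same one selected in the Neo4j browser.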

Versions / Dependencies

Python 3.10.6 on Ubuntu

requirements.txt:

addict==2.4.0
aiohappyeyeballs==2.3.5
aiohttp==3.10.3
aiosignal==1.3.1
aliyun-python-sdk-core==2.15.1
aliyun-python-sdk-kms==2.16.3
altair==5.4.0
annotated-types==0.7.0
anyio==4.4.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
async-timeout==4.0.3
attrs==24.2.0
babel==2.16.0
backoff==2.2.1
beautifulsoup4==4.12.3
bleach==6.1.0
blinker==1.8.2
cachetools==5.4.0
certifi==2024.7.4
cffi==1.17.0
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
comm==0.2.2
contourpy==1.2.1
crcmod==1.7
cryptography==43.0.0
cycler==0.12.1
dashscope==1.20.3
dataclasses-json==0.6.7
datasets==2.20.0
debugpy==1.8.5
decorator==5.1.1
deepdiff==7.0.1
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.8
dirtyjson==1.0.8
distro==1.9.0
einops==0.8.0
emoji==2.12.1
et-xmlfile==1.1.0
exceptiongroup==1.2.2
executing==2.0.1
faiss-cpu==1.8.0.post1
fasteners==0.19
fastjsonschema==2.20.0
filelock==3.15.4
filetype==1.2.0
fonttools==4.53.1
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2024.6.1
gitdb==4.0.11
GitPython==3.1.43
greenlet==3.0.3
grpcio==1.65.4
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
huggingface-hub==0.24.5
idna==3.7
iniconfig==2.0.0
interchange==2021.0.4
ipykernel==6.29.5
ipython==8.18.1
ipywidgets==8.1.3
isoduration==20.11.0
jedi==0.17.2
jieba==0.42.1
Jinja2==3.1.4
jiter==0.5.0
jmespath==0.10.0
joblib==1.4.2
json5==0.9.25
jsonpatch==1.33
jsonpath-python==1.0.6
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.4
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==3.0.11
kiwisolver==1.4.5
langchain==0.2.12
langchain-community==0.2.11
langchain-core==0.2.29
langchain-experimental==0.0.64
langchain-text-splitters==0.2.2
langdetect==1.0.9
langsmith==0.1.98
llama-cloud==0.0.13
llama-index==0.10.64
llama-index-agent-openai==0.2.9
llama-index-cli==0.1.13
llama-index-core==0.10.64
llama-index-embeddings-openai==0.1.11
llama-index-indices-managed-llama-cloud==0.2.7
llama-index-legacy==0.9.48
llama-index-llms-openai==0.1.29
llama-index-multi-modal-llms-openai==0.1.9
llama-index-program-openai==0.1.7
llama-index-question-gen-openai==0.1.3
llama-index-readers-file==0.1.33
llama-index-readers-json==0.1.5
llama-index-readers-llama-parse==0.1.6
llama-index-retrievers-bm25==0.1.5
llama-parse==0.4.9
lxml==5.3.0
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.3
matplotlib==3.9.1.post1
matplotlib-inline==0.1.7
mdurl==0.1.2
mistune==3.0.2
modelscope==1.17.1
-e git+https://github.com/modelscope/modelscope-agent/@b0143952ef5dbfd1c191e898c40899d416fcbb61#egg=modelscope_agent
monotonic==1.6
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
narwhals==1.3.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.2.1
nltk==3.8.2
notebook==7.2.1
notebook_shim==0.2.4
numpy==1.26.4
openai==1.40.3
opencv-python==4.10.0.84
openpyxl==3.1.5
ordered-set==4.1.0
orjson==3.10.7
oss2==2.18.6
overrides==7.7.0
packaging==24.1
pandas==2.2.2
pandocfilters==1.5.1
pansi==2020.7.3
parso==0.7.0
pdfminer.six==20240706
pexpect==4.9.0
pillow==10.4.0
platformdirs==4.2.2
pluggy==1.5.0
prometheus_client==0.20.0
prompt_toolkit==3.0.47
protobuf==5.27.3
psutil==6.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
py2neo==2021.2.4
pyarrow==17.0.0
pyarrow-hotfix==0.6
pycparser==2.22
pycryptodome==3.20.0
pydantic==2.8.2
pydantic_core==2.20.1
pydeck==0.9.1
Pygments==2.18.0
pyparsing==3.1.2
pypdf==4.3.1
pytest==8.3.2
pytest-mock==3.14.0
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-iso639==2024.4.27
python-json-logger==2.0.7
python-magic==0.4.27
pytz==2024.1
PyYAML==6.0.2
pyzmq==26.1.0
qtconsole==5.5.2
QtPy==2.4.1
rank-bm25==0.2.2
rapidfuzz==3.9.6
referencing==0.35.1
regex==2024.7.24
requests==2.32.3
requests-toolbelt==1.0.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.1
rpds-py==0.20.0
safetensors==0.4.4
scipy==1.13.1
seaborn==0.13.2
Send2Trash==1.8.3
sentencepiece==0.2.0
simplejson==3.19.2
six==1.16.0
smmap==5.0.1
sniffio==1.3.1
sortedcontainers==2.4.0
soupsieve==2.5
SQLAlchemy==2.0.32
stack-data==0.6.3
streamlit==1.37.1
striprtf==0.0.26
tabulate==0.9.0
tenacity==8.5.0
terminado==0.18.1
tiktoken==0.7.0
tinycss2==1.3.0
tokenizers==0.19.1
toml==0.10.2
tomli==2.0.1
tornado==6.4.1
tqdm==4.66.5
traitlets==5.14.3
transformers==4.44.0
types-python-dateutil==2.9.0.20240316
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
unstructured==0.15.1
unstructured-client==0.25.4
uri-template==1.3.0
urllib3==2.2.2
watchdog==4.0.2
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
widgetsnbextension==4.0.11
wrapt==1.16.0
xxhash==3.4.1
yarl==1.9.4

Reproduction script

That would be pretty hard to provide, since everything happens in the CodexGraph Streamlit app.

Issue Severity

High: It blocks me from completing my task.

laptype commented 1 month ago

The log output you provided doesn't appear to show any issues. To investigate further, try building a single file with the following command:

<PYTHON_ENV_PATH> modelscope_agent\environment\graph_database\indexer\run_index_single.py --file_path <FILE_PATH> --root_path <ROOT_PATH> --task_id <TASK_ID> --url <DATABASE_URL> --user <USERNAME> --password <PASSWORD> --db_name <DATABASE_NAME> --env <PYTHON_ENV_PATH> --shallow

Parameter explanations:

<PYTHON_ENV_PATH>: The path to the Python environment (Python <= 3.9) where the required dependencies are installed.
<FILE_PATH>: The path to the file that needs to be processed.
<ROOT_PATH>: The root directory of the project, used to determine relative paths.
<TASK_ID>: A unique ID to identify the task.
<DATABASE_URL>: The URL of the Neo4j database, usually in the format bolt://<HOST>:<PORT>.
<USERNAME>: The username to connect to the Neo4j database.
<PASSWORD>: The password to connect to the Neo4j database.
<DATABASE_NAME>: The name of the Neo4j database to use.
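
If it is easier to drive this from a script, the command can be assembled as an argument list and run with `subprocess`, so that stdout and stderr are captured and any error from the indexer becomes visible. This is only a sketch: the flag names are taken from the command above, while the helper names `build_index_command` and `run_index_single` are hypothetical, and treating `--shallow` as optional is an assumption. The script path is passed in as-is, since the path separator depends on the operating system.

```python
import subprocess
from typing import List


def build_index_command(
    python_env_path: str,
    script_path: str,
    file_path: str,
    root_path: str,
    task_id: str,
    url: str,
    user: str,
    password: str,
    db_name: str,
    shallow: bool = True,
) -> List[str]:
    """Assemble the argument list for run_index_single.py.

    Flag names mirror the command quoted in this thread; nothing is
    validated here.
    """
    cmd = [
        python_env_path,
        script_path,
        "--file_path", file_path,
        "--root_path", root_path,
        "--task_id", task_id,
        "--url", url,
        "--user", user,
        "--password", password,
        "--db_name", db_name,
        "--env", python_env_path,
    ]
    if shallow:
        # Assumption: --shallow is a plain on/off flag, as in the command above.
        cmd.append("--shallow")
    return cmd


def run_index_single(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run the indexer and capture its output instead of discarding it."""
    return subprocess.run(cmd, capture_output=True, text=True)
```

After calling `run_index_single(cmd)`, inspecting `returncode`, `stdout`, and `stderr` on the result should show whether the single-file build failed silently.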

ccly1996 commented 1 month ago

Can CodexGraph support locally deployed models?