Closed: fahdmirza closed this issue 1 week ago
Hey @fahdmirza - thanks for flagging. I will look into reproducing; my guess is that the release has fallen slightly out of date.
Tested this today; the issue is solved with the latest image. Can you confirm?
Now it has this error:
```
(r2r) Ubuntu@0068-kci-prxmx10127:~/R2R$ python3 -m r2r.examples.quickstart ingest_as_files --no-media=true --config_name=local_ollama
2024-06-23 21:22:45,567 - INFO - r2r.core.providers.vector_db_provider - Initializing VectorDBProvider with config extra_fields={} provider='pgvector' collection_name='demo_vecs'.
2024-06-23 21:22:45,624 - INFO - r2r.core.providers.embedding_provider - Initializing EmbeddingProvider with config extra_fields={'text_splitter': {'type': 'recursive_character', 'chunk_size': 512, 'chunk_overlap': 20}} provider='ollama' base_model='mxbai-embed-large' base_dimension=1024 rerank_model=None rerank_dimension=None rerank_transformer_type=None batch_size=32.
2024-06-23 21:22:46,622 - INFO - r2r.core.providers.llm_provider - Initializing LLM provider with config: extra_fields={} provider='litellm'
R2RApp.init, config = <r2r.main.assembly.config.R2RConfig object at 0x7b950ee10e50>
ERROR: Could not consume arg: ingest_as_files
Usage: quickstart.py ingest_as_files - <group|command>
  available groups:   USER_IDS | default_files | file_tuples | r2r_app | user_ids
  available commands: analytics | app_settings | delete | document_chunks | documents_overview | evaluate | ingest_documents | ingest_files | logs | rag | search | serve | update_documents | update_files | users_overview

For detailed information on this command, run:
  quickstart.py ingest_as_files - --help
```
Hey Fahd,
Can you confirm whether or not you are still seeing issues now that the latest Docker image has been published today?
I am trying to install R2R with Ollama locally, following this document: https://r2r-docs.sciphi.ai/cookbooks/local-rag
Could you confirm whether this document is up to date and correct? Even when followed to the letter, it produces errors.
Do we need to clone the Git repo to get it working with Ollama?
I already have Ollama running on my system.
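As a first step, it can help to confirm the local Ollama server is actually reachable and has the needed models pulled. A minimal stdlib-only sketch, assuming Ollama's default port 11434 and its `/api/tags` endpoint (the helper names here are mine, not part of R2R):

```python
# Sanity check: is a local Ollama server running, and which models has it pulled?
# Assumes Ollama's default port 11434; helper names are illustrative, not R2R's.
import json
import urllib.request

def parse_model_names(payload):
    """Extract model names from Ollama's /api/tags response payload."""
    return [m["name"] for m in payload.get("models", [])]

def list_local_models(base_url="http://localhost:11434"):
    """Ask the running Ollama server which models it has available."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(json.load(resp))

if __name__ == "__main__":
    try:
        print(list_local_models())
    except OSError:
        print("Ollama does not appear to be running on localhost:11434")
```

If this prints an empty list, the embedding model still needs to be pulled (e.g. `ollama pull mxbai-embed-large`).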
```shell
conda create -n r2r python=3.11 -y && conda activate r2r
pip install 'r2r[all]'
pip install 'r2r[local-embedding]'
mkdir r2r
cd R2R
touch local_ollama
```
-- and then pasted the below config into the local_ollama file:
```json
{
  "embedding": {
    "provider": "sentence-transformers",
    "base_model": "all-MiniLM-L6-v2",
    "base_dimension": 384,
    "batch_size": 32
  },
  "eval": {
    "provider": "local",
    "frequency": 0.0,
    "llm": {
      "provider": "litellm"
    }
  },
  "ingestion": {
    "excluded_parsers": {
      "gif": "default",
      "jpeg": "default",
      "jpg": "default",
      "png": "default",
      "svg": "default",
      "mp3": "default",
      "mp4": "default"
    }
  }
}
```
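One thing worth checking in a config like this is that `base_dimension` matches the embedding model: `all-MiniLM-L6-v2` emits 384-dimensional vectors, while the `mxbai-embed-large` model seen in the logs emits 1024-dimensional ones, and a mismatch against the pgvector collection will fail at ingestion time. A small illustrative check (the table and helper are mine, not part of R2R):

```python
# Illustrative sanity check, not part of R2R: verify base_dimension matches
# the chosen embedding model before ingesting anything.
KNOWN_DIMS = {
    "all-MiniLM-L6-v2": 384,    # sentence-transformers model from this config
    "mxbai-embed-large": 1024,  # Ollama model seen in the logs above
}

def check_embedding_config(cfg):
    """Raise ValueError if the configured dimension contradicts the model."""
    emb = cfg["embedding"]
    expected = KNOWN_DIMS.get(emb["base_model"])
    if expected is not None and emb["base_dimension"] != expected:
        raise ValueError(
            f"{emb['base_model']} produces {expected}-dim vectors, "
            f"but base_dimension is {emb['base_dimension']}"
        )

# The config above passes the check:
check_embedding_config({
    "embedding": {
        "provider": "sentence-transformers",
        "base_model": "all-MiniLM-L6-v2",
        "base_dimension": 384,
    }
})
```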
and then I ran the following command:
```shell
python -m r2r.examples.quickstart ingest_as_files --no-media=true --config_name=local_ollama
```
```
extractions in t=5.57 seconds.
2024-06-21 07:39:39,108 - r2r.pipes.embedding_pipe - INFO - Fragmented the input document ids into counts as shown: {UUID('f0c63aff-af59-50c9-81fc-2fe55004c771'): 17, UUID('c9bdbac7-0ea3-5c9e-b590-018bd09b127b'): 233, UUID('b722f1ec-b90e-5ed8-b7c8-c768e8b323cb'): 5, UUID('c996e617-88a4-5c65-ab1e-948344b18d27'): 3108, UUID('ba77307d-6c8a-549f-812a-3558697e2842'): 23, UUID('4a4fb848-fc03-5487-a7e5-33c9fdfb73cc'): 31, UUID('1a9d4d3b-bbe9-53b9-8149-67806bdf60f2'): 18, UUID('ef66e5dd-2130-5fd5-9bdd-aa7eff59fda5'): 11, UUID('c5abc0b7-b9e5-54d9-b3d3-fdb14af4d065'): 2094}
2024-06-21 07:39:40,005 Time taken to ingest files: 31.94 seconds
{'processed_documents': ["File 'got.txt' processed successfully.", "File 'aristotle.txt' processed successfully.", "File 'pg_essay_1.html' processed successfully.", "File 'pg_essay_2.html' processed successfully.", "File 'pg_essay_3.html' processed successfully.", "File 'pg_essay_4.html' processed successfully.", "File 'pg_essay_5.html' processed successfully.", "File 'lyft_2021.pdf' processed successfully.", "File 'uber_2021.pdf' processed successfully."], 'skipped_documents': []}
ERROR: Could not consume arg: --config_name=local_ollama
Usage: quickstart.py ingest_as_files --no-media=true -
```
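For what it's worth, the `ERROR: Could not consume arg` message and the `<group|command>` usage text look like output from Google's python-fire library, which the quickstart appears to use for its CLI: Fire reports this error when an argument or flag does not map onto any parameter of the invoked command, so `--config_name` may simply not be a parameter of `ingest_as_files` in the installed version. A minimal sketch of how that error arises (class and parameter names here are hypothetical, not R2R's actual code):

```python
# Hypothetical sketch of a Fire-based CLI like quickstart.py. Running
#   python this_script.py ingest_as_files --config_name=local_ollama
# would fail with "ERROR: Could not consume arg: --config_name=local_ollama"
# because ingest_as_files() has no config_name parameter.
try:
    import fire  # pip install fire
except ImportError:
    fire = None  # fire not installed; the class below still shows the shape

class Quickstart:
    def ingest_as_files(self, no_media=False):
        return f"ingested (no_media={no_media})"

if __name__ == "__main__" and fire is not None:
    fire.Fire(Quickstart)
```

If that is the cause, `python -m r2r.examples.quickstart ingest_as_files -- --help` should show which flags the installed command actually accepts.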