vectara / vectara-ingest

An open source framework to crawl data sources and ingest into Vectara
https://vectara.com
Apache License 2.0

ModuleNotFoundError: no module named 'yaml' when running crawler #110

Closed: swarajban closed this 2 months ago

swarajban commented 2 months ago

Seeing this error when running run.sh:

bash run.sh config/vectara-notion.yaml default
Building for amd64 with buildx
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'yaml'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'yaml'
[+] Building 14.5s (8/18)

From the Docker container logs (the container exited immediately after the build with exit code 1):

2024-08-15 16:29:21 2024-08-15 23:29:21,317 - root - INFO - Starting the Crawler...
2024-08-15 16:29:21 INFO:root:Starting the Crawler...
2024-08-15 16:29:21 Traceback (most recent call last):
2024-08-15 16:29:21   File "/usr/local/lib/python3.10/dist-packages/toml/decoder.py", line 511, in loads
2024-08-15 16:29:21     ret = decoder.load_line(line, currentlevel, multikey,
2024-08-15 16:29:21   File "/usr/local/lib/python3.10/dist-packages/toml/decoder.py", line 737, in load_line
2024-08-15 16:29:21     raise ValueError("Invalid date or number")
2024-08-15 16:29:21 ValueError: Invalid date or number
2024-08-15 16:29:21 
2024-08-15 16:29:21 During handling of the above exception, another exception occurred:
2024-08-15 16:29:21 
2024-08-15 16:29:21 Traceback (most recent call last):
2024-08-15 16:29:21   File "/home/vectara/ingest.py", line 164, in <module>
2024-08-15 16:29:21     main()
2024-08-15 16:29:21   File "/home/vectara/ingest.py", line 92, in main
2024-08-15 16:29:21     env_dict = toml.load(f)
2024-08-15 16:29:21   File "/usr/local/lib/python3.10/dist-packages/toml/decoder.py", line 156, in load
2024-08-15 16:29:21     return loads(f.read(), _dict, decoder)
2024-08-15 16:29:21   File "/usr/local/lib/python3.10/dist-packages/toml/decoder.py", line 514, in loads
2024-08-15 16:29:21     raise TomlDecodeError(str(err), original, pos)
2024-08-15 16:29:21 toml.decoder.TomlDecodeError: Invalid date or number (line 4 column 1 char 21)

swarajban commented 2 months ago

Ah, I needed to pip install the yaml lib. This issue is solved.
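
For anyone hitting the same thing, here is a minimal sketch of a preflight check that could be run before `run.sh`. It assumes the failing `import yaml` happens in the host Python interpreter; the `File "<string>"` frames in the traceback suggest an inline `python -c` call, but that is an inference from the log, not something confirmed in this thread. The helper names (`REQUIRED`, `missing_modules`, `install_hints`) are hypothetical and not part of vectara-ingest. The one hard fact used is that the `yaml` module is provided by the PyYAML package.

```python
import importlib.util

# Map each import name to the pip package that provides it.
# The import name 'yaml' comes from the PyYAML distribution.
REQUIRED = {"yaml": "PyYAML"}


def missing_modules(required):
    """Return {import_name: pip_name} for top-level modules that cannot be found."""
    return {
        name: pkg
        for name, pkg in required.items()
        if importlib.util.find_spec(name) is None
    }


def install_hints(required):
    """Build one human-readable hint per missing module."""
    return [
        f"No module named '{name}': try `pip install {pkg}`"
        for name, pkg in missing_modules(required).items()
    ]


# Print a hint for each missing module; prints nothing if all are present.
for hint in install_hints(REQUIRED):
    print(hint)
```

If this prints nothing, the interpreter that ran it can already import `yaml`. Make sure to run it with the same `python` that `run.sh` invokes, since a different interpreter or virtual environment may have a different set of installed packages.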