Open xpilasneo4j opened 5 months ago
Hi @xpilasneo4j, can you elaborate on what error you got while trying the S3 bucket?
I have this message even if I copy the keys from the Neo4j AWS Field Eng account
We will check the logs and fix the issue ASAP.
Hi @xpilasneo4j, we debugged this and found that the bucket URL was being sent without the trailing /, and that the request failed if the user entered spaces. We fixed these issues in the dev branch; please try again and let us know.
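For anyone hitting the same two problems in their own code, the input cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the actual change made in the dev branch; the function name `normalize_s3_url` is hypothetical.

```python
def normalize_s3_url(raw_url: str) -> str:
    """Trim surrounding whitespace and make sure the URL ends with '/'.

    Hypothetical helper for illustration; not taken from the project code.
    """
    url = raw_url.strip()
    if not url.startswith("s3://"):
        raise ValueError(f"Expected an s3:// URL, got: {raw_url!r}")
    if not url.endswith("/"):
        url += "/"
    return url


# A URL pasted with stray spaces and no trailing slash is cleaned up.
print(normalize_s3_url("  s3://nasa-lessons-learned-files "))
```

Normalizing once at the point where the URL is received means later code can assume a single, consistent form.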
Will try and let you know. Thanks
Did you update this URL: https://dev-frontend-dcavk67s4a-uc.a.run.app/ ? I tried it and still get the same error.
It is working for us. We will debug it with some other S3 bucket credentials.
I still seem to be getting this error while running the latest dev branch. I tried with and without the trailing /.
any ideas what I'm missing?
Cheers
If you don't mind sharing your credentials, we will try them and let you know exactly what is happening.
Thanks, unfortunately that's not something I can share here, but you've just made me realise I can check the logs by running it locally.
It's throwing this error
An error occurred (SignatureDoesNotMatch) when calling the ListObjectsV2 operation: The request signature we calculated does not match the signature you provided. Check your key and signing method.
Are there specific permissions the user of the key needs? I've just given it readS3 to try and test it
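On the permissions question: in general, `ListObjectsV2` is authorized by the `s3:ListBucket` action on the bucket, and downloading objects needs `s3:GetObject` on the objects in it. A read-only policy along those lines might look like the sketch below. The bucket name is a placeholder, and whether this tool needs anything beyond these two actions is an assumption, not something confirmed in this thread. Note also that `SignatureDoesNotMatch` usually points at the secret key or the way the request was signed, while missing permissions normally surface as `AccessDenied`.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build a minimal read-only IAM policy document for one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing keys is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                # Reading object contents is granted on the objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


# "my-example-bucket" is a placeholder name.
print(json.dumps(read_only_bucket_policy("my-example-bucket"), indent=2))
```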
Sorry, I probably should have added that I'm not running the absolute latest dev branch, as it doesn't build in Docker (I see you're already aware of that issue, so hopefully you've connected the dots).
I'm running this commit https://github.com/neo4j-labs/llm-graph-builder/commit/501ece4b57ce50a958e44d799303125395d02735 and having the error above. If it's fixed in latest dev all good, I'll just have to wait until that build issue is fixed.
Hi @fridaystreet, can you try on the latest dev branch and let us know?
@kartikpersistent thanks, I'll give it a try today and report back.
Looks like the actual connection is working now, thanks. But I think maybe I'm just expecting too much from it. I'm trying to test whether it can scan through some buckets we have of uploaded data of different types (images, PDFs, Word documents, etc.), but I'm getting the following error: Exception: No pdf files found.
The files don't have extensions in their names but do have the correct content type metadata. I'll do some more playing around. But with regard to this particular issue of connecting to S3, I'd say it's resolved, thanks.
backend | 2024-10-22 22:39:55,370 - Use pytorch device_name: cpu
backend | 2024-10-22 22:39:55,370 - Load pretrained SentenceTransformer: all-MiniLM-L6-v2
frontend | 192.168.65.1 - - [22/Oct/2024:22:39:56 +0000] "GET /service-worker-dev.js HTTP/1.1" 304 0 "http://localhost:8080/service-worker-dev.js" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36" "-"
backend | 2024-10-22 22:39:57,937 - Embedding: Using SentenceTransformer , Dimension:384
backend | 2024-10-22 22:39:57,937 - embedding model:client=SentenceTransformer(
backend | (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
backend | (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
backend | (2): Normalize()
backend | ) model_name='all-MiniLM-L6-v2' cache_folder=None model_kwargs={} encode_kwargs={} multi_process=False show_progress=False and dimesion:384
backend | 2024-10-22 22:39:58,122 - Enable Communities False
backend | 2024-10-22 22:39:58,122 - Communities are disabled or GDS is not available in the database.
backend | 2024-10-22 22:39:58,122 - Checking access for database: neo4j
backend | 2024-10-22 22:39:58,365 - Read access count: 0
backend | 2024-10-22 22:39:58,366 - The account has write access.
backend | 2024-10-22 22:39:58,379 - Get existing files list from graph
backend | 2024-10-22 22:40:01,691 - closing connection for sources_list api
backend | 2024-10-22 22:40:01,693 - Get existing files list from graph
backend | 2024-10-22 22:41:16,210 - file_name : 05931530-7579-11eb-9abb-611bca4c3fa7 and file key : Domain::5fb497f5f1fa7800076a548c/05931530-7579-11eb-9abb-611bca4c3fa7
backend | 2024-10-22 22:41:16,210 - file_name : 212f6960-7579-11eb-9abb-611bca4c3fa7 and file key : Domain::5fb497f5f1fa7800076a548c/212f6960-7579-11eb-9abb-611bca4c3fa7
backend | 2024-10-22 22:41:16,210 - file_name : f37eed80-7599-11eb-9a6c-63fe91573d2c and file key : Domain::5fb497f5f1fa7800076a548c/f37eed80-7599-11eb-9a6c-63fe91573d2c
backend | 2024-10-22 22:41:16,210 - file_name : 5a96eb60-66a1-11eb-9777-af61b5d771c0 and file key : Domain::5fb4c515f5a1c20007f94f73/5a96eb60-66a1-11eb-9777-af61b5d771c0
backend | 2024-10-22 22:41:16,211 - file_name : 5d82e7c0-66a1-11eb-9777-af61b5d771c0 and file key : Domain::5fb4c515f5a1c20007f94f73/5d82e7c0-66a1-11eb-9777-af61b5d771c0
backend | 2024-10-22 22:41:16,211 - file_name : 75956a40-2774-11ee-a086-d907b56d49e3 and file key : Domain::5fb4c515f5a1c20007f94f73/75956a40-2774-11ee-a086-d907b56d49e3
backend | 2024-10-22 22:41:16,211 - file_name : 7596f0e0-2774-11ee-a8dc-1f649a592958 and file key : Domain::5fb4c515f5a1c20007f94f73/7596f0e0-2774-11ee-a8dc-1f649a592958
backend | 2024-10-22 22:41:16,211 - file_name : undefined and file key : Domain::5fb4c515f5a1c20007f94f73/undefined
backend | 2024-10-22 22:41:16,212 - Exception Stack trace:
backend | Traceback (most recent call last):
backend | File "/code/score.py", line 108, in create_source_knowledge_graph_url
backend | lst_file_name,success_count,failed_count = await asyncio.to_thread(create_source_node_graph_url_s3,graph, model, source_url, aws_access_key_id, aws_secret_access_key, source_type
backend | File "/usr/local/lib/python3.10/asyncio/threads.py", line 25, in to_thread
backend | return await loop.run_in_executor(None, func_call)
backend | File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
backend | result = self.fn(*self.args, **self.kwargs)
backend | File "/code/src/main.py", line 45, in create_source_node_graph_url_s3
backend | raise Exception('No pdf files found.')
backend | Exception: No pdf files found.
However, the above error still shows up in the front end as the original invalid credentials error, which is a bit confusing. If it could return the correct error, that might save people some time when they run into issues.
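To illustrate the suggestion, one way to keep the two failure modes apart is to raise distinct exception types and map each to its own message. The sketch below is hypothetical: the exception classes, the `error_response` function, and the payload shape are all made up for illustration and are not taken from the project's backend.

```python
class S3CredentialsError(Exception):
    """Hypothetical: AWS rejected the access key or request signature."""


class NoSupportedFilesError(Exception):
    """Hypothetical: the bucket was reachable but held no usable files."""


def error_response(exc: Exception) -> dict:
    """Turn a backend exception into a payload the front end can display."""
    if isinstance(exc, S3CredentialsError):
        message = "Invalid AWS credentials. Check the access key and secret key."
    elif isinstance(exc, NoSupportedFilesError):
        message = "Connected to the bucket, but no PDF files were found."
    else:
        message = f"Unexpected error: {exc}"
    return {"status": "Failed", "message": message}


# The "no files" case no longer looks like a credentials problem.
print(error_response(NoSupportedFilesError("No pdf files found.")))
```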
Just for anyone else getting here, and apologies if this was obvious somewhere and I've missed it: it only appears to scan PDF files in the S3 bucket, and they must have a .pdf extension in the name. It's not picking them up from the content type.
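If you need to handle keys without extensions in your own code, a check that falls back to the content type could look roughly like this. It is a sketch of the idea, not how the tool currently behaves, and `looks_like_pdf` is a hypothetical name. Keep in mind that an S3 object listing does not include the content type, so it would have to be fetched per object (for example from the `ContentType` field of a boto3 `head_object` response) and passed in.

```python
from typing import Optional


def looks_like_pdf(key: str, content_type: Optional[str] = None) -> bool:
    """Return True if the key ends in .pdf or the content type is PDF.

    Hypothetical helper; content_type must be looked up separately because
    the object listing itself does not carry it.
    """
    if key.lower().endswith(".pdf"):
        return True
    if content_type is None:
        return False
    # Drop any parameters after ';' before comparing the media type.
    return content_type.split(";")[0].strip().lower() == "application/pdf"


# A key with no extension is accepted when its metadata says PDF.
print(looks_like_pdf("05931530-7579-11eb-9abb-611bca4c3fa7", "application/pdf"))
```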
@aashipandya
We only extract PDF files from the bucket.
I created an S3 bucket on the Field-Engineering-Pro-Services AWS account, but using the AWS access and secret keys, I can't connect the website to my bucket s3://nasa-lessons-learned-files