aws / graph-explorer

React-based web application that enables users to visualize both property graph and RDF data and explore connections between data without having to write graph queries.
https://github.com/aws/graph-explorer
Apache License 2.0
317 stars, 47 forks

[Bug] Unable to Synchronize Graph #219

Closed RonLek closed 7 months ago

RonLek commented 10 months ago


Describe the bug: I'm trying to visualize data from my Neptune DB using graph-explorer. I'm running graph-explorer on an EC2 instance within the same VPC as my Neptune cluster, and I'm creating an SSH tunnel from my local machine to port 8182 of this instance for the connection. I can verify the connection by getting a response from https://<NEPTUNE_ENDPOINT>:8182/status in my browser.
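The same status check can be scripted instead of run from the browser. The following is a minimal sketch, not part of graph-explorer: the helper names `status_url` and `check_status` are hypothetical, and it assumes the `/status` endpoint returns a JSON document.

```python
import json
import urllib.request


def status_url(endpoint: str, port: int = 8182) -> str:
    """Build the status URL for a Neptune host name or tunnel address."""
    return f"https://{endpoint}:{port}/status"


def check_status(endpoint: str, port: int = 8182, timeout: float = 5.0) -> dict:
    """Fetch the status document and return it as a dict (assumed to be JSON).

    Raises urllib.error.URLError if the endpoint is unreachable,
    for example when the SSH tunnel is down.
    """
    with urllib.request.urlopen(status_url(endpoint, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Note that when the request goes through a tunnel to a local address, TLS host name verification may fail because the certificate is issued for the Neptune endpoint, not for the tunnel address.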

However, the synchronization fails. The Network tab looks like the following

image

The first gremlin request gets a correct response but the second one fails with a 400. Additionally, the summary request fails due to a CORS error. Connectivity to the Neptune cluster is not a problem since the first gremlin request succeeds.

My graph has approximately 3,500 nodes and edges, which I presume is not large enough to cause timeouts.

Expected behavior: A successful synchronization of the data.

RonLek commented 10 months ago

@triggan the setup in the README didn't seem to work for me. The prerequisites point to setting up an SSH tunnel with the EC2 instance here; however, on visiting the EC2 public IP, requests seem to hit localhost even with the "Using Proxy Server" box checked. So I resorted to creating a direct tunnel with the EC2 instance.

The only thing that I can spot from the Network tab is a 400 bad request. I don't see anything new in the container logs.

image

triggan commented 10 months ago

You don't need an SSH tunnel to the EC2 instance if you're using Graph Explorer's proxy.

One thing to realize is that the queries used to fetch data into Graph Explorer are not sent from the server side of Graph Explorer. They are triggered by the Graph Explorer code that is running in your browser. The proxy exists so that requests from your browser go to the Graph Explorer proxy on EC2, which then forwards them to Neptune. Graph Explorer does not have a middle-tier / API layer that sends requests to Neptune. All of the query requests go from the browser to Neptune (either directly, or via the proxy). This is why you will not see 400 errors in the container. The 400 errors would only appear in your browser's console.
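The routing described above can be sketched as a small decision function. This is an illustration only, not Graph Explorer's actual code: the `Connection` class, its field names, and `browser_request_target` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    # Hypothetical settings, loosely mirroring a connection form.
    graph_db_url: str                 # e.g. the Neptune endpoint
    proxy_url: Optional[str] = None   # e.g. the Graph Explorer proxy on EC2
    using_proxy: bool = False


def browser_request_target(conn: Connection) -> str:
    """Return the address the browser sends its query requests to."""
    if conn.using_proxy:
        if not conn.proxy_url:
            raise ValueError("using_proxy is set but proxy_url is empty")
        # The proxy then forwards the request to conn.graph_db_url.
        return conn.proxy_url
    # Without a proxy, the browser must reach the database directly.
    return conn.graph_db_url
```

In this model, a proxy URL of https://localhost makes the browser talk to the user's own machine rather than to the EC2 instance, because the request originates in the browser.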

RonLek commented 10 months ago

Thanks @triggan, that clears it up. I was setting the proxy endpoint to https://localhost instead of the public IP of the EC2 instance. The summary request now succeeds; however, I get an ERR_CONNECTION_RESET for the second gremlin query.

image

My EC2 instance can send and receive on ports 80, 443, 8182 and 22. I've also added the certificate to my Keychain. I'm not sure what's causing this.

Update: I think this is due to the size of the graph (~3.5k nodes). I tried reducing the gremlin query size and hitting the same endpoint, and it did return a valid response. Is there any way to work around this?
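One generic way to reduce the size of each Gremlin request is to page the traversal with the `range()` step. The sketch below only builds query strings and JSON request bodies. It is not how Graph Explorer synchronizes, and the function names `paged_vertex_queries` and `gremlin_payload` are hypothetical.

```python
import json
from typing import Iterator


def paged_vertex_queries(total: int, page_size: int) -> Iterator[str]:
    """Yield Gremlin traversals that each cover one slice of the vertices."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for low in range(0, total, page_size):
        high = min(low + page_size, total)
        yield f"g.V().range({low}, {high})"


def gremlin_payload(query: str) -> str:
    """Wrap a traversal in a JSON body of the form {"gremlin": "<query>"}."""
    return json.dumps({"gremlin": query})


# Example: split roughly 3,500 vertices into pages of 1,000.
for query in paged_vertex_queries(3500, 1000):
    print(gremlin_payload(query))
```

Without an explicit ordering step, the slices are not guaranteed to be stable between requests, so this is best treated as a diagnostic aid rather than a complete export.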

cubeddu commented 9 months ago

"think this is due to the size of the graph (~3.5k nodes). I tried reducing the gremlin query size and hitting the same endpoint and it did give out a valid response. Is there any way to go around this?"

Hi @RonLek, thank you for reporting the issue! To help us triage it better, could you please provide some additional information:

This additional information will help us diagnose the problem and provide a solution more quickly.

xiazcy commented 7 months ago

I am closing this issue, as it appears to have been a configuration issue initially, and the graph size issue should be fixed by https://github.com/aws/graph-explorer/issues/197 in versions 1.5.0+, as @cubeddu suggested. Please feel free to reopen if the issue persists after trying the new versions.