[Bug] Unable to Synchronize Graph

RonLek commented 10 months ago

Community Note

Please use a 👍 reaction to provide a +1/vote. This helps the community and maintainers prioritize this request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.

Describe the bug

I'm trying to visualize data from my Neptune DB using graph-explorer. I'm running graph-explorer on an EC2 instance within the same VPC as my Neptune cluster and creating an SSH tunnel from my local machine to port 8182 of this instance for the connection. I can verify the connection by getting a response from https://<NEPTUNE_ENDPOINT>:8182/status in my browser.
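The same status check can be scripted from the machine that holds the tunnel. This is a minimal sketch, assuming the endpoint answers on port 8182 and returns a JSON body; `status_url` and `check_status` are hypothetical helper names for illustration, not part of Graph Explorer or any Neptune tooling, and the host you pass in is a placeholder for your own endpoint.

```python
import json
import ssl
import urllib.request


def status_url(host: str, port: int = 8182) -> str:
    """Build the status URL for the given host (hypothetical helper)."""
    return f"https://{host}:{port}/status"


def check_status(host: str, port: int = 8182, timeout: float = 5.0) -> dict:
    """Fetch /status and return the parsed JSON body.

    Assumption: certificate verification is switched off because a tunnelled
    connection through localhost will not match the certificate's host name.
    Only do this for a quick connectivity test.
    """
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    url = status_url(host, port)
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        return json.load(resp)
```

Calling `check_status("localhost")` while the tunnel is open should return the status document if the connection works. A timeout or connection error here points at networking rather than at Graph Explorer.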

However, the synchronization fails. The Network tab shows the following:

The first gremlin request gets a correct response but the second one fails with a 400. Additionally, the summary request fails due to a CORS error. Connectivity to the Neptune cluster is not a problem since the first gremlin request succeeds.

My graph has approximately 3,500 nodes and edges, which I presume should not be large enough to cause timeouts.

OS: Amazon Linux 2
Browser: Google Chrome
Graph Notebook Version: Built from current source
Graph Database & Version: Amazon Neptune 1.2.1.0

Expected behavior

A successful synchronization of the data.

RonLek commented 10 months ago

@triggan the setup in the README didn't seem to work for me (the prerequisites point to setting up an SSH tunnel with the EC2 instance here; however, on visiting the EC2 public IP, requests seem to hit localhost even with the "Using Proxy Server" box checked), so I resorted to creating a direct tunnel with the EC2 instance.

The only thing that I can spot from the Network tab is a 400 bad request. I don't see anything new in the container logs.

triggan commented 10 months ago

You don't need an SSH tunnel to the EC2 instance if you're using Graph Explorer's proxy.

One thing to realize is that the queries used to fetch data into Graph Explorer are not sent by the Graph Explorer application on the server. They are triggered by the Graph Explorer code that is running in your browser. The proxy exists so that requests from your browser can reach the Graph Explorer proxy on EC2, which then forwards them to Neptune. Graph Explorer does not have a middle-tier / API layer that sends requests to Neptune. All of the query requests go from the browser to Neptune (either directly, or via the proxy). This is why you will not see 400 errors in the container. The 400 errors would only appear in your browser's console.
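Because the requests originate in the browser, every endpoint in the connection settings is resolved on the machine running the browser. A rough sketch of that routing rule follows, assuming a connection is described by a Neptune endpoint plus an optional proxy endpoint. The function names and example addresses are hypothetical and do not mirror Graph Explorer's actual code.

```python
from typing import Optional
from urllib.parse import urlparse

# Host names that always resolve to the machine making the request.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def request_base_url(neptune_endpoint: str, proxy_endpoint: Optional[str]) -> str:
    """Return the URL the browser sends queries to.

    With a proxy configured, the browser talks to the proxy and the proxy
    forwards to Neptune; otherwise the browser talks to Neptune directly.
    """
    return proxy_endpoint if proxy_endpoint else neptune_endpoint


def points_at_browser_machine(url: str) -> bool:
    """True if the URL's host is a loopback name, i.e. the browser's own machine."""
    return urlparse(url).hostname in LOOPBACK_HOSTS
```

For example, `points_at_browser_machine("https://localhost")` is true, so a proxy endpoint of that form only works when the browser runs on the same machine as the proxy.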

RonLek commented 10 months ago

Thanks @triggan, that clears it up. I was setting the proxy endpoint to https://localhost instead of the public IP of the EC2 instance. The summary request now succeeds; however, I get an ERR_CONNECTION_RESET for the second Gremlin query.

My EC2 instance can send and receive on ports 80, 443, 8182 and 22. I've also added the certificate to my Keychain. I'm not sure what's causing this.

Update: I think this is due to the size of the graph (~3.5k nodes). I tried reducing the Gremlin query size and hitting the same endpoint, and it did return a valid response. Is there any way to work around this?
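One way to reproduce the smaller-query test by hand is to split the vertex fetch into pages with Gremlin's `range()` step. This is a sketch for manual testing only. It does not change the queries Graph Explorer itself sends, and the helper names, page size, and base URL are assumptions for illustration. Note that `range()` without an explicit ordering step is not guaranteed to return stable pages.

```python
from typing import List
from urllib.parse import urlencode


def paged_vertex_queries(total: int, page_size: int) -> List[str]:
    """Split a g.V() fetch into range(start, end) queries covering `total` vertices."""
    return [
        f"g.V().range({start}, {min(start + page_size, total)})"
        for start in range(0, total, page_size)
    ]


def gremlin_url(base_url: str, query: str) -> str:
    """Build a GET URL of the form <base>/gremlin?gremlin=<query>, URL-encoded."""
    return f"{base_url.rstrip('/')}/gremlin?{urlencode({'gremlin': query})}"


if __name__ == "__main__":
    # Placeholder base URL; substitute your own proxy or Neptune endpoint.
    for q in paged_vertex_queries(3500, 1000):
        print(gremlin_url("https://example-endpoint:8182", q))
```

If the smaller pages succeed where the full `g.V()` request is reset, that supports the suspicion that response size is the trigger.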

cubeddu commented 9 months ago

Hi @RonLek, thank you for reporting the issue! To help us triage it better, could you please provide some additional information:

API Response: When you click on the column name on the ?gremlin=g.V() fetch call, you'll see tabs on the right side. Please click the "Response" tab and share any relevant information you find there.
Label Data: How are your labels set up? Do you have any long string labels? If so, you might be experiencing a known issue that's currently being addressed in a PR (see here: Issue, PR).

This additional information will help us diagnose the problem and provide a solution more quickly.

xiazcy commented 7 months ago

I am closing this issue, as it appears to have been a configuration issue initially, and the graph size issue should be fixed in https://github.com/aws/graph-explorer/issues/197 with versions 1.5.0+, as @cubeddu suggested. Please feel free to reopen if the issue persists after trying the new versions.

aws / graph-explorer

[Bug] Unable to Synchronize Graph #219