aws / graph-explorer

React-based web application that enables users to visualize both property graph and RDF data and explore connections between data without having to write graph queries.
https://github.com/aws/graph-explorer
Apache License 2.0
300 stars 46 forks

[Bug] 413 Payload Too Large #410

Open guyelia opened 1 month ago

guyelia commented 1 month ago

Description

When syncing an AWS Neptune instance with a fairly small dataset, I get the following error in the developer console: POST https://PROXY_PUB_IP/gremlin 413 (Payload Too Large). The X-Powered-By: Express response header suggests it is coming from the Express proxy server.

The request payload is only 369 KB, and I'm pretty sure Express's default limit is 1 MB.
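For context, the JSON body limit actually comes from body-parser, whose documented default is 100 KB rather than 1 MB, which would explain the 413 here. A minimal stdlib-only sketch of the size check (not graph-explorer code; the helper name is hypothetical):

```javascript
// Sketch: why a ~369 KB request body trips the proxy. body-parser's
// documented default JSON limit is '100kb', so any body larger than
// 102400 bytes is rejected with HTTP 413.
const DEFAULT_LIMIT_BYTES = 100 * 1024; // body-parser default '100kb'

// Hypothetical helper: would a body of this size be rejected?
function exceedsLimit(body, limitBytes = DEFAULT_LIMIT_BYTES) {
  return Buffer.byteLength(body, 'utf8') > limitBytes;
}

// A ~369 KB payload, like the failing /gremlin request in this issue.
const bigQuery = 'g.V()'.padEnd(369 * 1024, 'x');
console.log(exceedsLimit(bigQuery));              // true  -> would get a 413
console.log(exceedsLimit(bigQuery, 1024 * 1024)); // false -> fits a 1 MB limit
```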

Environment

[!IMPORTANT] If you are interested in working on this issue or have submitted a pull request, please leave a comment.

[!TIP] Please use a 👍 reaction to provide a +1/vote.

This helps the community and maintainers prioritize this request.

kmcginnes commented 1 month ago

Thanks for the submission @guyelia.

I'll dig around a bit to see if this is a known issue.

I'd like to get a bit more context from you about the issue.

  • How many node types do you have (i.e. labels)?
  • Do you have node or edge types with a lot of attributes (i.e. 10 or more)?

Also, can you post the Gremlin query that caused the error? You can get that by:

  1. Open the developer console in the web browser
  2. Go to the network tab
  3. Click the "synchronize database" button in the app UI
  4. Wait for the request to fail
  5. Select the failed request
  6. Select the "Payload" tab
  7. Right click on the query and select "copy value"
  8. Paste the value in a code block here on the GitHub issue

IMPORTANT: Don't forget to scrub any private info out of the query before posting it.

[Screenshot: 2024-05-20 at 10:14:33 AM]
guyelia commented 1 month ago

Hey @kmcginnes, thanks for the reply, and sorry for the delay.

I've used the Gremlin console to answer your questions; please let me know if I missed anything.

Regarding the Gremlin query itself, posting it would be problematic because it mostly contains sensitive data, but it is a huge command built from "g.V().project(ALL_NODES).by(V().hasLabel(NODE).limit(1)).by(V().hasLabel(ANOTHER_NODE).limit(1))...
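A query of that shape grows with the number of labels in the schema, which explains the body size. The sketch below is a hypothetical reconstruction from the snippet quoted above, not graph-explorer's actual query builder:

```javascript
// Hypothetical reconstruction of the query shape: one project() key and
// one .by(V().hasLabel(...).limit(1)) step per node label, so the query
// string grows with every label in the schema.
function buildSampleQuery(labels) {
  const keys = labels.map((l) => `'${l}'`).join(', ');
  const bys = labels
    .map((l) => `.by(V().hasLabel('${l}').limit(1))`)
    .join('');
  return `g.V().project(${keys})${bys}`;
}

console.log(buildSampleQuery(['airport', 'country']));
// (output wrapped for readability)
// g.V().project('airport', 'country')
//   .by(V().hasLabel('airport').limit(1))
//   .by(V().hasLabel('country').limit(1))
```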

It looks like a Node.js/Express limitation of 100 KB per request. Any objection to increasing it to something bigger, like 1 MB?

kmcginnes commented 1 month ago

@guyelia Thank you. That is perfect!

If you need a fix quickly, then definitely fork this project and increase the limit.

We are focusing on db query performance now and I'm going to consider the increase as a potential solution. But this is a bandaid fix and will just kick the can down the road to some future user who needs the request size to be even bigger.

I would really love to find a better way to construct the query so that it isn't so large. Or perform some batching for larger databases. So I don't want to lean on the request size increase if I can help it.
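The batching idea could look roughly like this sketch: split the schema's labels into chunks and issue one small query per chunk, so no single request body approaches the proxy's limit. The helper name is generic, not an existing graph-explorer function:

```javascript
// Sketch of the batching idea: split the label list into fixed-size
// chunks and issue one small request per chunk, instead of one giant
// query covering every label. chunk() is a generic helper.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const labels = ['a', 'b', 'c', 'd', 'e'];
const batches = chunk(labels, 2);
console.log(batches.length); // 3 small requests instead of one large one
```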

guyelia commented 1 month ago

Sure thing, thanks! Just FYI, in case anyone else hits the same issue: I built graph-explorer locally with two additional lines in packages/graph-explorer-proxy-server/node-server.js:

```javascript
app.use(bodyParser.json({ limit: '50mb' })); // Increase the payload size limit
app.use(bodyParser.urlencoded({ limit: '50mb', extended: true })); // Increase the payload size limit
```

and problem solved :)