Open jordanpadams opened 2 months ago
Here are the 100 requests which came to the API when the instability was noticed: 7a541e10-8113-44f6-9bd1-f58951c81317.csv
The first errors appeared after the limit parameter was set to 10000. That value can work, and did work in later tests, but it is well above what we would expect (a few hundred).
The issue might also be related to simultaneous activities on the registry (e.g. sweepers).
Similar issue identified by @anilnatha attempting the following query:
https://pds.nasa.gov/api/search/1/products?q=(product_class eq "Product_Context" and lid like "urn:nasa:pds:context:instrument_host:*")&limit=9999&fields=lid,vid,pds:Instrument_Host.pds:description,pds:Instrument_Host.pds:name,pds:Instrument_Host.pds:type
@tloubrieu-jpl @alexdunnjpl for this issue, if we know the page size is too large, is there any way we can improve the error messaging that comes with these errors? Or is this a server timeout thing?
@jordanpadams server timeout if it's what I think it is - basically the request is valid and reasonable prima facie, but then the volume (size, not count) of the data ends up taking too long to serve so the server calls it quits.
This could be confirmed by repeating the queries with the requested fields limited to lidvid, so that no large data volume is served. That assumes the issues aren't so sporadic that you're unable to convince yourself that it's working without issue after some reasonable period of testing.
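For anyone wanting to run that comparison, here is a minimal sketch that builds the original query and a lidvid-only variant of it. The base URL, q, limit and fields values are taken from the query quoted above; the helper name build_query_url and the choice of percent-encoding are assumptions for illustration, not part of the API.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the query reported earlier in this thread.
BASE_URL = "https://pds.nasa.gov/api/search/1/products"

# Query string from the reported request.
QUERY = (
    '(product_class eq "Product_Context" and '
    'lid like "urn:nasa:pds:context:instrument_host:*")'
)


def build_query_url(q, limit, fields):
    """Hypothetical helper: assemble a percent-encoded request URL."""
    params = {"q": q, "limit": limit, "fields": ",".join(fields)}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


# The request as originally reported (large fields such as descriptions).
full_url = build_query_url(
    QUERY,
    9999,
    [
        "lid",
        "vid",
        "pds:Instrument_Host.pds:description",
        "pds:Instrument_Host.pds:name",
        "pds:Instrument_Host.pds:type",
    ],
)

# Same query and limit, but only lidvid is returned, so the payload stays small.
slim_url = build_query_url(QUERY, 9999, ["lidvid"])

print(full_url)
print(slim_url)
```

If the slim request succeeds consistently while the full one still returns sporadic 504s, that would point at response volume rather than result count.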
@alexdunnjpl copy. Is there anything smart we can do on the API side to keep the connection alive with the server?
@jordanpadams I wouldn't think so (without a box full of bandaids or increasing timeout thresholds) - from my perspective it's on the client to respond appropriately to a 504 (by, for example, retrying with a smaller request)
I haven't dug into what's going on here, though - I'm making some assumptions.
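As a rough illustration of what "retrying with a smaller request" could look like for a scripted client, here is a sketch that halves the limit each time a 504 comes back. The function names, the halving strategy and the 60-second timeout are all assumptions made for the example; nothing here is prescribed by the API.

```python
import urllib.error
import urllib.request


def fetch_url(url):
    """Default fetcher: plain GET, returns the response body as bytes."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return resp.read()


def fetch_with_backoff(make_url, limit, fetch=fetch_url, min_limit=1):
    """Hypothetical helper: retry on HTTP 504 with a progressively smaller limit.

    make_url is a callable that takes a limit and returns the request URL.
    Returns (limit_that_worked, response_body). Any error other than a 504,
    or a 504 at min_limit, is re-raised to the caller.
    """
    while True:
        try:
            return limit, fetch(make_url(limit))
        except urllib.error.HTTPError as err:
            if err.code != 504 or limit <= min_limit:
                raise
            # Halve the page size and try again.
            limit = max(min_limit, limit // 2)
```

A caller would pass something like `lambda n: f"{base}?q=...&limit={n}"` as make_url, then page through the remaining results using the limit that succeeded. This does not help browser or curl users, who would have to lower the limit by hand.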
Thanks @alexdunnjpl, we will dig a bit further in the future to see what we can do. The problem is that with an internet browser or curl as the client, there is no way to respond to a 504 appropriately. We may just need to update the documentation to explicitly call out these errors.
Checked for duplicates
Yes - I've already checked
🐛 Describe the bug
When I tried TBD queries, I was getting sporadic 504 errors.
🕵️ Expected behavior
I expected the API to work
📜 To Reproduce
TBD queries
See CloudFront and/or Registry logs.
🖥 Environment Info
latest deployed
📚 Version of Software Used
No response
🩺 Test Data / Additional context
No response
🦄 Related requirements
No response
⚙️ Engineering Details
No response