NASA-PDS / registry-api

Web API service for the PDS Registry, providing the implementation of the PDS Search API (https://github.com/nasa-pds/pds-api) for the PDS Registry.
https://nasa-pds.github.io/pds-api
Apache License 2.0

Keyword fields do not support `like` queries #351

Open jordanpadams opened 1 year ago

jordanpadams commented 1 year ago

Checked for duplicates

Yes - I've already checked

🐛 Describe the bug

When I run a query of the form ((x like "foo*") and (y eq "bar")), the request fails with an HTTP 403 error.

🕵️ Expected behavior

I expected the query to succeed and return matching products.

📜 To Reproduce

This doesn't work:

$ curl --get 'https://pds.nasa.gov/api/search-en/1/products' \
  --data-urlencode 'limit=100' \
  --data-urlencode 'q=((lid like "urn:nasa:pds:voyager2.pws.wf*") and (product_class eq "Product_Observational"))' \
  --data-urlencode 'start=0' \
  -H 'accept: application/json'

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>403 ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Bad request.
We can't connect to the server for this app or website at this time. There might be too much traffic or a configuration error. Try again later, or contact the app or website owner.
<BR clear="all">
If you provide content to customers through CloudFront, you can find steps to troubleshoot and help prevent this error by reviewing the CloudFront documentation.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: TnGBtCn-K4GE39L6_pWPUDs286AiH6T3m8hSdJ7oQ2nAamcU_EbtRQ==
</PRE>
<ADDRESS>
</ADDRESS>

This does:

$ curl --get 'https://pds.nasa.gov/api/search-en/1/products' \
  --data-urlencode 'limit=100' \
  --data-urlencode 'q=((lid eq "urn:nasa:pds:voyager2.pws.wf") and (product_class eq "Product_Bundle"))' \
  --data-urlencode 'start=0' \
  -H 'accept: application/json'

🖥 Environment Info

Registry API 1.2

📚 Version of Software Used

No response

🩺 Test Data / Additional context

No response

🦄 Related requirements

No response

⚙️ Engineering Details

No response

al-niessner commented 10 months ago

@jordanpadams @tloubrieu-jpl

What shows up in the logs? 403 is not one of the registry-api errors; registry-api throws 400, 404, 406, and 500. It probably means the 403 is coming from Spring, but we need the log message to confirm. I could run the query and look at the log if I had access, but I think getting the logs from Amazon after a run requires special privileges that I do not have. Please run the query and capture the full log from the start of the query to the end so that we can see what is generating the 403.

al-niessner commented 10 months ago

@jordanpadams @tloubrieu-jpl

I wonder if OpenSearch is returning the 403. That would be weird, but we need the log files from AWS to tell.

alexdunnjpl commented 10 months ago

Just in case it has slipped notice, that's a weird error: the status is HTTP 403 (Forbidden), but the text states "Bad request".

al-niessner commented 10 months ago

@alexdunnjpl

The other really odd bit is that it is generated by CloudFront (did we add that to our errors to make it more confusing?) and that we asked for JSON, which is the format our own errors would be returned in. We catch nearly all exceptions and such in the controller transmutter, unless that changed, and looking there we cannot generate the 403. It looks like Amazon accepts eq in a URL but not like. Ah, such fun.

alexdunnjpl commented 10 months ago

@al-niessner I don't recall any changes to exception handling and am 95% sure of that recollection.

I'm more suspicious of the * than the eq/like difference, but that's just a gut hunch.

Let me test something real quick.

alexdunnjpl commented 10 months ago

I take that back. The query q=((lid like "urn:nasa:pds:voyager2.pws.wf") and (product_class eq "Product_Bundle")) (the working query, with like substituted for eq) fails, and the asterisk does not cause the failure.
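For anyone who wants to repeat this comparison systematically, here is a minimal Python sketch that generates the four operator/wildcard combinations and the percent-encoded URL for each. The endpoint, lid, and product class are taken from the commands in this thread; the function names (`build_variants`, `request_url`, `status_of`) are hypothetical helpers, not part of registry-api. The encoding uses `%20` for spaces and may not match curl's `--data-urlencode` output byte for byte.

```python
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Endpoint copied from the original report in this issue.
BASE_URL = "https://pds.nasa.gov/api/search-en/1/products"


def build_variants(lid, product_class):
    """Return {label: q} for every combination of eq/like and with/without '*'."""
    variants = {}
    for op in ("eq", "like"):
        for suffix in ("", "*"):
            label = op + ("+wildcard" if suffix else "")
            variants[label] = (
                f'((lid {op} "{lid}{suffix}") and '
                f'(product_class eq "{product_class}"))'
            )
    return variants


def request_url(q, limit=100, start=0):
    """Build the GET URL with percent-encoded query parameters."""
    params = {"limit": limit, "q": q, "start": start}
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


def status_of(url):
    """Return the HTTP status code for a URL (performs a network request)."""
    req = Request(url, headers={"accept": "application/json"})
    try:
        with urlopen(req, timeout=30) as resp:
            return resp.status
    except HTTPError as err:
        return err.code


variants = build_variants("urn:nasa:pds:voyager2.pws.wf", "Product_Bundle")
for label, q in variants.items():
    print(label, request_url(q))
```

Calling `status_of(request_url(q))` for each variant would show which combinations come back as 403; the loop above only prints the URLs so that nothing is sent until you choose to.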

al-niessner commented 9 months ago

@alexdunnjpl @jordanpadams @tloubrieu-jpl

Again, this is not a problem with registry-api; the failure occurs before the request reaches registry-api. We need to look at the logs while it is failing or understand the message better (ask a CloudFront expert). I noticed from another ticket that this is using search-en, so I changed it to search but got the same error. When I run locally:

$ curl --get 'http://localhost:8080/products'   --data-urlencode 'limit=100'   --data-urlencode 'q=((lid like "urn:nasa:pds:voyager2.pws.wf*") and (product_class eq "Product_Observational"))'   --data-urlencode 'start=0'   -H 'accept: application/json'
{"summary":{"q":"((lid like \"urn:nasa:pds:voyager2.pws.wf*\") and (product_class eq \"Product_Observational\"))","hits":0,"took":29,"search_after":[],"limit":100,"sort":[],"properties":[]},"data":[]}

which means the failure is not in the registry-api code.

tloubrieu-jpl commented 8 months ago

The idea is to index the fields as text and keyword.
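As a rough illustration of what indexing a field both ways could look like, the sketch below builds an OpenSearch multi-field mapping in Python. It assumes the existing fields stay `keyword` (so exact `eq` matches keep working) and gain a `text` sub-field; the sub-field name `text`, the helper `with_text_subfield`, and the choice of fields are assumptions for illustration, not the actual registry schema.

```python
import json


def with_text_subfield(field_names, subfield="text"):
    """Build a mapping body where each keyword field also has a text sub-field.

    The sub-field name is hypothetical; the real registry schema may differ.
    """
    properties = {
        name: {
            "type": "keyword",
            "fields": {subfield: {"type": "text"}},
        }
        for name in field_names
    }
    return {"properties": properties}


# Field names taken from the queries in this issue.
mapping = with_text_subfield(["lid", "product_class"])
print(json.dumps(mapping, indent=2))
```

A body of this shape is what the OpenSearch put-mapping API (`PUT /<index>/_mapping`) accepts. Documents indexed before the change would likely need to be reindexed before the new sub-field is searchable.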