DataONEorg / dataone

DataONE information and general-purpose issue tracking
Apache License 2.0

Repeated encoded slashes not being handled on search.dataone.org web server config #11

Closed amoeba closed 3 years ago

amoeba commented 3 years ago

This URL 404s with a System Metadata not found error: https://search.dataone.org/cn/v2/views/metacatui/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fedi%2F248%2F2.

If you change the host portion to cn.dataone.org (https://cn.dataone.org/cn/v2/views/metacatui/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fedi%2F252%2F3), you get the correct response.

Looking at the error message, you can see that the PID it reports doesn't match what's in the URL: it's missing the second slash after "https:", so it reads https:/pasta.lternet.edu/package/metadata/eml/edi/252/3 when it should be https://pasta.lternet.edu/package/metadata/eml/edi/252/3. Since the same request works when the host is cn.dataone.org, this probably isn't a Metacat bug but a web server configuration issue specific to search.dataone.org.
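
A rough sketch of how the mangled PID could arise, assuming the proxy decodes the %2F sequences and then merges repeated slashes before forwarding the request. This is not the web server's actual code; `collapse_slashes` is a hypothetical helper that only reproduces the symptom:

```python
import re
from urllib.parse import quote, unquote

# The PID as it should reach the backend.
pid = "https://pasta.lternet.edu/package/metadata/eml/edi/252/3"

# Percent-encode everything, including ":" and "/", as in the request URL.
encoded = quote(pid, safe="")


def collapse_slashes(path: str) -> str:
    """Hypothetical stand-in for path canonicalisation: merge runs of '/'."""
    return re.sub(r"/{2,}", "/", path)


# Assumed failure mode: decode first, then canonicalise the decoded path.
mangled = collapse_slashes(unquote(encoded))

print(encoded)
print(mangled)
```

Under that assumption, `mangled` comes out with a single slash after "https:", which matches the PID in the error message.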

I don't think I have any privs to look at that host. @datadavev or @taojing2002, could either of you take a look? This issue appears to affect all objects with similar identifiers and results in the fallback, index-powered metadata display, which is less than desirable.

datadavev commented 3 years ago

The nocanon option needs to be added to the ProxyPassMatch config. For example, this works in stage:

        ProxyPassMatch "^/cn/v2/(.*)" "https://cn-stage.test.dataone.org/cn/v2/$1" nocanon
        ProxyPassReverse "/cn/v2" "https://cn-stage.test.dataone.org/cn/v2/"
        ProxyPassMatch "^/cn/v1/(.*)" "https://cn-stage.test.dataone.org/cn/v1/$1" nocanon
        ProxyPassReverse "/cn/v1" "https://cn-stage.test.dataone.org/cn/v1/"

Test case: https://search-stage.test.dataone.org/cn/v2/views/metacatui/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-arc%2F20036%2F8
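
For checking other identifiers the same way, a minimal sketch that builds a test URL of this shape from a host and a PID. `build_view_url` is a hypothetical helper, not part of any DataONE tooling, and the path prefix is simply copied from the test case above:

```python
from urllib.parse import quote


def build_view_url(host: str, pid: str) -> str:
    """Build a /cn/v2/views/metacatui/ URL with the PID fully percent-encoded."""
    # safe="" forces "/" and ":" to be encoded as %2F and %3A.
    return f"https://{host}/cn/v2/views/metacatui/{quote(pid, safe='')}"


url = build_view_url(
    "search-stage.test.dataone.org",
    "https://pasta.lternet.edu/package/metadata/eml/knb-lter-arc/20036/8",
)
print(url)
```

Requesting the resulting URL on both the search host and the CN host, and comparing the responses, should show whether the proxy is still rewriting the encoded slashes.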

datadavev commented 3 years ago

Adding the nocanon option resolved the issue in production.