Closed gothub closed 2 years ago
There's a long history with this regarding the HTTP Expect header. I've mentioned this a few times and @csjx and I have spent a lot of time looking into this. I'll go find those issues and link them here.
https://redmine.dataone.org/issues/7822 https://redmine.dataone.org/issues/2693
curl
R package followed thisI don't consider this an rdataone issue.
@amoeba we saw this error again. I'm wondering if we can file an issue in the curl
library about this? I don't know enough about it to understand but it sounds like it may be a problem on their end rather than ours?
In any case, I'm very sick of this error
We basically need to get rid of client side certificates. Fixing this means changing the way authentication works in DataONE overall. @csjx put together a proposal that is being evaluated now. But its not going to happen fast as it impacts all 50+ member nodes.
That said, if we change the R client to only use JWT tokens and drop support for certificates in R, then we might modify the Arctic Data Center and KNB to no longer use client side certs except for the CN to MN and MN to MN communication. This is something the devteam should discuss, and if it seems like it could work, then we could propose a fix. I think it would involve configuring apache to only verify client certs when the client is in a known list of CNs and MNs. @gothub do you see any reason the R client would need to continue to support certificates, or could we remove that fully?
How to we determine that no R user is creating and using certificates? If no users are using certs, then cert handling could be removed from the R client.
Even if users are using certs, they can change to using tokens without any code changes to their programs. So, I think it would be ok either way.
Thanks for bringing this back up @jeanetteclark. I'd like to see this fixed too. I've talked with the creator of the crul
package about making a change for our use case and he preferred to follow the upstream library (libcurl
) instead (which makes total sense to me).
@mbjones, @gothub I just did a quick test of Matt's idea and it appears like we can make a minor tweak to our Apache configs to get this to work.
I logged into test.arcticdata.io to see what our Apache config looks like,
SSLVerifyClient none
<If " ! ( %{HTTP_USER_AGENT} =~ /(windows|chrome|mozilla|safari|webkit)/i )">
SSLVerifyClient optional
</If>
SSLVerifyDepth 10
which matched my memory of it. I decided to use httr
to test the idea that, if we could set SSLVerifyClient
to none
for the R client that we'd see this error go away. I decided to just spoof the user agent to trigger Apache to not SSLClientVerify
by setting my UA to safari
(see second test).
First test: Confirm failure state Apache config as-is, curl 3.2 Result: Failure (Error 413) Partial code sample:
httr::POST("https://test.arcticdata.io/metacat/d1/mn/v2/object",
body = list(pid = pid,
sysmeta = upload_file(sm_path),
object = upload_file(file_path)),
add_headers(Authorization = paste("Bearer", getOption("dataone_test_token"))))
Second test: Fake UA to confirm workaround with user_agent("safari")
Apache config as-is, curl 3.2
Result: Success (http 200)
Partial code sample:
httr::POST("https://test.arcticdata.io/metacat/d1/mn/v2/object",
body = list(pid = pid,
sysmeta = upload_file(sm_path),
object = upload_file(file_path)),
add_headers(Authorization = paste("Bearer", getOption("dataone_test_token"))),
user_agent("safari"))
A bit more testing / discussing might be appropriate to get a path scoped out but I think this shows promise. The R dataone package already sets a custom UA so we could just add that string to the SSLClientVerify
whitelist we already have:
Does this sound like a doable short-term fix until we change how our DataONE infrastructure handles client certs?
That sounds like a great suggestion to me, as it limits impact to the R client and NCEAS-managed data repositories, and gives us time to work out the larger issues with the DataONE authentication process. So, I'd say let's test exclusion of the R client user agent, and then deploy on all of our managed MNs (KNB, ADC, ...).
Had a great chat on our dev call yesterday. Notes here. TLDR: We can fix this fast.
I sent an email to dev@nceas
:
Hey all, most of us were on the dev call yesterday but Chris made a good point that emailing the dev list with a summary plus action items would be good for visibility. Please take a look through to see if we're missing anything.
What's the R & HTTP 413 problem?
See https://github.com/DataONEorg/rdataone/issues/210 for more detail. A short summary is that createObject/updateObject calls began failing a while ago for users of the R dataone package because of a change in the underlying HTTP library it uses (deprecation of the Expect HTTP header) which was incompatible with our support of client certificates in the Apache configs we put in front of Metacat (to support client certs).
What are we doing now?
We're version-pinning our R users to an old version of the curl package. This is super painful and likely insecure.
What are we going to do right now to fix this for our users?
- Add the string 'httr' to our SSLClientVerify whitelist in our Apache configs on all Metacat installations we manage. The dataone R package sends a UA that includes the strings 'dataone', 'httr', and 'R' (plus version strings). We decided on adding 'httr' to the whitelist so (1) both users of rdataone and httr (a common http lib for R) get the fix and (2) we reduce the potential for breakage in other libraries that use 'dataone' in their UA string (Python, Java)
- Update the Metacat documentation (probably the DataONE section) with this information
- Update our boilerplate Apache config we ship with Metacat
- Coordinate with DataONE on this work
I'll be filing Issues in relevant repos shortly but please look for those. Thanks!
Have you considered basing the SSLClientVerify
setting on the presence of the Authorization
header?
That is, set SSLClientVerify none
if the client passes a JWT.
That seems like a really good idea @rogerdahl. It's a bit more direct and would help the R client and any other non-browser clients that send JWTs w/o the Expect
header. I think we'd still need the existing Apache config portion so browsers, like Safari, don't unnecessarily ask the user for a client cert.
Maybe something like this?
<If "%{HTTP:Authorization} =~ /\ABearer .+">
SSLClientVerify none
</If>
I haven't seen this error in a long time. Did changes to other underlying libraries (ours or others) render this not a problem anymore? @amoeba if you have any idea, let me know, otherwise I'm going to close and will reopen if we see it again
I haven't seen this in a while either and I think underlying libraries must have changed. Our web server configs on sites like the KNB or ADC don't even have the workarounds we put in to deal with this any longer. I'm in favor of closing the issue.
Reported by @maier-m - this occurs intermittently during
createUpdate()
for large or small objects, as small as 303KB:Downgrading curl to 2.7.1 can fix the problem
Determine what the underlying problem with the curl package/library is.