alexcrichton / curl-rust

Rust bindings to libcurl
MIT License
1.02k stars 233 forks source link

Add support for in-memory certificates #384

Closed kulshrax closed 3 years ago

kulshrax commented 3 years ago

This adds support for passing SSL certificates as in-memory blobs using several new options added in libcurl 7.71.0:

Notably, this PR does not add support for CURLOPT_PROXY_ISSUERCERT_BLOB since this crate does not presently support CURLOPT_PROXY_ISSUERCERT (and adding support for that is somewhat orthogonal to this PR).

In order to implement the above, this PR also adds support for the new CURLOPTTYPE_BLOB option type (and the corresponding Curl_setblobopt setter function). Although the C API allows the caller to specify whether the data should be copied or not, the safe interface here always copies the data since there doesn't seem to be a straightforward way to make the lifetimes work out otherwise.

I'm not sure how to add automated tests for this since this crate's test suite currently doesn't seem to be able to test SSL at all. I've been able to verify that this works in small test programs on Linux/OpenSSL using P12 certificates, but have not tested on other platforms using other certificate formats and SSL engines.

alexcrichton commented 3 years ago

Thanks for this! This all looks reasonable to me, and I think it's ok if you've tested locally to not worry too much about adding tests to CI. It does look like there are some CI failures, although they're likely unrelated to this PR. Would you be up for helping to investigate them though?

kulshrax commented 3 years ago

Sure, I'd be happy to look into the CI failures. TBH, I starting looking into them yesterday, but I'm pretty new to working with Github's CI, so I didn't get very far. Looks like curl-sys's build script is failing to find some header files from the curl submodule, which makes me wonder if this is some kind of submodule issue.

In any case, any tips on where to start investigating would be helpful and I can take it from there!

alexcrichton commented 3 years ago

Oh actually the failing line is this one -- https://github.com/alexcrichton/curl-rust/blob/64e8fc2f50bbc4a88c62bf6a876f2ca4c63e4d15/curl-sys/build.rs#L87-L91 -- which I think is happening because the submodule was updated and some files have probably been moved around in the latest version?

kulshrax commented 3 years ago

Updated the commit to fix the submodule version.

sagebind commented 3 years ago

I believe one of the errors is due to a mismatch between the curl_blob definition in curl-sys vs the C one, which systest is trying to validate. C defines the data field as void*, but the Rust definition is *const c_void. I believe it shoud be changed to *mut c_void to be correct.

kulshrax commented 3 years ago

Hm, looks like you're right; I'll change it to *mut c_void. It's a bit unfortunate since this means that the Rust interface would need to take an &mut [u8], even though (as far as I can tell) libcurl simply uses the pointer to read the data. I guess we could avoid the need for an exclusive reference by copying the data twice (once into a local buffer and then again by libcurl), but that doesn't seem great either.

sagebind commented 3 years ago

Since the pointer is being passed across FFI into C, as long as we aren't mutating the reference from the Rust side, and as long as libcurl isn't mutating it, I think it would be safe to cast the *const c_void into a *mut c_void.

sagebind commented 3 years ago

Alternatively, for a less risky approach, we could take an &[u8], copy it into a Vec<u8> as you said, but to avoid an extra copy store the bytes inside our Easy2 struct and pass the CURL_BLOB_NOCOPY flag instead so that we still only do one copy, but on the Rust side, and we just make sure to keep the bytes alive for the duration of the easy handle. But that's a lot more tedious and I'm pretty sure the *const to *mut cast is sound here.

sagebind commented 3 years ago

Actually now that I think about it, that might be the way to support zero-copy certs which might be nice. As you said in the PR description:

Although the C API allows the caller to specify whether the data should be copied or not, the safe interface here always copies the data since there doesn't seem to be a straightforward way to make the lifetimes work out otherwise.

Just let the handle take ownership of the data and allow the user to either pass borrowed bytes (in which case we do a copy), or owned bytes, in which case we don't do any copies. Maybe something like this:

    pub fn proxy_sslcert_blob<B: Into<Vec<u8>>>(&mut self, blob: B) -> Result<(), Error> {
        self.proxy_sslcert_blob = Some(blob.into());
        unsafe {
            self.setopt_blob(curl_sys::CURLOPT_PROXY_SSLCERT_BLOB, self.proxy_sslcert_blob.as_deref().unwrap(), false)
        }
    }

    unsafe fn setopt_blob(&mut self, opt: curl_sys::CURLoption, val: &[u8], copy: bool) -> Result<(), Error> {
        let blob = curl_sys::curl_blob {
            data: val.as_ptr() as *const c_void as *mut c_void,
            len: val.len(),
            flags: if copy {
                curl_sys::CURL_BLOB_COPY
            } else {
                curl_sys::CURL_BLOB_NOCOPY
            },
        };
        let blob_ptr = &blob as *const curl_sys::curl_blob;
        self.cvt(curl_sys::curl_easy_setopt(self.inner.handle, opt, blob_ptr))
    }

Still tedious since it means we have to keep track of every blob given, but it would allow setting certificates without making any copies. If a &[u8] is given, it is copied by the From<&[u8]> for Vec<u8> implementation, but if a Vec<u8> or Box<[u8]> is given (or any other value with a zero-copy From<T> for Vec<u8> implementation) then we just keep the vec, give the pointer to curl, and no copies are made.

I'm assuming here that curl expects the blob to be kept alive for the duration of the easy handle, or until it is replaced by a subsequent curl_easy_setopt call. I didn't see any clear mention of lifetime requirements in the documentation. We should double-check that this is the case if we decide to go with this approach.

kulshrax commented 3 years ago

I've updated my commit to mirror the methods on Easy, and added an example program under examples/ that can be used to verify that this works.

In addition, I've added a second commit that implements zero-copy certificates using the approach described earlier. I separated this change since it introduces quite a bit of unsafety, so it probably needs some closer scrutiny on its own.

alexcrichton commented 3 years ago

Looks good! I'm not sure if there's much use storing the blobs internally though. The purpose of zero-copy I would imagine is that the curl bindings here wouldn't allocate them at all (as currently happens with to_vec). That would require some lifetime shenanigans though to ensure that the slice of bytes outlives the easy handle itself.

I think it's probably fine for now, since these are just ssl certs, to copy the bytes into curl and if it's ever an issue we could add more apis with more lifetimes that elides the copy.

kulshrax commented 3 years ago

I have no strong opinion either way wrt to zero-copy, so I'm fine with just merging my first commit. @sagebind, do you have any thoughts here?

sagebind commented 3 years ago

I think it's probably fine for now, since these are just ssl certs, to copy the bytes into curl and if it's ever an issue we could add more apis with more lifetimes that elides the copy.

This seems like a very reasonable approach to me. 👍

kulshrax commented 3 years ago

OK, in that case I've dropped my second commit from this branch, leaving just my initial approach.