envoyproxy / envoy

Cloud-native high-performance edge/middle/service proxy
https://www.envoyproxy.io
Apache License 2.0
24.91k stars 4.79k forks source link

Remove curl as an Envoy dependency #11816

Open htuch opened 4 years ago

htuch commented 4 years ago

Currently we depend on libcurl for URL fetching in AWS extension common utils https://github.com/envoyproxy/envoy/blob/89d6c6c6fa202aa7a01e85d2d887b56c8a3268ad/source/extensions/common/aws/utility.cc#L97 and I think OpenCensus (@g-easy can you confirm?)

The use of curl is largely redundant, since Envoy itself can do HTTP fetch. In addition, Curl does not have a compatible threading and observability model with Envoy. The recent disclosures of CVE-2020-8169 and CVE-2020-8177 provide an example of why we should eliminate this from our trusted compute base.

Opening this tracking ticket to discuss further whether we can remove this dep.

mattklein123 commented 4 years ago

+1 for removal. I was against adding in the first place.

moderation commented 4 years ago

+1 for removal

g-easy commented 4 years ago

Yes, OpenCensus uses curl in its zipkin exporter (to push traces to zipkin).

kyessenov commented 4 years ago

Having an indirect networking API instead of curl in libraries would also help compiling them to Wasm binaries for sandboxing. The API would be implemented via Wasm ABI.

htuch commented 4 years ago

@g-easy would it be possible to modify the Zipkin exporter to work with some more generic HTTP interface that Envoy can supply?

moderation commented 4 years ago

As per https://github.com/envoyproxy/envoy/issues/9958 at some point OpenCensus should be replaced by the new OpenTracing + OpenCensus project called OpenTelemetry. One would hope that OpenTelemetry doesn't require curl

dio commented 4 years ago

For opencensus, since the "issue" is with the Zipkin span exporter implementation (opencensus_exporter_zipkin). I think we could own Envoy-specific implementation of that in our tree (I can try to own it if it is desired)? At minimum:

class ZipkinSpanExporterHandler : public ::opencensus::trace::exporter::SpanExporter::Handler,
                                  public Http::AsyncClient::Callbacks {
  ...
};

Which basically "retrieves" the spans stored by opencensus-cpp worker thread implementation and send that to Envoy's thread which has access to a dispatcher (I think server's dispatcher is OK to use here). As a "bonus", we also need to serialize the opencensus::trace::exporter::SpanData to Zipkin's v2 via ProtobufWkt::Struct (instead of rapidjson employed by the current opencensus-cpp Zipkin span exporter implementation).

However, this will "force" the configuration to require a defined collector cluster to let Envoy's async-client to work (I think this is also relevant to the approach for the "aws" utility). Even the dynamic forward proxy requires a cluster definition. I wonder if we can/are allowed to somehow construct a cluster without defining in the config, but build it on the fly when it is required (https://github.com/envoyproxy/envoy/pull/3479 cc. @ramaraochavali).

@moderation, when we are needed to "upgrade" to OpenTelemetry, if the approach is the same (i.e. data is gathered in worker thread managed by the "tracer" library) and we want to support Zipkin exporter, we still need to provide an Envoy's async-client-friendly implementation for that.

g-easy commented 4 years ago

One would hope that OpenTelemetry doesn't require curl

This is tracked in https://github.com/open-telemetry/opentelemetry-cpp/issues/6

g-easy commented 4 years ago

https://github.com/envoyproxy/envoy/issues/11816#issuecomment-653838495 sounds pretty great to me.

Note that opencensus's existing background thread calls into Handler::Export() periodically.

dio commented 4 years ago

@mattklein123 what is the best way of resolving URI for async client calls without explicitly defining a cluster for it? Is https://github.com/envoyproxy/envoy/pull/3479 the way to do it? Need your advice on this. Thank you!

mattklein123 commented 4 years ago

@dio yeah we would probably need to add new functionality to AsyncClient to optionally expose the dynamic forward proxy functionality.

wrowe commented 4 years ago

PR13063 is another example, which picks up the fix of

CVE-2020-8231: libcurl: wrong connect-only connection

As Envoy does NOT use the affected CURLOPT_CONNECT_ONLY toggle, this change simply satisfies overly-simplistic audits.

htuch commented 4 years ago

As part of my EnvoyCon talk, I looked at all CVEs for our dependencies since 2018. Curl is by far the worst here, with 14 CVEs and 23 releases since 2018. Further evidence that we really want to make some progress here.

htuch commented 3 years ago

Bumping priority given the high CVE rate (almost always false positives as well).

moderation commented 3 years ago

OpenTelemetry went 1.0.0 is Feb - https://medium.com/opentelemetry/opentelemetry-specification-v1-0-0-tracing-edition-72dd08936978. Looks the Zipkin exporter still needs Curl

I was thinking we could deprecate OpenCensus in favor of OT but if Curl is still present it doesn't look like it will help us. Hopefully the OT experts can provide some more context

tonya11en commented 3 years ago

You can assign this to me. It looks like it's mostly the AWS filters.

tonya11en commented 3 years ago

I was looking at this over the weekend and it's not as trivial as I first thought to rip this out of the AWS filters. To use the HTTP async client, we need a target cluster instead of just an IP like we have now with libcurl. This means that the cluster configs for users of the filters are going to need to include a target cluster for the metadata fetches similar to what the jwt authn filter does with the example_jwks_cluster.

I can do it, but it'll change how the AWS filters get configured and cause failures with the old configs. @abaptiste do you (or other AppMesh folks) have an opinion here?

htuch commented 3 years ago

Is there a way to use dynamic forward clusters to avoid having to change the API surface?

tonya11en commented 3 years ago

Is there a way to use dynamic forward clusters to avoid having to change the API surface?

It looks like we're still going to need to have additional configuration, but it's for the DNS cache to use and dynamic forward proxy cluster config.

The frustrating part here is that these filters are reading full URIs out of env variables and feeding them into curl. Otherwise, we'd have been able to just update the cluster manager with static cluster configs and just use the HTTP async client like everywhere else.

It's not clear to me why these filters are relying on environment variables instead of the Envoy config.

wrowe commented 3 years ago

Its because http[s]_proxy vars are a very well accepted convention, and it makes the stock envoy config more mutable. That doesn't make it ideal, but you raised the question why, and I think we can and should revisit the earlier design decisions

zirain commented 2 years ago

@tonya11en are you still working on this?

tonya11en commented 2 years ago

@zirain no, this turned out to be more work than I was expecting to do in my free time :(. Feel free to pick it up and I'll happily help review.

alyssawilk commented 2 years ago

@lavignes as the other AWS filter owner, would you be able to take a look? If not we may need to bump the aws filter to contrib/

suniltheta commented 2 years ago

If we decide to move it to contrib due to not able to provide fix at the right time, I can take up the action item to move it to contrib.

alyssawilk commented 2 years ago

Awesome, thank you!

suniltheta commented 2 years ago

We are planning to move ahead with removing libcurl dependency from AWS extensions. I have put a small doc to indicate the approach and consequences.

https://docs.google.com/document/d/1MxREfLX4BONhOCNr7dvHwdr1_kArLb3OOSvhb-b02oQ

I look forward to all of your opinions on this. I am mainly seeing your thoughts on

  1. Whether (Option #2, don’t need explicitly defined cluster config) mentioned in the doc is a workable solution.
  2. Put the existing config/code path behind a feature flag ( marked as deprecated to be cleaned in future ) so that users of these extensions transitioning from earlier versions are not forced to instantly switch to the new configuration model.
daixiang0 commented 2 years ago

Is it still in process?

suniltheta commented 2 years ago

Yes, working on this. At testing stage now. I will start with couple of PRs in next few weeks. Sorry for the delay.

mr-miles commented 2 years ago

Sorry to bug - do you think it will land? I've ended up here via detective work trying to find out why IRSA wasn't working as I expected! (... and I've learnt a lot about aws in the process)

suniltheta commented 2 years ago

I am still working on the changes. I haven't got enough time to dedicate to this task.

In the current state I am able to make http calls to fetch the credentials but don't know how it can be fetched in main thread and updated on all the worker thread. https://github.com/envoyproxy/envoy/compare/main...suniltheta:envoy:cm_ref_2

Also, the change is no where close to the version where I would be able to raise the PR.


If you want to make IRSA work, then I can point you towards a patch https://github.com/envoyproxy/envoy/pull/23408. But unfortunate that still uses curl.

mr-miles commented 2 years ago

Thank you so much for the quick response. I would offer to help but this is miles away from my knowledge area...

I'm sure you thought of this, but just for my own knwoledge - since the lambda filter is still beta and already uses curl, is there a downside of incorporating the patch in the interim, while libcurl is being expunged? The patch didn't seem to be increasing the problem, but my reading of the patch is pretty basic

suniltheta commented 1 year ago

Made some progress. Still working on improving the unit testing. I will be raise the PRs shortly after finalizing the PR breakup.

suniltheta commented 1 year ago

/assign

mr-miles commented 1 year ago

thanks for the update! if you're able to publish an image somewhere i can get at from a helm chart then can give it a spin

ceastman-r7 commented 1 year ago

@suniltheta Any update on this?

suniltheta commented 1 year ago

Currently in progress in https://github.com/envoyproxy/envoy/pull/29880.

@ceastman-r7 are you using any particular filter?

ceastman-r7 commented 12 months ago

This was more of a concern of vulnerability scans flagging curl. for example:

CVE-2023-38545 curl:7.81.0-1ubuntu1.13 HIGH SOCKS5 heap buffer overflow

CVE-2023-38545 curl:7.81.0-1ubuntu1.13 HIGH SOCKS5 heap buffer overflow

which I was removing curl from the istio-pilot operating system image but another istio developer questioned if that mattered since curl was built into envoy.

phlax commented 12 months ago

... questioned if that mattered since curl was built into envoy.

it doesnt matter to envoy - as said its compiled in

not sure where curl is coming from - we dont seem to have it any of our images

my advice would be to use the distroless base so you dont have to worry about extraneous deps, not sure how istio pilot is built - for some reason i thought they did use a distroless base