dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License

I need to make the Orleans public, but I was intercepted by the Orleans Gateway. #5709

Closed lfzm closed 4 years ago

lfzm commented 5 years ago

I need to expose Orleans publicly, but the connection is rejected by the Orleans gateway. I am deploying with Kubernetes.

ERROR:

Gateway received unexpected non-proxied connection from *sgn/01111111-1111-1111-1111-111111111111/11111111 at source address 172.18.76.51:36931

silo configuration:

[image: screenshot of the silo configuration]

client config:

s.UseStaticClustering(new IPEndPoint(IPAddress.Parse("172.18.76.56"), 30000));

Is this caused by my legacy configuration, or is it some other issue?

lfzm commented 5 years ago

silo configuration:

  .ConfigureEndpoints(IPAddress.Parse("172.18.76.56"),11111, 30000,listenOnAnyHostAddress: true)

With this configuration the client can make requests, but the listenOnAnyHostAddress setting seems to have no effect.

ReubenBond commented 5 years ago

You should never expose Orleans ports directly to the Internet.

Is your connection failing in the "Development" mode or the production mode? UseLocalhostClustering will cause EndpointOptions & ClusterOptions to be overwritten.

Otherwise, your configuration looks correct to me. You could inspect the EndpointOptions configuration using a debugger, or print it out:

.Configure<EndpointOptions>(o =>
{
  Console.WriteLine(o.AdvertisedIPAddress);
  Console.WriteLine(o.SiloPort);
  Console.WriteLine(o.SiloListeningEndpoint);

  Console.WriteLine(o.GatewayPort);
  Console.WriteLine(o.GatewayListeningEndpoint);
})
.Build(); // Do it just before `Build()` so it's the last operation.

Hope that helps to diagnose it

lfzm commented 5 years ago

@ReubenBond

> You should never expose Orleans ports directly to the Internet.

Our test environment is deployed in K8S, and we expose the Orleans ports directly to the Internet for testing and development.

> Is your connection failing in the "Development" mode or the production mode? UseLocalhostClustering will cause EndpointOptions & ClusterOptions to be overwritten.

Development mode is only used for local testing; this error occurs in the production environment.

Printed EndpointOptions configuration: [image]

ReubenBond commented 5 years ago

The endpoint options you're printing out look correct to me - is that not working? Is the AdvertisedIPAddress correct for your environment?

lfzm commented 5 years ago

172.20.1.236 is the K8S Pod IP, not the IP the client connects to. I think AdvertisedIPAddress should be 0.0.0.0.

The K8S Service acts as the agent (proxy) for the Orleans silo Pod.

ReubenBond commented 5 years ago

@galvesribeiro do you have any ideas about what might be wrong here?

Are we able to dump the membership data from k8s via kubectl?

lfzm commented 5 years ago

I store the membership data through a K8S CRD. Client requests from inside the K8S cluster work without problems. Requests from clients outside the K8S cluster, which go through the K8S Service acting as the agent for Orleans, fail.

lfzm commented 5 years ago

I analyzed the source code and the problem occurs at this condition, but I don't understand the underlying logic: Orleans.Runtime/Messaging/IncomingMessageAcceptor.cs#L139

ReubenBond commented 5 years ago

> I analyzed the source code and the problem occurs at this condition, but I don't understand the underlying logic: Orleans.Runtime/Messaging/IncomingMessageAcceptor.cs#L139

The silo port (11111 in your case) only accepts connections from other silos. The gateway port (30000 in your case) only accepts connections from clients.

The code you're pointing to is part of what enforces those two conditions.
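To make the port roles concrete, here is a configuration fragment using the values already posted in this thread. It is only a sketch, assuming Orleans 3.x with `SiloHostBuilder` and `ClientBuilder`; the variable names are illustrative and clustering/membership setup is omitted:

```csharp
using System.Net;
using Orleans;
using Orleans.Hosting;

// Silo side: 11111 is the silo-to-silo port, 30000 is the client gateway port.
var siloBuilder = new SiloHostBuilder()
    .ConfigureEndpoints(
        IPAddress.Parse("172.18.76.56"), // advertised IP (value from the thread)
        11111,                           // silo port: other silos connect here
        30000,                           // gateway port: clients connect here
        listenOnAnyHostAddress: true);

// Client side: the client must target the gateway port, never the silo port.
var clientBuilder = new ClientBuilder()
    .UseStaticClustering(new IPEndPoint(IPAddress.Parse("172.18.76.56"), 30000));
```

A connection that arrives on the gateway port but identifies itself as a silo (or the reverse) is rejected, which matches the "unexpected non-proxied connection" error reported above.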

ReubenBond commented 5 years ago

For some reason, you are seeing a silo trying to connect to the gateway port (30000)

galvesribeiro commented 5 years ago

I don’t understand what the real use case or the problem is here, but I can explain how the K8s membership works so you can query it with kubectl...

So, the CRDs are just definitions, aka schemas.

The CRD-based objects are namespaced, meaning that when querying from kubectl you need to pass --namespace myNamespace.

Inside the namespace, you have the groups that hold the real API object instances based on the CRDs. The default is orleans.dot.net.

So, try to get a CRD object using both the namespace and the group you are using, and you should see the records. Sorry, I’m on mobile right now and unable to run it to get a sample command for you...

Let me know if you need more help...

ReubenBond commented 5 years ago

Thank you @galvesribeiro :)

@lfzm Could you describe this "Agent" in more detail? Is it an Orleans client or a silo? Is it on a different network to the other client? It looks as though you are trying to put a Load Balancer in front of the gateway port? How is that configured?

I want to emphasize again: you should not expose an Orleans silo to the internet - you should use a VPN or similar.

lfzm commented 5 years ago

Sorry, maybe my description was not clear. I drew a picture; I hope it expresses the setup clearly.

[image: Untitled form (2)]

galvesribeiro commented 5 years ago

hummm

Ok... It looks to me that this is related to which IP Orleans is internally listening vs the one which is publicly exposed.

@ReubenBond I remember we talked about that a while ago and I had the same issue.

In @lfzm's case, he is talking about the internet (which I agree shouldn't be opened to clients for now), but the same problem would happen if a client outside the Kubernetes network (for example, a VM that is not a Kubernetes host) tried to connect to Orleans...

The problem here looks to me like it is not actually related to Kubernetes itself (if it is the same case I had before) but rather to any scenario where you have 2 different vNets and NAT is happening behind the scenes.

lfzm commented 5 years ago

@ReubenBond You can think of the Agent as a NAT or Load Balancer. Exposing Orleans ports to the Internet is not recommended, but it cannot be avoided.

@galvesribeiro mark

ReubenBond commented 5 years ago

Try setting GatewayListeningEndpoint & SiloListeningEndpoint with a specific IP+Port as well as setting the AdvertisedIPAddress, SiloPort, & GatewayPort
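A minimal sketch of that suggestion, assuming Orleans 3.x and the `EndpointOptions` properties printed earlier in this thread. The addresses are placeholders taken from earlier comments (172.20.1.236 as the Pod IP, 172.18.76.56 as the address clients use); which address is actually reachable from the client depends on the NAT or load balancer in front of the silo:

```csharp
using System.Net;
using Orleans.Configuration;
using Orleans.Hosting;

var siloBuilder = new SiloHostBuilder()
    .Configure<EndpointOptions>(options =>
    {
        // What the silo advertises to the cluster and to clients (placeholder values).
        options.AdvertisedIPAddress = IPAddress.Parse("172.18.76.56");
        options.SiloPort = 11111;
        options.GatewayPort = 30000;

        // What the silo actually binds to locally, here the Pod IP (placeholder).
        options.SiloListeningEndpoint =
            new IPEndPoint(IPAddress.Parse("172.20.1.236"), 11111);
        options.GatewayListeningEndpoint =
            new IPEndPoint(IPAddress.Parse("172.20.1.236"), 30000);
    });
```

Separating the advertised address from the listening endpoints is what lets a silo sit behind NAT: it binds to an address that exists inside the Pod while advertising one that callers can reach.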

lfzm commented 5 years ago

@ReubenBond Setting AdvertisedIPAddress to 45.56.89.5 can solve the problem, but I think the Orleans gateway port exposure should be controlled by the firewall.

galvesribeiro commented 5 years ago

@lfzm yeah, that is something I've been arguing about with the team for ages, but no, we shouldn't open the silo ports to the internet with the current design/implementation.

ReubenBond commented 5 years ago

@galvesribeiro @lfzm what should be changed?

lfzm commented 5 years ago

@galvesribeiro Thank you for your help. This solves the problem temporarily, and I look forward to the Orleans team making changes.

galvesribeiro commented 5 years ago

And in the case you are using Kubernetes, I would create the frontend "inside" kubernetes boundaries and then expose it as HTTP, WebSocket or whatever you want...

galvesribeiro commented 5 years ago

> @galvesribeiro @lfzm what should be changed?

@ReubenBond that's a thread for another issue (I'm sure there have been others), but to summarize:

  1. Open the client protocol as a specification, as that would allow people to implement it on multiple stacks/technologies/frameworks/languages and let them deal with the security.
  2. Remove the hard dependency on the internal IP address of the silo and allow NAT'ed connections to hit the silo (which implies 1).

lfzm commented 5 years ago

@galvesribeiro I did this now, but I need to expose the Orleans port when developing the Http frontend.

galvesribeiro commented 5 years ago

> @galvesribeiro I did this now, but I need to expose the Orleans port when developing the Http frontend.

Not necessarily... You can deploy and debug from a service/pod running inside the cluster. That would allow you to run the development client on the same network as the PODs running the cluster.

lfzm commented 5 years ago

@ReubenBond @galvesribeiro's description is what I wanted to express.

lfzm commented 5 years ago

@galvesribeiro This is a good proposal, but it would be better if it could be debugged locally.

galvesribeiro commented 5 years ago

> @galvesribeiro This is a good proposal, but it would be better if it could be debugged locally.

You can still run and debug a Kubernetes application entirely on your machine. Just install Docker for Windows/Mac/Linux and you can replicate the whole environment on your machine.

If you are in Azure, you can use Azure Dev Spaces for that. Your whole Kubernetes cluster would be running there, but you can debug pieces of the service "locally". For example, your Http frontend...

ReubenBond commented 4 years ago

Closing this due to inactivity

ReubenBond commented 4 years ago

Closing this issue - you can open another issue to discuss VS debugging in Kubernetes, etc, if needed. There is more work that should happen on networking for 4.0 - likely to use BedrockFramework instead of using internal implementations with the shared abstractions.

RockNHawk commented 4 years ago

@ReubenBond

I have the same issue: connecting via the internet IP fails, but connecting via the intranet IP works. We need to make Orleans public because we need a development environment.

image

ReubenBond commented 4 years ago

@RockNHawk, is your Agent an HTTP server? I suggest opening a new issue for us to talk

RockNHawk commented 4 years ago

> is your Agent an HTTP server? I suggest opening a new issue for us to talk

Thank you for your reply. I am using the Blazor sample from the Orleans sample projects: https://github.com/dotnet/orleans/tree/master/Samples/3.0/Blazor/Sample.Silo