dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
Can't specify IP address along with localhost for config with MembershipTable #3278

Closed vadimkorr closed 7 years ago

vadimkorr commented 7 years ago

[.NET Framework 4.7, Orleans 1.5, Win10 Enterprise x64]

I'm using the following OrleansConfiguration.xml for deploying with the script.

To avoid changing the addresses on each node of the cluster, I put the IP address in SeedNode only (assuming that SeedNode is the same for every node, while Networking and ProxyingGateway are unique per node).

(Or, at least, is there a way to use a single OrleansConfiguration.xml for the whole cluster instead of changing it on each node?)

Thank you in advance!

<?xml version="1.0" encoding="utf-8"?>
<OrleansConfiguration xmlns="urn:orleans">
  <Globals>
    <SeedNode Address="192.168.50.74" Port="11111" />
    <Liveness LivenessType="MembershipTableGrain" />
    <BootstrapProviders>
      <Provider Type="OrleansDashboard.Dashboard" Name="Dashboard"/>
    </BootstrapProviders>
  </Globals>
  <Defaults>
    <Networking Address="localhost" Port="11111" />
    <ProxyingGateway Address="localhost" Port="30000" />
    <Tracing 
      DefaultTraceLevel="Info" 
      TraceToConsole="true"
      TraceToFile="{0}-{1}.log"
      WriteTraces="false"/>
  </Defaults>
</OrleansConfiguration>

This approach causes an error:

Could not connect to 192.168.50.74:11111: ConnectionRefused
Exception = Orleans.Runtime.OrleansException: Could not connect to 192.168.50.74:11111: ConnectionRefused
   at Orleans.Runtime.SocketManager.Connect(Socket s, IPEndPoint endPoint, TimeSpan connectionTimeout)
   at Orleans.Runtime.SocketManager.SendingSocketCreator(IPEndPoint target)
   at Orleans.Runtime.LRU`2.Get(TKey key)
   at Orleans.Runtime.Messaging.SiloMessageSender.GetSendingSocket(Message msg, Socket& socket, SiloAddress& targetSilo, String& error)

[2017-08-07 16:24:20.112 GMT    16      WARNING 101021  Runtime.Messaging.SiloMessageSender/SystemSender        127.0.0.1:11111]        Exception getting a sending socket to endpoint S192.168.50.74:11111:0
Exc level 0: Orleans.Runtime.OrleansException: Could not connect to 192.168.50.74:11111: ConnectionRefused
   at Orleans.Runtime.SocketManager.Connect(Socket s, IPEndPoint endPoint, TimeSpan connectionTimeout)
   at Orleans.Runtime.SocketManager.SendingSocketCreator(IPEndPoint target)
   at Orleans.Runtime.LRU`2.Get(TKey key)
   at Orleans.Runtime.Messaging.SiloMessageSender.GetSendingSocket(Message msg, Socket& socket, SiloAddress& targetSilo, String& error)
Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
Exception = Orleans.Runtime.OrleansMessageRejectionException: Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.<LookupAsync>d__114.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Scheduler.SchedulerExtensions.<>c__DisplayClass0_0`1.<<QueueTask>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.RandomPlacementDirector.<OnSelectActivation>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.PlacementDirectorsManager.<SelectOrAddActivation>d__9.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AddressMessage>d__37.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AsyncSendMessage>d__34.MoveNext()

[2017-08-07 16:24:20.162 GMT    10      ERROR   100071  Dispatcher      127.0.0.1:11111]        !!!!!!!!!! SelectTarget failed with Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
Exc level 0: Orleans.Runtime.OrleansMessageRejectionException: Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.<LookupAsync>d__114.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Scheduler.SchedulerExtensions.<>c__DisplayClass0_0`1.<<QueueTask>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.RandomPlacementDirector.<OnSelectActivation>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.PlacementDirectorsManager.<SelectOrAddActivation>d__9.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AddressMessage>d__37.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AsyncSendMessage>d__34.MoveNext()
Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
Exception = Orleans.Runtime.OrleansMessageRejectionException: Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.<LookupAsync>d__114.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Scheduler.SchedulerExtensions.<>c__DisplayClass0_0`1.<<QueueTask>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.RandomPlacementDirector.<OnSelectActivation>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.PlacementDirectorsManager.<SelectOrAddActivation>d__9.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AddressMessage>d__37.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AsyncSendMessage>d__34.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.MembershipService.MembershipTableFactory.<GetMembershipTable>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.MembershipService.MembershipOracle.<Start>d__30.MoveNext()

[2017-08-07 16:24:20.187 GMT    10      ERROR   100646  MembershipOracle        127.0.0.1:11111]    !!!!!!!!!! MembershipFailedToStart
Exc level 0: Orleans.Runtime.OrleansMessageRejectionException: Silo S127.0.0.1:11111:239819047 is rejecting message: Request S127.0.0.1:11111:239819047Catalog@S0000000e->S192.168.50.74:11111:0DirectoryService@S0000000a #3: global::Orleans.Runtime.IRemoteGrainDirectory:LookupAsync(). Reason = Exception getting a sending socket to endpoint S192.168.50.74:11111:0
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.GrainDirectory.LocalGrainDirectory.<LookupAsync>d__114.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Scheduler.SchedulerExtensions.<>c__DisplayClass0_0`1.<<QueueTask>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.RandomPlacementDirector.<OnSelectActivation>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Placement.PlacementDirectorsManager.<SelectOrAddActivation>d__9.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AddressMessage>d__37.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.Dispatcher.<AsyncSendMessage>d__34.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.MembershipService.MembershipTableFactory.<GetMembershipTable>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Orleans.Runtime.MembershipService.MembershipOracle.<Start>d__30.MoveNext()

sergeybykov commented 7 years ago

Several things here.

  1. Using localhost as the IP address (or 127.0.0.1, for that matter) doesn't allow connections from other machines. That may be the immediate problem you are seeing. You need to use a real IP address or host name instead.

  2. The configuration option of using a seed node (aka primary) is inherently unreliable because it requires the seed node to stay up the whole time. Have you considered a reliable configuration option with the cluster membership table externalized to storage?

  3. It is possible to use a single config file for all nodes in the cluster. You can use Address="" to automatically pick up the host name of each silo. The seed node is the one address/host name that still has to be hardcoded. That works for a single silo per server. Otherwise, you can use overrides, but that's a bit more complicated to get right.

  4. We've been promoting programmatic configuration as the preferred way of configuring silos and clients. Going forward, it will be even more so. XML configs still work and will continue to be supported for some time, but be aware that this is not the direction we plan to invest in.
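
For illustration, point 3 applied to the configuration from the original post might look like the sketch below. It assumes a single silo per server and that 192.168.50.74 is the real, reachable address of the seed node; the only change from the posted file is that the two `localhost` values are replaced with `Address=""`.

```xml
<?xml version="1.0" encoding="utf-8"?>
<OrleansConfiguration xmlns="urn:orleans">
  <Globals>
    <!-- The seed node address still has to be hardcoded (assumed here to be the seed's real IP). -->
    <SeedNode Address="192.168.50.74" Port="11111" />
    <Liveness LivenessType="MembershipTableGrain" />
    <BootstrapProviders>
      <Provider Type="OrleansDashboard.Dashboard" Name="Dashboard"/>
    </BootstrapProviders>
  </Globals>
  <Defaults>
    <!-- An empty Address lets each silo pick up its own host name instead of localhost. -->
    <Networking Address="" Port="11111" />
    <ProxyingGateway Address="" Port="30000" />
    <Tracing
      DefaultTraceLevel="Info"
      TraceToConsole="true"
      TraceToFile="{0}-{1}.log"
      WriteTraces="false"/>
  </Defaults>
</OrleansConfiguration>
```

The same file can then be copied unchanged to every node. Note that the seed node concern from point 2 still applies to this setup.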

vadimkorr commented 7 years ago

Sergey, thank you so much for the fast and useful feedback. Specifying Address="" and using a real IP address (after disabling the virtual adapters used by virtual machines in Network Connections) solved the problem. Works like a charm!

sergeybykov commented 7 years ago

Great. I'll close the issue then. Thanks for confirming.