googleforgames / open-match

Flexible, extensible, and scalable video game matchmaking.
http://open-match.dev
Apache License 2.0

Can't get match assignment #1458

Closed Bardin08 closed 2 years ago

Bardin08 commented 2 years ago

What happened: The call to ../v1/frontendservice/tickets/{TICKET_ID}/assignments never returns.

When I check the logs of the k8s pods, I see these errors:

Director service

fail: Director[0]
      Failed to fetch matches for profile TeamMatch, got Request Error 404 - {"code":5,"message":"Not Found"}
      System.Exception: Failed to fetch matches for profile TeamMatch, got Request Error 404 - {"code":5,"message":"Not Found"}
       ---> System.Exception: Request Error 404 - {"code":5,"message":"Not Found"}
         at Argentics.Backend.Matchmaking.Director.Director.Fetch(OpenMatchMatchProfile profile) in /app/Director.cs:line 126
         --- End of inner exception stack trace ---
         at Argentics.Backend.Matchmaking.Director.Director.Fetch(OpenMatchMatchProfile profile) in /app/Director.cs:line 126
         at Argentics.Backend.Matchmaking.Director.Director.MatchPlayers() in /app/Director.cs:line 53

MMF service

There are no logs.

Backend service

$ k -n open-match logs open-match-backend-59b548476-rlfrm -f
time="2022-05-06T09:30:17Z" level=warning msg="Trace logging level configured. Not recommended for production!"
time="2022-05-06T09:30:17Z" level=info msg="Tracing sampler fraction set" app=openmatch component=telemetry samplingFraction=0.01
time="2022-05-06T09:30:17Z" level=info msg="Telemetry reporting period set" app=openmatch component=telemetry reportingPeriod=1m0s
time="2022-05-06T09:30:17Z" level=info msg="Jaeger Tracing: Disabled" app=openmatch component=telemetry
time="2022-05-06T09:30:17Z" level=info msg="Prometheus Metrics: Disabled" app=openmatch component=telemetry
time="2022-05-06T09:30:17Z" level=info msg="StackDriver Metrics: Disabled" app=openmatch component=telemetry
time="2022-05-06T09:30:17Z" level=info msg="OpenCensus Agent: Disabled" app=openmatch component=telemetry
time="2022-05-06T09:30:17Z" level=info msg="zPages: ENABLED" app=openmatch component=telemetry endpoint=/debug
time="2022-05-06T09:30:17Z" level=info msg="Serving HTTP: [::]:51505" app=openmatch component=api.backend
time="2022-05-06T09:30:17Z" level=info msg="Serving gRPC: [::]:50505" app=openmatch component=api.backend
time="2022-05-06T09:30:34Z" level=warning msg="/healthz health check failed. The server will terminate if this continues to happen." app=openmatch component=telemetry error="rpc error: code = Unavailable desc = dial tcp 10.100.103.229:26379: i/o timeout"
time="2022-05-06T09:30:44Z" level=warning msg="/healthz health check continues to fail. The server is at risk of termination." app=openmatch component=telemetry error="rpc error: code = Unavailable desc = dial tcp 10.100.103.229:26379: i/o timeout"
time="2022-05-06T09:30:54Z" level=info msg="/healthz is healthy again." app=openmatch component=telemetry
time="2022-05-09T08:53:45Z" level=error msg="error(s) in FetchMatches call. syncErr=[error receiving match from synchronizer: rpc error: code = Unknown desc = error calling evaluator: error starting evaluator call: rpc error: code = Unavailable desc = last resolver error: produced zero addresses], mmfErr=[rpc error: code = Internal desc = failed to get response from mmf run for profile TeamMatch: Post \"http://localhost:50671/v1/matchfunction:run\": dial tcp 127.0.0.1:50671: connect: connection refused]" app=openmatch component=api.backend
time="2022-05-09T08:54:28Z" level=error msg="error(s) in FetchMatches call. syncErr=[error receiving match from synchronizer: rpc error: code = Unknown desc = error calling evaluator: error starting evaluator call: rpc error: code = Unavailable desc = last resolver error: produced zero addresses], mmfErr=[rpc error: code = Internal desc = failed to get response from mmf run for profile TeamMatch: Post \"http://localhost:50502/v1/matchfunction:run\": dial tcp 127.0.0.1:50502: connect: connection refused]" app=openmatch component=api.backend

The interesting part is that the backend is trying to reach the MMF service at http://localhost:50502/v1/matchfunction:run:

mmfErr=[rpc error: code = Internal desc = failed to get response from mmf run for profile TeamMatch: Post \"http://localhost:50502/v1/matchfunction:run\": dial tcp 127.0.0.1:50502: connect: connection refused]

I suspect this is the cause of the issue, but I can't find any call to localhost or 127.0.0.1 in the Open Match sources. Could this be caused by the k8s port-forwarding that I use for local debugging?
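As far as I understand, the backend does not hard-code the MMF address: it calls whatever `host` and `port` the caller puts in the `config` part of the fetch request. A minimal sketch of a guard on the Director side, written in Python for brevity (the real Director is .NET), could look like this. The helper names `is_loopback_host` and `build_fetch_body` are hypothetical and are not part of Open Match.

```python
import ipaddress
import json


def is_loopback_host(host: str) -> bool:
    """Return True if host is 'localhost' or a loopback IP literal."""
    if host.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a Kubernetes service DNS name.
        return False


def build_fetch_body(mmf_host: str, mmf_port: int, profile: dict) -> str:
    """Build the JSON body for the fetch request (hypothetical helper).

    Refuses loopback hosts: inside the backend pod, 'localhost' points at
    the backend container itself, not at a port-forward on a dev machine.
    """
    if is_loopback_host(mmf_host):
        raise ValueError(f"MMF host {mmf_host!r} is not reachable from the backend pod")
    return json.dumps({
        "config": {"host": mmf_host, "port": mmf_port, "type": "REST"},
        "profile": profile,
    })
```

With a check like this, a config pointing at `localhost:50502` would fail in the Director before the request is sent, instead of surfacing as `connection refused` in the backend logs.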

What you expected to happen: I expected to receive a match with a filled assignment.

How to reproduce it (as minimally and precisely as possible): I think no repro is required, because the error occurs with the default services from the docs.

Output of kubectl version:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.5", GitCommit:"c285e781331a3785a7f436042c65c5641ce8a9e9", GitTreeState:"clean", BuildDate:"2022-03-16T15:58:47Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.9-eks-0d102a7", GitCommit:"eb09fc479c1b2bfcc35c47416efb36f1b9052d58", GitTreeState:"clean", BuildDate:"2022-02-17T16:36:28Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.21) exceeds the supported minor version skew of +/-1

Cloud Provider/Platform (AKS, GKE, Minikube etc.): EKS

Open Match Release Version: v1.4.0-rc1

Install Method(yaml/helm): yaml

Bardin08 commented 2 years ago

UPD: The backend no longer logs anything. Here is the request that the Director sends to the backend service:

Fetch request:
Url: http://10.1xx.xxx.xx:51505/v1/backendService/matches:fetch
Body: {"config":{"host":"10.100.98.207","port":50502,"type":"REST"},"profile":{"name":"TeamMatch","pools":[{"name":"tm1","tag_present_filters":[{"tag":"tm1"}],"double_present_filters":[{"double_arg":"attribute.pwr","min":0.0,"max":0.0,"exclude":"NONE"}]}],"extensions":null}}
Response body as string: {"code":5,"message":"Not Found"}

The errors in the backend logs were caused by a missing evaluator. I have since added it and those log entries disappeared, but I still receive {"code": 5, "message": "Not Found"}.

mridulji commented 2 years ago

Hey @Bardin08, could you share the definition of your profiles in the director?

Bardin08 commented 2 years ago

@mridulji, yeah, sure. Here is the profile that the Director sends to the backend service:

{
   "config":{
      "host":"mmf-service.namespace.svc.cluster.local",
      "port":51503,
      "type":"REST"
   },
   "profile":{
      "name":"TeamMatch",
      "pools":[
         {
            "name":"tm1",
            "tag_present_filters":[
               {
                  "tag":"tm1"
               }
            ],
            "double_present_filters":[
               {
                  "double_arg":"attribute.pwr",
                  "min":0.0,
                  "max":10.0,
                  "exclude":"NONE"
               }
            ]
         }
      ],
      "extensions":null
   }
}

And here is a ticket model:

{
    "id": "c9uekfcqo521apilbcig",
    "search_fields": {
        "double_args": {
            "attribute.pwr": 9.3
        },
        "tags": [
            "tm1"
        ]
    },
    "create_time": "2022-05-12T11:06:37.072182879Z"
}
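
To rule out a filter mismatch, here is a rough local check of whether the ticket above falls inside pool `tm1`. This is only a sketch under my own assumptions, not Open Match's filtering code: `ticket_matches_pool` is a hypothetical helper, the key `double_present_filters` is copied as-is from my JSON above, and only `"exclude":"NONE"` is handled, which I assume means both bounds are inclusive.

```python
def ticket_matches_pool(ticket: dict, pool: dict) -> bool:
    """Return True if the ticket passes every filter of the pool.

    Assumptions: tag filters require the tag to be present on the ticket,
    and range filters with exclude == "NONE" accept min <= value <= max.
    """
    fields = ticket.get("search_fields", {})
    tags = set(fields.get("tags", []))
    doubles = fields.get("double_args", {})

    # Every tag named by the pool must be present on the ticket.
    for tag_filter in pool.get("tag_present_filters", []):
        if tag_filter["tag"] not in tags:
            return False

    # Every numeric filter must find the attribute inside its range.
    for range_filter in pool.get("double_present_filters", []):
        value = doubles.get(range_filter["double_arg"])
        if value is None:
            return False
        if not (range_filter["min"] <= value <= range_filter["max"]):
            return False

    return True
```

Under these assumptions the ticket above (tag `tm1`, `attribute.pwr` of 9.3) passes the pool with `max` 10.0, while the same ticket would be excluded by the earlier request body where both `min` and `max` were 0.0.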

Bardin08 commented 2 years ago

@mridulji @syntxerror is there any way to log all requests and responses out of the box? I can't see any logs in the OM core pods, and I still can't figure out the reason for that error.