palantir / cassandra

Palantir's fork of Apache Cassandra
Apache License 2.0

Reduce schema pull request volume #582

Open andybradshaw opened 5 days ago

andybradshaw commented 5 days ago

Reduce schema pull request volume by scheduling only one request per schema version at a time. The thought is that any successful request for a given schema version will return the same set of mutations, so there's no need to flood other nodes with redundant requests.
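For illustration, here is a minimal sketch of the "one outstanding pull per schema version" idea. It is not the code in this PR: `SchemaPullTracker`, `trySchedule`, `complete`, and `hasInFlight` are hypothetical names, endpoints are plain strings rather than `InetAddress`, and the data structure actually used in `MigrationManager` may differ.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration only; not a class from this PR. */
final class SchemaPullTracker {
    // Schema version -> endpoint that currently has an outstanding pull for it.
    private final Map<UUID, String> inFlight = new ConcurrentHashMap<>();

    /**
     * Returns true if the caller should send a schema pull to the endpoint,
     * or false if a pull for the same version is already outstanding.
     */
    boolean trySchedule(UUID version, String endpoint) {
        // putIfAbsent is atomic, so only one caller wins per version.
        return inFlight.putIfAbsent(version, endpoint) == null;
    }

    /**
     * Frees the slot for the version once the pull from the endpoint has
     * finished (successfully or not), so another endpoint can be tried.
     * It is a no-op if a different endpoint holds the slot.
     */
    void complete(UUID version, String endpoint) {
        inFlight.remove(version, endpoint);
    }

    /** Returns true while a pull for the version is outstanding. */
    boolean hasInFlight(UUID version) {
        return inFlight.containsKey(version);
    }
}
```

In this sketch the slot must also be released on failure or timeout; otherwise a single lost response would block every later pull for that version.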

andybradshaw commented 11 hours ago

After adding some additional logging and moving the point where we remove endpoints from the outstanding schema pull request set, the logs look like this:

{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:05.569981Z","origin":"org.apache.cassandra.service.MigrationManager","safe":true,"thread":"GossipStage:1","message":"Evaluating schema pull criteria: currently scheduled requests for version {}: {}","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":null,"stacktrace":null,"unsafeParams":{"0":"1f90579d-5612-3a6d-885c-f13f31e2027a","1":"[10.100.133.103/10.100.133.103, 10.100.137.98/10.100.137.98, 10.100.195.98/10.100.195.98, 10.100.98.98/10.100.98.98]"},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:05.570005Z","origin":"org.apache.cassandra.service.MigrationManager","safe":true,"thread":"GossipStage:1","message":"Not pulling schema because versions match or shouldPullSchemaFrom returned false","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":null,"stacktrace":null,"unsafeParams":{},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:05.666239Z","origin":"org.apache.cassandra.service.MigrationManager","safe":true,"thread":"InternalResponseStage:33","message":"Gossiping my schema version {}","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":"3e697fcfb98def74","stacktrace":null,"unsafeParams":{"0":"1f90579d-5612-3a6d-885c-f13f31e2027a"},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:05.686966Z","origin":"org.apache.cassandra.service.MigrationTask","safe":true,"thread":"InternalResponseStage:33","message":"Successfully processed response to schema pull, removing endpoint from scheduled schema pulls {}: {} ({})","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":null,"stacktrace":null,"unsafeParams":{"0":"10.100.133.103/10.100.133.103","1":"1f90579d-5612-3a6d-885c-f13f31e2027a","2":"[10.100.133.103/10.100.133.103, 10.100.137.98/10.100.137.98, 10.100.195.98/10.100.195.98, 10.100.98.98/10.100.98.98]"},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:05.712994Z","origin":"org.apache.cassandra.service.MigrationManager","safe":true,"thread":"GossipStage:1","message":"Not pulling schema because versions match or shouldPullSchemaFrom returned false","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":null,"stacktrace":null,"unsafeParams":{},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:05.919488Z","origin":"org.apache.cassandra.service.MigrationManager","safe":true,"thread":"GossipStage:1","message":"Not pulling schema because versions match or shouldPullSchemaFrom returned false","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":null,"stacktrace":null,"unsafeParams":{},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:09.138266Z","origin":"org.apache.cassandra.service.MigrationManager","safe":true,"thread":"InternalResponseStage:34","message":"Gossiping my schema version {}","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":"59dccc2121923d02","stacktrace":null,"unsafeParams":{"0":"1f90579d-5612-3a6d-885c-f13f31e2027a"},"tags":{}}
{"type":"service.1","level":"DEBUG","time":"2024-11-25T18:33:09.138320Z","origin":"org.apache.cassandra.service.MigrationTask","safe":true,"thread":"InternalResponseStage:34","message":"Successfully processed response to schema pull, removing endpoint from scheduled schema pulls {}: {} ({})","params":{},"uid":null,"sid":null,"tokenId":null,"traceId":null,"stacktrace":null,"unsafeParams":{"0":"10.100.137.98/10.100.137.98","1":"1f90579d-5612-3a6d-885c-f13f31e2027a","2":"[10.100.195.98/10.100.195.98, 10.100.98.98/10.100.98.98]"},"tags":{}}

I'll test the timeout case next and see whether there is an easy way to add unit tests for this.