hazelcast / hazelcast-simulator

A tool for stress testing Hazelcast
Apache License 2.0

Do not crash if a random IP address tries to connect #1869

Open Holmistr opened 4 years ago

Holmistr commented 4 years ago

We're facing attacks from the outside during the release process. A random IP address scans our ports and sends data to the Simulator coordinator port. Because that data doesn't conform to the Simulator packet layout, Simulator complains:

C_A9-54.173.85.28-agent.out:WARN  2020-05-07 12:53:26,651 [ConnectionHandler-138.19.164.135:54836] com.hazelcast.simulator.protocol.handler.ConnectionHandler: No magic bytes sent for 10 seconds, closing connection from [id: 0x5c55546e, L:/10.165.136.8:9000 - R:/138.19.164.135:54836]

which eventually results in a crash of the whole test.
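The behaviour requested here (drop the offending connection, keep everything else running) can be sketched as follows. This is only an illustration under stated assumptions, not the Simulator's actual `ConnectionHandler`: the class name `MagicBytesGuard`, the method `acceptOrClose` and the 4-byte `MAGIC` value are all hypothetical, and plain blocking sockets are used instead of whatever networking layer the Simulator uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class MagicBytesGuard {
    // Hypothetical marker; the real Simulator magic bytes are not shown in this issue.
    static final byte[] MAGIC = {(byte) 0xA5, (byte) 0xE1, (byte) 0xCA, (byte) 0xFE};

    /**
     * Returns true if the peer sent MAGIC within timeoutMillis.
     * Otherwise closes only this socket and returns false; nothing is rethrown,
     * so a stray connection cannot take down the caller.
     */
    static boolean acceptOrClose(Socket socket, int timeoutMillis) {
        byte[] header = new byte[MAGIC.length];
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            InputStream in = socket.getInputStream();
            int read = 0;
            while (read < header.length) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    break; // overall deadline expired
                }
                socket.setSoTimeout((int) remainingMillis);
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    break; // peer closed before sending a full header
                }
                read += n;
            }
            if (read == header.length && Arrays.equals(header, MAGIC)) {
                return true;
            }
        } catch (IOException e) {
            // Covers SocketTimeoutException and connection resets: treat as a rejected peer.
        }
        try {
            socket.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if closing fails.
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
            // Simulate a port scanner sending something that is not a Simulator packet.
            client.getOutputStream().write("GET / HTTP/1.1\r\n".getBytes(StandardCharsets.US_ASCII));
            Socket accepted = server.accept();
            System.out.println("accepted=" + acceptOrClose(accepted, 1000));
        }
    }
}
```

The point of the sketch is that every failure path (wrong bytes, timeout, reset) ends in closing that one socket and returning `false`, rather than propagating an exception to shared code.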

jerrinot commented 4 years ago

I am trying to reproduce it, but so far I have not been successful. I'm trying with this branch: https://github.com/hazelcast/hazelcast-simulator/tree/0.9.10_beta_test

The original report says "sends stuff to the Simulator coordinator port", but the log snippet says: "C_A9-54.173.85.28-agent.out:WARN 2020-05-07 12:53:26,651 [ConnectionHandler-138.19.164.135:54836] com.hazelcast.simulator.protocol.handler.ConnectionHandler: No magic bytes sent for 10 seconds, closing connection from [id: 0x5c55546e, L:/10.165.136.8:9000 - R:/138.19.164.135:54836]"

See the filename: C_A9-54.173.85.28-agent.out. Hence I assume someone is actually connecting to an agent rather than to the coordinator. AFAIK the Coordinator does not listen on any server socket at all.

I tried to connect to both an agent and a worker, but it's working as designed for me. I can see WARN 14:41:41 No magic bytes sent for 10 seconds, closing connection from [id: 0x45bac613, L:0.0.0.0/0.0.0.0:9000 ! R:/127.0.0.1:47432] in the log and the connection is killed, but the test is still running. I can see this behaviour was introduced by this changeset. The changeset has rather poor tests with tons of mocking. I tried to write an integration test with real components instead of mocks, yet it is still working as designed for me.

Here is a test I am using to simulate this on an agent:

    @Test
    public void testAttackerSendingArbitraryRubbish() throws Exception {
        int port = agent.getPort();
        String ipAddress = agent.getPublicAddress();

        coordinator.workerStart(new RcWorkerStartOperation()
                .setHzConfig(hzConfig));

        TestSuite suite = newBasicTestSuite()
                .setDurationSeconds(60);

        StubPromise promise = new StubPromise();
        coordinator.testRun(new RcTestRunOperation(suite).setAsync(true), promise);
        assertEquals(SUCCESS, promise.get());

        String testId = promise.getResponse();
        assertEquals("C_A*_W*_T" + (initialTestIndex + 1), testId);

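        // Open a raw TCP connection to the agent, as a port scanner would.
        SocketChannel channel = SocketChannel.open(new InetSocketAddress(ipAddress, port));
        try {
            // Optionally send bytes that do not match the Simulator packet layout:
            // ByteBuffer buffer = ByteBuffer.wrap("".getBytes());
            // channel.write(buffer);

            // The test must still complete even though the agent drops the rogue connection.
            assertTestCompletesEventually(testId);
        } finally {
            channel.close();
        }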
    }

(paste this into CoordinatorTest)

@Danny-Hazelcast If you manage to reproduce this then please attach logs from coordinator, agent and also workers.
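For a manual reproduction against a running agent, outside of `CoordinatorTest`, a small standalone client along these lines may help. It is a sketch only: the class name `RogueProbe` and the method `probe` are made up for illustration, the payload is arbitrary, and the default port 9000 is simply the one that appears in the log lines above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

class RogueProbe {
    /**
     * Connects to host:port, sends the payload, then blocks until the peer
     * closes the connection. Returns the elapsed time in milliseconds.
     * If the peer resets the connection instead of closing it cleanly,
     * the IOException is propagated to the caller.
     */
    static long probe(String host, int port, byte[] payload) throws IOException {
        long start = System.nanoTime();
        try (SocketChannel channel = SocketChannel.open(new InetSocketAddress(host, port))) {
            ByteBuffer out = ByteBuffer.wrap(payload);
            while (out.hasRemaining()) {
                channel.write(out);
            }
            ByteBuffer in = ByteBuffer.allocate(256);
            // A blocking read returns -1 once the peer has closed its side.
            while (channel.read(in) != -1) {
                in.clear();
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9000;
        byte[] rubbish = "not a simulator packet".getBytes(StandardCharsets.UTF_8);
        long millis = probe(host, port, rubbish);
        System.out.println("Peer closed the connection after " + millis + " ms");
    }
}
```

Running it while a test is in progress and then collecting the coordinator, agent and worker logs should show whether the dropped connection has any effect beyond the WARN line.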