locustio / locust

Write scalable load tests in plain Python 🚗💨
https://locust.cloud
MIT License

Workers go down with message: failed to send heartbeat, setting state to missing. #1843

Closed roquemoyano-tc closed 3 years ago

roquemoyano-tc commented 3 years ago

Describe the bug

Every time I run Locust, I get: locust-worker-xxxx failed to send heartbeat, setting state to missing.

Expected behavior

Workers stay in the running state.

Actual behavior

I'm using the official Helm chart to run Locust. After the Python code has been running for some time, the workers change their status to missing and I'm unable to finish the test.

Environment

mboutet commented 3 years ago

@roquemoyano-tc You need to provide the complete logs from both the master and at least one of the workers. Share them as gists so that you don't paste a wall of logs into this issue.

roquemoyano-tc commented 3 years ago

Sorry, the master and worker logs are at this link:

https://gist.github.com/roquemoyano-tc/f23dc4a4c8c17da30fa6c101c55a0ad9

cyberw commented 3 years ago

Can you explain the "custom" entries in the slave log? For example, what is "Adding new paired device"?

Also attach your locustfile.

amaanupstox commented 3 years ago

@cyberw

!interpreter [optional-arg]

class GRPCBackOfficeClient:
    @stopwatch
    def email_mobile_list(self):
        try:
            response = backoffice_customer_profile_service.get_email_mobile_list()
            assert response.metadata.success, "response should be true"

        except (KeyboardInterrupt, SystemExit):
            logging.error("Interrupted by keyboard............")
            sys.exit(0)


class GRPCBackOfficeLocust(FastHttpUser):
    host = "https://{}".format(utils.get_properties("backoffice-service", "url"))
    grpc_backoffice_client = GRPCBackOfficeClient()
    wait_time = constant(0)

    def on_start(self):
        """ on_start is called when a Locust start before any task is scheduled """
        pass

    def on_stop(self):
        """ on_stop is called when the TaskSet is stopping """
        pass

    @task
    def family_client_group(self):
        """ To load test family client group gRPC call"""
        self.grpc_backoffice_client.family_client_group()

amaanupstox commented 3 years ago

@cyberw @mboutet Please help here. It works fine with one slave, but when I attach a second one I get the error "failed to send heartbeat, setting state to missing." Locust: 1.4.3 Python: 3.7.8

elizabeth-tran commented 3 years ago

Will attach the locustfile soon. The custom entries in the logs are just print statements from the different tasks that are being executed on the slave. The logic used in the scripts is based on https://medium.com/locust-io-experiments/locust-experiments-feeding-the-locusts-cf09e0f65897 because I'm feeding the slaves with information about existing users from a CSV file that is read on the master.

cyberw commented 3 years ago

> @cyberw @mboutet Please help here. It works fine with one slave, but when I attach a second one I get the error "failed to send heartbeat, setting state to missing." Locust: 1.4.3 Python: 3.7.8

Start by updating Locust. I don't know of a specific bug in this area, but I don't want to spend time solving what might already have been solved :)

cyberw commented 3 years ago

> Will attach the locustfile soon. The custom entries in the logs are just print statements from the different tasks that are being executed on the slave. The logic used in the scripts is based on https://medium.com/locust-io-experiments/locust-experiments-feeding-the-locusts-cf09e0f65897 because I'm feeding the slaves with information about existing users from a CSV file that is read on the master.

Ah, that might be key. That blog post speaks specifically about passing extra info from master to slave, and it could (theoretically at least) interrupt the normal locust communication. There is a new way to do that, using custom messages https://docs.locust.io/en/stable/running-locust-distributed.html?highlight=custom#communicating-across-nodes

Try that instead.
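
For readers following along, here is a minimal sketch of what the custom-message approach might look like for the CSV use case described above. It assumes a recent Locust release that provides `register_message` and `send_message` on the runner. The message names (`test_users`, `acknowledge_users`), the file name `users.csv`, and the helpers `parse_users` and `register_hooks` are hypothetical and not taken from anyone's locustfile in this thread. The Locust imports are deferred into `register_hooks` only so that the plain-Python helpers can be tried without Locust installed.

```python
import csv
import io
from pathlib import Path

# Users received by this worker; tasks would read their test data from here.
WORKER_USERS = []


def parse_users(csv_text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def on_test_users(environment, msg, **kwargs):
    """Worker side: store the users sent by the master, then acknowledge."""
    WORKER_USERS.extend(msg.data)
    environment.runner.send_message("acknowledge_users", len(msg.data))


def on_acknowledge(environment, msg, **kwargs):
    """Master side: log how many users a worker reported receiving."""
    print(f"A worker received {msg.data} users")


def register_hooks(csv_path="users.csv"):
    """Wire the handlers into Locust (csv_path is a hypothetical file)."""
    from locust import events
    from locust.runners import MasterRunner, WorkerRunner

    @events.init.add_listener
    def on_init(environment, **kwargs):
        # Workers (and local runs) listen for user data from the master.
        if not isinstance(environment.runner, MasterRunner):
            environment.runner.register_message("test_users", on_test_users)
        # The master (and local runs) listens for acknowledgements.
        if not isinstance(environment.runner, WorkerRunner):
            environment.runner.register_message("acknowledge_users", on_acknowledge)

    @events.test_start.add_listener
    def on_test_start(environment, **kwargs):
        # Only the master (or a local runner) reads the CSV and sends it out.
        if not isinstance(environment.runner, WorkerRunner):
            users = parse_users(Path(csv_path).read_text())
            environment.runner.send_message("test_users", users)
```

In a real locustfile you would call `register_hooks("users.csv")` at module level, next to your User classes. This sketch sends the full list to every worker; splitting the rows so that each worker gets unique data is left out for brevity.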

cyberw commented 3 years ago

@amaanupstox You're using a gRPC client? Make sure you have patched it to be gevent-friendly, like in the example in the docs: https://docs.locust.io/en/latest/testing-other-systems.html#example-writing-a-grpc-user-client
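
One plausible reason the gevent patch matters for this symptom: Locust runs its users and the worker heartbeat cooperatively in one process, so a call that blocks without yielding can keep the heartbeat from being sent on time. The sketch below is only an analogy. It uses the standard library's `asyncio` as a stand-in for gevent, and the names `heartbeat`, `max_heartbeat_gap`, `blocking_work`, and `cooperative_work` are made up for illustration. It is not Locust's heartbeat code.

```python
import asyncio
import time


async def heartbeat(interval, stamps):
    """Record a timestamp every `interval` seconds, like a worker heartbeat."""
    while True:
        stamps.append(time.monotonic())
        await asyncio.sleep(interval)


async def blocking_work():
    """Stand-in for an unpatched client call: blocks the whole event loop."""
    time.sleep(0.3)


async def cooperative_work():
    """Stand-in for a patched client call: yields control while waiting."""
    await asyncio.sleep(0.3)


async def max_heartbeat_gap(task_body, interval=0.02):
    """Run `task_body` next to a heartbeat and return the longest silence."""
    stamps = []
    beat = asyncio.create_task(heartbeat(interval, stamps))
    await asyncio.sleep(0)         # let the heartbeat record its first timestamp
    await task_body()
    await asyncio.sleep(interval)  # give the heartbeat a chance to fire again
    beat.cancel()
    try:
        await beat
    except asyncio.CancelledError:
        pass
    return max(later - earlier for earlier, later in zip(stamps, stamps[1:]))


if __name__ == "__main__":
    print("blocking gap:   ", asyncio.run(max_heartbeat_gap(blocking_work)))
    print("cooperative gap:", asyncio.run(max_heartbeat_gap(cooperative_work)))
```

With the blocking variant the heartbeat goes silent for roughly the full duration of the call, while the cooperative variant keeps it ticking at about the configured interval. In a real locustfile, the fix is the gevent patch shown in the docs example linked above.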

amaanupstox commented 3 years ago

@cyberw Can you please tell me which server this is:

def start_server():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    hello_pb2_grpc.add_HelloServiceServicer_to_server(HelloServiceServicer(), server)
    server.add_insecure_port("localhost:50051")
    server.start()
    logger.info("gRPC server started")
    server.wait_for_termination()

Is it the Locust server or the server hosting the gRPC service?

cyberw commented 3 years ago

That is the gRPC service used as a dummy target for the test. It is not something you would launch in a real test.

elizabeth-tran commented 3 years ago

The original issue was resolved once I switched to using custom messages. Thanks!