triton-inference-server / pytriton

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
https://triton-inference-server.github.io/pytriton/
Apache License 2.0

while inference by running server.py and client.py why client is taking gpu memory. #47

Closed Justsubh01 closed 7 months ago

Justsubh01 commented 7 months ago

Hello, I am new to Triton and trying to understand its behaviour. I am facing a point of confusion, described below:

I am running two client requests against one server.py. Why do the two client.py processes consume GPU memory and show up as two GPU processes, when the model itself is running in the server?

(screenshot: nvidia-smi output showing the server and client GPU processes)

Here, 967 MiB is consumed by the server.py script and 105 MiB by the client.py processes.

When multiple client requests come in, does Triton create multiple instances of the same model, or does it serve all requests on a single instance?
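(For reference: in standalone Triton, the number of execution instances per model is controlled by the `instance_group` setting in the model's `config.pbtxt`. The snippet below is a generic sketch of that setting, not a configuration taken from this issue:)

```
instance_group [
  {
    count: 1        # one execution instance of the model
    kind: KIND_GPU  # place the instance on a GPU
    gpus: [ 0 ]     # pin the instance to GPU 0
  }
]
```

With `count: 1`, concurrent requests are queued and served by the single instance; raising `count` creates additional instances of the same model.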