crashvb / docker-registry-client-async

An AIOHTTP based Python REST client for the Docker Registry.
Apache License 2.0
5 stars 6 forks source link

Unable to re-use DRCA after close() #27

Closed skrech closed 2 years ago

skrech commented 2 years ago

Hello,

Currently, calling close() on DockerRegistryClientAsync instance renders that instance unable to initiate another aiohttp session. After calling close() the underlying session is closed but the variable reference for it is not set to None and subsequent calls to, eg., get_manifest fail with "Session is closed" error.

skrech commented 2 years ago

Hm... the problem is even more convoluted: Setting self.client_session to None is not enough. The TCPConnector is created once per _get_client_session invocation and saved into self.client_session_kwargs. However, when close() is called the connector is closed as well, but the closed instance is re-used once new instance of client_session is created, which is clearly a bug. Unfortunately, I don't see a good way out of this, except to disallowing self.client_session_kwargs entirely... :/

Or maybe, if no explicit connector is supplied by the user, don't save it in self.client_session_kwargs because the user expects to be able to close() and then re-use DRCA instance and that DRCA instance should take care of re-cycling the TCPConnector. If the user has supplied a connector instance himself/herself, then he/she would probably known what to do and would expect to handle this case.

crashvb commented 2 years ago

DRCA wasn't really intended to be re-entrant; the close method was exposed to allow for handling of edge-cases where asynchronous context managers were insufficient. The connector_owner flag could potentially be passed to the underlying lying client session via client_session_kwargs if the intention is to reuse the cached object for the next "iteration"; but this might make for an odd referencing of the buried object if / when explicit closing is needed (assuming that it is prior to the end of the process).

Either way, I agree that not assigning None to the object during close actively impedes the potential to do this.

skrech commented 2 years ago

Hey crashvb,

Thanks for the prompt reaction. However, I don't think this is the correct fix. The problem has nothing to do with re-entrancy of the code. I'll try to illustrate the problem with an example, which I think is very common use-case.

import docker_registry_client_async as drca

class DomainLogic:
  def __init__(self):
    self._client = drca.DockerRegistryClientAsync()

  async def get_layers(self, image_name):
    async with self._client as session:
      manifest_resp = await session.get_manifest(image_name)
      # process manifest
      ...

      layers_resp = await session.get_blob(...)
      # process blobs
      layers = ...

      return layers

logic = DomainLogic()
for image in (image_name1, image_2):
  await logic.get_layers(image)

On the second round in the loop, the program will crash with "Session is closed" error on get_manifest() call because the connector instance will be re-used and it's already closed.

Even more streamlined example:

import docker_registry_client_async as drca

client = drca.DockerRegistryClientAsync()
async with client as session:
  await session.get_manifest(...)
async with cleint as session:
  await session.get_manifest(...)  # Session is closed

I think that closing the drca intsance (using explicit close() or aexit) should not force the user to know the internals of the library and himself re-create the connector class, this is in implementation detail of the lib. The lib should re-cycle the connector class on every new creation of a session in my opinion.

crashvb commented 2 years ago

skrech,

Thank you for the more detailed explanation.

I don't know the overall design of the consuming application, but from the example provided above wrapping the invocation of the methods in an async with might not be adding value if the intention is to have the DRCA instance persist for the lifetime of the DomainLogic class.

The invocation of DockerRegistryClientAsync.close() could be deferred entirely if the class were restructured slightly:

import docker_registry_client_async as drca

class DomainLogic:
  def __init__(self):
    self._client = drca.DockerRegistryClientAsync()

  async def __aenter__(self) -> "DomainLogic":
    return self

  async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
    await self.close()

  async def close(self):
    if self.client:
        await self.client.close()

  async def get_layers(self, image_name):
    manifest_resp = await self._client.get_manifest(image_name)
    # process manifest
    ...

    layers_resp = await self._client.get_blob(...)
    # process blobs
    layers = ...

    return layers

logic = DomainLogic()
for image in (image_name1, image_2):
  await logic.get_layers(image)

The downstream crashvb/docker-sign-verify project might be of interest as an example.

... if wrapping with context managers is a must, then reentry can still be achieved by handing closure of the underlying TCPConnector object:

import docker_registry_client_async as drca

class DomainLogic:
  def __init__(self):
    self._client = drca.DockerRegistryClientAsync(client_session_kwargs={"connector_owner": False})

  async def close(self):
    if self.client:
        await self.client.client_session_kwargs["connector"].close()

  async def get_layers(self, image_name):
    async with self._client as session:
      manifest_resp = await session.get_manifest(image_name)
      # process manifest
      ...

      layers_resp = await session.get_blob(...)
      # process blobs
      layers = ...

      return layers

logic = DomainLogic()
for image in (image_name1, image_2):
  await logic.get_layers(image)

... which I agree is overly burdensome and does require quite a bit of knowledge of the internal call stack =/.

Ultimately, DRCA is just a restriction on the more generalized aiohttp library, which I believe would subject to the same limitations if aiohttp.ClientSession were substituted in place of DRCA for any of the example scenarios above.

skrech commented 2 years ago

Ah, I see. I've just tried and aiohttp ClientSession has the same "problem". I guess that's intended. Then, all is settled. :)