Improve storage manager and merge it with `creation_management` module

vdusek commented 4 months ago

The current Crawlee / StorageClientManager is more or less just copied from the Python SDK / StorageClientManager and is extremely simple. Its primary role is to maintain and provide access to storage client instances based on specific input parameters.

The Crawlee TS / StorageManager is more complex and handles more: it also creates storage instances and caches them.

Currently, we have a helper module "creation_management" in storages/ which helps with it.

Let's move logic from storages/creation_management to StorageClientManager and improve the creation & caching process.

Functions get_or_create, find_or_create_client_by_id_or_name and create_*_from_directory should be refactored.
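To make the creation & caching idea concrete, here is a rough sketch of a get-or-create cache keyed by both id and name. It is not the existing crawlee code: `StorageCache`, `FakeStorage` and `make_storage` are hypothetical names used only for illustration, and the real storages would be created through a storage client rather than a local factory.

```python
from __future__ import annotations

import itertools
from dataclasses import dataclass


@dataclass
class FakeStorage:
    """Hypothetical stand-in for a real storage (dataset, key-value store, ...)."""

    id: str
    name: str | None


_id_counter = itertools.count(1)


def make_storage(id: str | None, name: str | None) -> FakeStorage:
    """Hypothetical factory; a real one would call the storage client."""
    return FakeStorage(id=id or f"generated-{next(_id_counter)}", name=name)


class StorageCache:
    """Caches storage instances so repeated lookups return the same object."""

    def __init__(self, factory) -> None:
        self._factory = factory
        self._by_id: dict[str, FakeStorage] = {}
        self._by_name: dict[str, FakeStorage] = {}

    def get_or_create(self, *, id: str | None = None, name: str | None = None) -> FakeStorage:
        if id is not None and name is not None:
            raise ValueError("Pass either id or name, not both.")

        # Return a cached instance if one matches the given key.
        if id is not None and id in self._by_id:
            return self._by_id[id]
        if name is not None and name in self._by_name:
            return self._by_name[name]

        # Otherwise create the storage and index it under both keys.
        storage = self._factory(id, name)
        self._by_id[storage.id] = storage
        if storage.name is not None:
            self._by_name[storage.name] = storage
        return storage
```

With this shape, opening a storage by name and later by its id gives back the same instance, which is the behaviour the refactored functions would need to preserve.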

janbuchar commented 4 months ago

I could also imagine putting the functionality into a module instead of a singleton class, so basically StorageManager -> creation_management, not vice versa.

janbuchar commented 4 months ago

This code should not check the implementation in use - it's a generic storage manager that should not be concerned with the concrete implementation.
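A sketch of what "not concerned with the concrete implementation" could look like: the manager code depends only on an interface and never uses `isinstance` on the client. All names here (`StorageClient`, `MemoryClient`, `RemoteLikeClient`, `open_dataset`) are made up for the example and are not the crawlee API.

```python
from typing import Protocol


class StorageClient(Protocol):
    """Hypothetical minimal interface every storage client would satisfy."""

    def create_dataset(self, name: str) -> dict: ...


class MemoryClient:
    """Hypothetical in-memory client."""

    def __init__(self) -> None:
        self._datasets: dict[str, dict] = {}

    def create_dataset(self, name: str) -> dict:
        # Reuse the existing record if the dataset was already created.
        return self._datasets.setdefault(name, {"name": name, "backend": "memory"})


class RemoteLikeClient:
    """Hypothetical client standing in for a remote API backend."""

    def create_dataset(self, name: str) -> dict:
        return {"name": name, "backend": "remote"}


def open_dataset(client: StorageClient, name: str) -> dict:
    # No isinstance checks: any object with create_dataset() works.
    return client.create_dataset(name)
```

Anything implementation-specific (such as purging local directories) would then live in the client itself instead of in the generic manager.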

apify / crawlee-python

Improve storage manager and merge it with `creation_management` module #147