ets-labs / python-dependency-injector

Dependency injection framework for Python
https://python-dependency-injector.ets-labs.org/
BSD 3-Clause "New" or "Revised" License

Degraded performance on large data set manipulation application #800

Open titouanfreville opened 6 months ago

titouanfreville commented 6 months ago

Hello, I have used dependency injector as the basis for my Python projects for some time now, and I recently ran into an unexpected issue.

I am currently building data analysis software that analyses a large data set (~3 GB of data for 20 million rows). The process takes an unexpectedly long time to run and consumes more resources than expected.

As a baseline, just fetching the data takes ~3 minutes without dependency injection, while with it the same step has still not finished after 20 minutes.

I mainly use singleton containers and set up the base project with the wiring system.
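To narrow this down, the comparison above could be timed in isolation along these lines. This is only a sketch using the standard library: `fake_injection` is a hypothetical stand-in for whatever wrapper the wiring adds around a function, not the library's actual code, and `transform`, `process`, and the row count are placeholders rather than the real application. The idea is to check whether a small per-call cost becomes noticeable once the wrapped function runs inside a per-row loop.

```python
import functools
import time


def timed(func, *args, repeat=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def fake_injection(func):
    """Hypothetical stand-in for an injection wrapper (not the library's code)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Simulate resolving one dependency on every call.
        kwargs.setdefault("scale", 2)
        return func(*args, **kwargs)
    return wrapper


def transform(value, scale=2):
    """Placeholder for the real per-row work."""
    return value * scale


wrapped_transform = fake_injection(transform)


def process(rows, fn):
    """Apply fn to each row, as a per-row hot loop would."""
    return [fn(row) for row in rows]


rows = list(range(200_000))
plain_seconds = timed(process, rows, transform)
wrapped_seconds = timed(process, rows, wrapped_transform)
print(f"plain:   {plain_seconds:.4f}s")
print(f"wrapped: {wrapped_seconds:.4f}s")
```

If the two timings are close, the slowdown probably comes from somewhere other than per-call wrapper overhead, for example how the connection or engine is created and shared.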

The tests ran on Python 3.12 in the Microsoft dev container mcr.microsoft.com/vscode/devcontainers/python:1-3.12, and on a Windows server running Python 3.12 (I don't have the exact version, but I can ask for it if needed).

I load the data using SQLAlchemy with the pyodbc driver and the pandas read_sql methods.

I cannot provide the dataset I'm using, as it's private to the company I work for.

The application is wrapped in a Typer client application using async methods (though parallelization is not done correctly yet, as I'm new to it :innocent: )

Any feedback or ideas are welcome, as I don't really see why using DI would impact the code so much in this case.

Thanks for your work and time. <3