Hello,
I have been using Dependency Injector as a basis for my Python projects for some time now, and I recently ran into an unexpected issue.
I am currently building data analysis software meant to analyse a large data set (~3 GB of data for 20 million rows), and the process takes an unexpectedly long time to run and consumes more resources than expected.
As a baseline, just fetching the data takes ~3 minutes without injecting dependencies, while it is still not done after 20 minutes with injection.
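A minimal sketch of how such a comparison could be timed is below. It assumes two hypothetical functions, `load_direct` and `load_injected`, standing in for the real loaders; they do not touch a database or Dependency Injector here, so only the timing pattern is illustrated, not the actual slowdown.

```python
import time
from typing import Callable, Dict


def time_call(label: str, func: Callable[[], object]) -> float:
    """Run func once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    func()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return elapsed


# Hypothetical stand-in for the loader called without dependency injection.
def load_direct() -> list:
    return [i for i in range(100_000)]


# Hypothetical stand-in for the same loader resolved through the container.
def load_injected() -> list:
    return [i for i in range(100_000)]


# Time both paths the same way so the two numbers are comparable.
timings: Dict[str, float] = {
    "direct": time_call("without DI", load_direct),
    "injected": time_call("with DI", load_injected),
}
```

Swapping the placeholders for the real loaders would show whether the extra time is spent in the data fetch itself or elsewhere in the application.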
I mainly use singleton containers and set up the base project using the `wire` system.
The tests ran on Python 3.12 in a Microsoft dev container (`mcr.microsoft.com/vscode/devcontainers/python:1-3.12`) and on a Windows server running Python 3.12 (I don't have the exact version, but I can ask for it if needed).
I load the data using SQLAlchemy with the pyodbc driver and the pandas `read_sql` methods.
I cannot provide the dataset I'm using, as it is private to the company I work for.
The application is wrapped in a Typer client application using async methods (though parallelization is not done correctly yet, as I'm new to it :innocent: ).
Any feedback or ideas are welcome, as I don't really see why using DI would impact the code so much in this case.
Thanks for your work and time. <3