Closed johnmartins closed 1 year ago
Another place which seems to be causing a lot of traffic:
populate_formula is run within a for loop, but the funciton itself contains two selector db-calls. Breaking these out into one or two calls that collect everything at once would significantly reduce traffic.
def get_all_formulas(db_connection: PooledMySQLConnection, project_id: int, vcs_id: int,
design_group_id: int) -> List[models.FormulaRowGet]:
logger.debug(f'Fetching all formulas with vcs_id={vcs_id}')
get_design_group(db_connection, project_id, design_group_id) # Check if design group exists and matches project
get_cvs_project(project_id)
get_vcs(project_id, vcs_id)
select_statement = MySQLStatementBuilder(db_connection)
res = select_statement.select(CVS_FORMULAS_TABLE, CVS_FORMULAS_COLUMNS) \
.inner_join('cvs_vcs_rows', 'vcs_row = cvs_vcs_rows.id') \
.where('vcs = %s and design_group = %s', [vcs_id, design_group_id]) \
.execute(fetch_type=FetchType.FETCH_ALL, dictionary=True)
if res is None:
raise exceptions.VCSNotFoundException
return [populate_formula(r) for r in res]
def populate_formula(db_result) -> models.FormulaRowGet:
return models.FormulaRowGet(
vcs_row_id=db_result['vcs_row'],
design_group_id=db_result['design_group'],
time=db_result['time'],
time_unit=db_result['time_unit'],
cost=db_result['cost'],
revenue=db_result['revenue'],
rate=db_result['rate'],
market_inputs=market_impl.get_all_formula_market_inputs(db_result['vcs_row']),
value_drivers=design_impl.get_all_formula_value_drivers(db_result['vcs_row'])
)
https://github.com/sed-group/sed-backend/blob/3c7f1acddaf5c49319204153702d5dcf2e74cb1f/sedbackend/apps/cvs/simulation/implementation.py#L147-L148
@obennet @EppChops the run_sim_monte_carlo function executes code that has (at least) triple-nested for-loops with DB calls inside of them. The function called in these lines of codes contains one additional hidden for-loop:
https://github.com/sed-group/sed-backend/blob/3c7f1acddaf5c49319204153702d5dcf2e74cb1f/sedbackend/apps/cvs/simulation/storage.py#L200-L213
This needs to be resolved, as it completely tanks DB performance. In general, avoid any DB calls inside for-loops. This is usually possible by building larger SQL statements. If you are aware of any other such places in the code, then please prioritize fixing it.