⚠️ CI failed ⚠️ - Big batch of regressions

github-actions[bot] commented 7 months ago

milesgranger commented 6 months ago

...just a bunch. Here's a snapshot of the state and the dashboard, for archival purposes for when the CI run expires:

(runtime, test, measurement) [unit]	category	type	mean	last	last-1	last-2	threshold
('coiled-upstream-py3.9', 'test_csv_basic', 'peak_memory') [GiB]	benchmarks	peak_memory	15.1235	17.4097	17.2679	17.3518	16.746
('coiled-upstream-py3.9', 'test_exploratory_analysis[client0]', 'peak_memory') [GiB]	workflows	peak_memory	114.711	128.141	128.076	127.382	118.627
('coiled-upstream-py3.9', 'test_from_csv_to_parquet[client0]', 'peak_memory') [GiB]	workflows	peak_memory	12.8298	14.2326	14.449	14.7537	13.8298
('coiled-upstream-py3.9', 'test_join_big_small[1-p2p]', 'duration') [s]	benchmarks	duration	24.7701	33.1218	43.6168	36.8432	27.722
('coiled-upstream-py3.9', 'test_join_big_small[1-tasks]', 'duration') [s]	benchmarks	duration	25.0883	30.6882	43.8817	38.3577	29.1201
('coiled-upstream-py3.9', 'test_preprocess', 'average_memory') [GiB]	benchmarks	average_memory	14.5801	16.4407	16.4278	17.4479	15.6858
('coiled-upstream-py3.9', 'test_preprocess', 'peak_memory') [GiB]	benchmarks	peak_memory	25.0429	35.3317	34.3489	35.1125	27.1709
('coiled-upstream-py3.9', 'test_q6[5 GB (parquet)-p2p]', 'duration') [s]	benchmarks	duration	9.72919	24.583	16.454	16.7745	11.9159
('coiled-upstream-py3.9', 'test_q6[5 GB (parquet)-p2p]', 'average_memory') [GiB]	benchmarks	average_memory	4.41377	7.58491	7.91839	8.06129	5.41377
('coiled-upstream-py3.9', 'test_q6[5 GB (parquet)-tasks]', 'duration') [s]	benchmarks	duration	7.2659	22.5871	19.3568	17.9487	8.74477
('coiled-upstream-py3.9', 'test_q6[5 GB (parquet)-tasks]', 'average_memory') [GiB]	benchmarks	average_memory	3.9634	7.86515	7.80749	8.06167	4.9634
('coiled-upstream-py3.9', 'test_q6[5 GB (parquet)-tasks]', 'peak_memory') [GiB]	benchmarks	peak_memory	7.01397	13.3225	14.9533	13.5694	8.88869
('coiled-upstream-py3.9', 'test_q8[5 GB (parquet)-p2p]', 'average_memory') [GiB]	benchmarks	average_memory	5.05994	11.4246	11.2664	11.3926	6.05994
('coiled-upstream-py3.9', 'test_q8[5 GB (parquet)-p2p]', 'peak_memory') [GiB]	benchmarks	peak_memory	5.9492	13.9234	13.9942	14.0321	6.9492
('coiled-upstream-py3.9', 'test_q8[5 GB (parquet)-tasks]', 'average_memory') [GiB]	benchmarks	average_memory	6.11063	11.1849	11.3102	11.3693	7.11063
('coiled-upstream-py3.9', 'test_q8[5 GB (parquet)-tasks]', 'peak_memory') [GiB]	benchmarks	peak_memory	7.76171	13.8378	13.9481	13.9317	9.07502
('coiled-upstream-py3.9', 'test_q9[5 GB (parquet)-p2p]', 'average_memory') [GiB]	benchmarks	average_memory	3.74961	5.2041	4.98061	5.17015	4.74961
('coiled-upstream-py3.9', 'test_q9[5 GB (parquet)-p2p]', 'peak_memory') [GiB]	benchmarks	peak_memory	5.83373	10.0524	10.8111	10.0099	6.83373
('coiled-upstream-py3.9', 'test_repeated_merge_spill', 'average_memory') [GiB]	stability	average_memory	9.08932	10.8184	10.8069	10.6927	10.0893
('coiled-upstream-py3.9', 'test_repeated_merge_spill', 'peak_memory') [GiB]	stability	peak_memory	15.1605	17.3027	17.0161	17.0043	16.2294
('coiled-upstream-py3.9', 'test_set_index_on_uber_lyft[p2p-client0]', 'average_memory') [GiB]	benchmarks	average_memory	77.2022	108.956	113.719	114.076	92.8309
('coiled-upstream-py3.9', 'test_set_index_on_uber_lyft[p2p-client0]', 'peak_memory') [GiB]	benchmarks	peak_memory	207.677	226.59	231.164	234.406	216.766
('coiled-upstream-py3.9', 'test_write_wide_data', 'average_memory') [GiB]	benchmarks	average_memory	33.0023	53.3788	52.8192	52.6435	34.4263
('coiled-upstream-py3.11', 'test_repeated_merge_spill', 'average_memory') [GiB]	stability	average_memory	9.45707	10.9045	10.8834	10.7944	10.4571
('coiled-upstream-py3.11', 'test_repeated_merge_spill', 'peak_memory') [GiB]	stability	peak_memory	15.498	17.4152	17.1928	17.2525	16.805

static-dashboard.zip
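
For anyone reading the snapshot above later: in every row, the three most recent runs (`last`, `last-1`, `last-2`) all sit above `threshold`. The thread does not say how the threshold is derived or what exact rule the CI applies, so the following is only a minimal sketch that assumes a rule of "all three most recent runs exceed the threshold". The `Row` class and the `is_regression` and `relative_increase` helpers are hypothetical names for illustration, not part of the benchmarks repo; the numbers are copied from two rows of the table.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One line of the snapshot table (hypothetical container)."""
    name: str
    mean: float
    last: float
    last_1: float
    last_2: float
    threshold: float


def is_regression(row: Row) -> bool:
    # Assumed rule (not confirmed in the thread): flag a measurement only
    # when all three most recent runs exceed the threshold.
    return all(v > row.threshold for v in (row.last, row.last_1, row.last_2))


def relative_increase(row: Row) -> float:
    # How far the latest run is above the historical mean, as a fraction.
    return (row.last - row.mean) / row.mean


# Values copied from the table above.
rows = [
    Row("test_csv_basic peak_memory [GiB]",
        15.1235, 17.4097, 17.2679, 17.3518, 16.746),
    Row("test_q8[5 GB (parquet)-p2p] peak_memory [GiB]",
        5.9492, 13.9234, 13.9942, 14.0321, 6.9492),
]

for row in rows:
    print(f"{row.name}: regression={is_regression(row)}, "
          f"increase={relative_increase(row):.1%}")
```

Under this assumed rule both rows are flagged, and the relative increase gives a quick sense of which regressions are largest.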

phofl commented 6 months ago

An increase in memory usage is not super surprising, since we squash partitions more aggressively with dask-expr.

milesgranger commented 6 months ago

Okay ya, seems from our discussions that all these are expected then. Thanks!

coiled / benchmarks

⚠️ CI failed ⚠️ - Big batch of regressions #1429