Closed by kallewesterling 11 months ago
A quick timing comparison of the two functions below shows no improvement: the pandas-based bins_new is roughly 150 times slower than the existing bins.
    import pandas as pd

    def bins_new(max, min, numBins):
        return [[interval.left, interval.right] for interval in pd.cut([max, min], numBins).categories]
v.
    def bins(max, min, numBins):
        bin_ranges = []
        increment = (max - min) / float(numBins)
        for i in range(numBins - 1, -1, -1):
            a = round(max - (increment * i), 20)
            b = round(max - (increment * (i + 1)), 20)
            bin_ranges.append([b, a])
        return bin_ranges
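To make the return format of the loop-based bins concrete, here is a minimal sketch that copies the function from above and calls it once. The arguments 10, 0 and 5 are illustrative values chosen for this example, not taken from the issue.

```python
def bins(max, min, numBins):
    # Copied from the issue; the parameter names shadow the built-ins max/min.
    bin_ranges = []
    increment = (max - min) / float(numBins)
    # Walk from the lowest bin to the highest, so the result is ascending.
    for i in range(numBins - 1, -1, -1):
        a = round(max - (increment * i), 20)        # upper edge of this bin
        b = round(max - (increment * (i + 1)), 20)  # lower edge of this bin
        bin_ranges.append([b, a])
    return bin_ranges


# Illustrative call: five equal-width bins covering 0 to 10.
result = bins(10, 0, 5)
print(result)
# [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0], [6.0, 8.0], [8.0, 10.0]]
```

Each inner list is a [lower, upper] pair, and neighbouring bins share an edge.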
bins_new: 311 µs ± 4.65 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
bins: 2.12 µs ± 6.79 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
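The figures above look like IPython %timeit output. For anyone who wants to repeat the comparison outside a notebook, the following is a rough sketch using the standard-library timeit module. The helper time_call and the arguments 10, 0 and 5 are assumptions made for this example, and the numbers it prints will vary by machine. The pandas import is guarded so the script still times bins when pandas is not installed.

```python
import timeit

try:
    import pandas as pd
except ImportError:  # pandas is optional in this sketch
    pd = None


def bins(max, min, numBins):
    # Loop-based version, copied from the issue.
    bin_ranges = []
    increment = (max - min) / float(numBins)
    for i in range(numBins - 1, -1, -1):
        a = round(max - (increment * i), 20)
        b = round(max - (increment * (i + 1)), 20)
        bin_ranges.append([b, a])
    return bin_ranges


def bins_new(max, min, numBins):
    # pandas-based version, copied from the issue.
    return [[interval.left, interval.right]
            for interval in pd.cut([max, min], numBins).categories]


def time_call(func, *args, number=1000):
    """Hypothetical helper: mean seconds per call over `number` calls."""
    total = timeit.timeit(lambda: func(*args), number=number)
    return total / number


# Only time bins_new when pandas could be imported.
candidates = [bins]
if pd is not None:
    candidates.append(bins_new)

timings = {f.__name__: time_call(f, 10, 0, 5) for f in candidates}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.2f} µs per call")
```

Speed is not the only difference to check before swapping the functions. The pandas documentation states that pandas.cut with an integer number of bins extends the range of the data by 0.1% so that the minimum and maximum are included, so the outermost edges it returns may not match those of bins exactly.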
To optimise the code, would it be possible to replace the package's bins function with pandas.qcut and pandas.cut?
Resources