In my current test run, get_numpy seems to be the bottleneck. Reading is no longer a problem (neither time-wise nor memory-wise), but a call to get_pyerrors spends an awfully long time waiting for the internal get_numpy call (verified with a scalene profile) and becomes prohibitively expensive memory-wise beyond a few thousand configurations. As for the memory, I haven't managed to convince scalene to give me a clear answer yet, but given that the overwhelming majority of the time is spent in get_numpy and that the growth seems to be roughly linear, it is the clear candidate there as well.
More specifically, it is the sorting in the get_numpy function. Why are we sorting there? It was kind of convenient when I wrote the reader, but why aren't the index columns simply in the correct order already? We could guarantee that right from the start and get rid of the sorting there.
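
To make the idea concrete, here is a rough sketch of what "skip the sort when the order is already guaranteed" could look like. It is not the actual get_numpy code: the names `rows_are_sorted` and `to_dense`, and the layout (a 2D integer array of index columns plus a flat value array), are assumptions made purely for illustration.

```python
import numpy as np


def rows_are_sorted(index_cols: np.ndarray) -> bool:
    """Check whether the rows of a 2D integer array are in lexicographic order.

    Hypothetical helper: a single vectorised pass, much cheaper than sorting.
    """
    diff = index_cols[1:] - index_cols[:-1]
    # For each pair of neighbouring rows, find the first column that differs.
    first_nonzero = (diff != 0).argmax(axis=1)
    lead = diff[np.arange(len(diff)), first_nonzero]
    # Rows are ordered if that leading difference is never negative.
    return bool(np.all(lead >= 0))


def to_dense(index_cols: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Reshape flat values into a dense array, sorting only if required.

    Assumes every combination of index values occurs exactly once.
    """
    if not rows_are_sorted(index_cols):
        # np.lexsort treats the LAST key as primary, so reverse the columns.
        order = np.lexsort(index_cols.T[::-1])
        index_cols, values = index_cols[order], values[order]
    shape = tuple(len(np.unique(col)) for col in index_cols.T)
    return values.reshape(shape)


# Toy data: 3 configurations x 2 time slices, already in the correct order.
idx = np.array([[c, t] for c in range(3) for t in range(2)])
vals = np.arange(6.0)
dense = to_dense(idx, vals)  # takes the fast path, no sort

# The same data shuffled falls back to the sort and gives the same result.
perm = np.array([5, 0, 3, 1, 4, 2])
dense_shuffled = to_dense(idx[perm], vals[perm])
```

If the reader could guarantee the order right from the start, even the check would be unnecessary and the fallback branch could be dropped entirely.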