Closed jreback closed 11 years ago
I think that the multi-level reindexing ones are possibly related to the take
changes (#2819)
(which help a great deal in the most common cases), I didn't investigate much, that's just the top
on profiling (in both old and new)
bisected?
vbench was meant exactly for that, but it hasn't been updated in a while. Too bad.
can we run it on each commit and get the graph?
That's it's reason for being. I just added test_perf a few months ago for the devs.
agreed....how does it update though?
I think wes goes into the basement and presses a button.
we're getting too chatty on the issue tracker, need an irc channel. or campfire.
FWIW, I lurk and learn, but I'm sure others hate the noise just as much as I find it helpful. A dev list might be preferable to IRC unless you can commit resources to a bot to archive everything in a publicly searchable way. I don't know how public campfire is.
We're definitely pushing the boundaries of S/N, that's not cool. Email is inconvenient for realtime, so not really a solution for many things. I believe freenode has web based logs for all channels, though I'm not sure I consider that a plus.
fyi...i think i found this regression (my issue!)...issue a fix in a bit
I could set up a hipchat for pandas devs if you like, but that wouldn't be public.
@wesm , punting on the chat quesion.
would it be possible to get a daily vbench going? we're discovering perf regression by mistake...
hmm, well, i have kept all my old vb_suite logs (including several for testing #2819, #2867 against their immediate base commits) and I can't find any entries from where reindex_frame_level_*
were affected negatively (in fact, a lot of results had the test improved by 20+%)...it's always possible there is some complex interaction between multiple lines on work that were developed independently, though...also I'm surprised frame_reindex_axis1
isn't on the positive side since I have consistent >25% improvements in that one.
pretty easy to break perf; thxs for checking
one thing to note is that i'm on 32-bit, that might make a difference... also, here's my last vb_suite logs for the two commits #2819: http://pastebin.com/xEHn0Mhz, #2867: http://pastebin.com/sFccHx1K
@y-p is there a way to 'save' a vb_suite db for a particular commit, so that subsequent runs of test_perf can just refer to that rather than rebuilding it? (and if not, maybe make an option in test_perf?)
or maybe just save the build (rather than the db), but that would still save the get/build time
vb_suite has a persistent db, test_perf purges the db between runs. I'll add an option when I get a chance.
@stephenwlin
I did a git bisect....here's what broke reindex_frame_level_align
and reindex
prob just an odd special case
10f65e02536b4dda0ab7761cb1d10e274b208cfb is the first bad commit
commit 10f65e02536b4dda0ab7761cb1d10e274b208cfb
Author: Stephen Lin <stephenwlin@gmail.com>
Date: Wed Feb 13 03:33:35 2013 -0500
ENH: optimize Cython take functions with memmove where possible
:040000 040000 c9e8c83951445333494415e5169abf7a7bad951c 45efa72631f1530adf6858155f3aa2bca0204907 M pandas
[cow-jreback-~/pandas] git bisect log
git bisect start
# good: [05a737dbb9f857e25a211739d90c42f3d9428b8e] TST: skip problematic xlrd test
git bisect good 05a737dbb9f857e25a211739d90c42f3d9428b8e
# bad: [605f87edda24b07f0f0a1b2a2462a4e121629494] DOC: update RELEASE.rst
git bisect bad 605f87edda24b07f0f0a1b2a2462a4e121629494
# good: [9faec8293aa7fda64b85f3aa05bff5600c7b28d1] BUG: don't show [] for empty series. close #2775
git bisect good 9faec8293aa7fda64b85f3aa05bff5600c7b28d1
# good: [9faa8defe954f89a69dfc9b48d7ec1cfcb9936d8] Merge branch 'dtypes_bug' of https://github.com/jreback/pandas into jreback-dtypes_bug
git bisect good 9faa8defe954f89a69dfc9b48d7ec1cfcb9936d8
# bad: [49e123f95d95b70d10f5e5852d2682741799aed1] RLS: fixup RELEASE.rst links
git bisect bad 49e123f95d95b70d10f5e5852d2682741799aed1
# bad: [cb5e7dd2f0c96108e84e930b86bc522f8c613cfb] Merge pull request #2871 from jreback/dtypes1
git bisect bad cb5e7dd2f0c96108e84e930b86bc522f8c613cfb
# bad: [f9dfa811aa1a925e43289491558d95e1e2a2e000] RLS: add release notes for 'opt-take-2' and 'fix-maybe-convert-objects'
git bisect bad f9dfa811aa1a925e43289491558d95e1e2a2e000
# good: [759b0db9436999877791e25bf524c9c40202bc6d] ENH: Consolidate and further improve performance of take functions
git bisect good 759b0db9436999877791e25bf524c9c40202bc6d
# bad: [10f65e02536b4dda0ab7761cb1d10e274b208cfb] ENH: optimize Cython take functions with memmove where possible
git bisect bad 10f65e02536b4dda0ab7761cb1d10e274b208cfb
@y-p thxs
I remember that thread. "is memmove slower then memcpy".
hmm..well, that's weird, i also have a vb_suite diff that's memmove
vs memcpy
and everything there is +-10% (with most with +-5%, basically normally distributed around 1.0) ...maybe there's a 32-bit vs 64-bit issue...i will take a look though...i'll run a vb_suite with just that commit on my machine
@jreback can you give me info about the machine you're using to test this (cython version, gcc version, etc.?) this commit literally does nothing but change an explicit loop into a single memmove
call, so it's all up to what cython and the compiler do with that: any reasonable compiler should inline memmove
to be just as fast or faster than a loop but I suppose there's a chance it might not...I suspect this test is creating some degenerate condition where memmove
is being called on a short array over and over again and the compiler is not inlining correctly (i.e. it happens to exactly be the transpose of the ideal shape for the the optimization, so that the overhead of the call actually exceeds that of the original loop); if that's the case I can add an explicit check to avoid that for short arrays
As you posted elsewhere, the x64 case is probably using SIMD / compiler optimizations
Linux goat 2.6.32-5-amd64 #1 SMP Sun Sep 23 10:07:46 UTC 2012 x86_64 GNU/Linux
Cython is 0.17.2
[goat-jreback-~/pandas] gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8)
@stephenwlin did a great writeup for those who are interested
http://mail.python.org/pipermail/pandas-dev/2013-March/000008.html
going to close this issue for new, created #3114 to mark that we need a method for creating various type of c-contig/f-contig test cases in an easy way
@stephenwlin , that was a formidable piece.
Found out it had to do with the fact that the width of the array to be copied is not compile-time constant: I posted a question on StackOverflow about this http://stackoverflow.com/questions/15554565/forcing-gcc-to-perform-loop-unswitching-of-memcpy-runtime-size-checks
@stephenwlin you are deep into the c-code! (that's a good thing!)
@jreback, can you vbench #3130 ? I went the magic number route (for now), 128 bytes ended up slowing another test (didn't look into the particulars of it, though) so I'm going with 64 bytes for now. EDIT apparently I got some things mixed up here, #3130 as tested and submitted has a 256 byte limit and it doesn't have any downside, so I'm going with that
testing now....i'll post on that thread
@jreback actually, i got my commits mixed up, turns out this was 256 bytes (it's the largest limit that doesn't have negative impacts on other tests, which is what we want)...I amend the comment and the issue but the commit was good (I verified the hash from my vb_suite log, just in case)
f6ac8c091dfb2bbbbb02a34c13022c5427b290eb to 05a737dbb9f857e25a211739d90c42f3d9428b8e
Bad news:
Good news:
???
Fixed: