Closed danlipsa closed 7 years ago
I tried vcs 2.8, vcs nightly + VTK OpenGL backend (default), and vcs nightly + VTK OpenGL2 backend. This is what I got:
vtk-cdat 7.1.0.2.10.2017.07.04.08.42.1c03d1d03b1c12d99dd090ba4c5ca078e2d9612f py27_mesalib_0 [mesalib] local mesalib 17.0.3 1 conda-forge Avg time: 0.813000631332
vtk-cdat 7.1.0.2.10.2017.06.21.11.56.1c03d1d03b1c12d99dd090ba4c5ca078e2d9612f py27_mesalib_0 [mesalib] uvcdat/label/nightly mesalib 17.0.3 1 conda-forge Avg time: 1.46560789108
vtk-cdat-nox 7.1.0.2.8 mesalib 7.6.1 2 uvcdat/label/2.8 Avg time: 0.275944304466
@doutriaux1 @aashish24 One problem is that we started to compile VTK in debug mode (2.8 was compiled in Release mode). We have to change this. However, there still seems to be a significant slowdown remaining, at least for isofill.
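As a side note, the build type is controlled by the standard CMake variable CMAKE_BUILD_TYPE; switching the nightly VTK package back to an optimized build would look roughly like this (the source path is hypothetical):

```shell
# Configure VTK with optimizations on (Release enables -O2/-O3 and
# disables assertions); CMAKE_BUILD_TYPE=Debug is what slows things down.
cmake -DCMAKE_BUILD_TYPE=Release /path/to/VTK-source
make -j8
```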
vtk-cdat 7.1.0.2.10.2017.07.04.09.37.1c03d1d03b1c12d99dd090ba4c5ca078e2d9612f py27_mesalib_0 [mesalib] local mesalib 17.0.3 1 conda-forge Avg time: 1.39770002365
@doutriaux1 @aashish24 The slowdown issue we see seems to be caused by
Here are the 4 runs with VCS 2.8, VCS master, VTK 2.8, and VTK master. They all have similar times. To identify VCS, look for "Start plot"; this is printed from VTKPlots.py. To identify VTK, look for "XXX Render"; this is printed from vtkRenderWindow.cxx.
Master Render: 0.0259284
Master Render: 0.0218634
Start plot 2.8 gtype: isofill
Start plot 2.8 gtype: text Render time: 0.000003
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000002
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000002
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000000
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: fillarea Render time: 0.000000
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Create pipeline: 0.215682
Master Render: 0.998379
Render time: 0.998429
1.21853399277
Master Render: 0.992069
Master Render: 0.917881
Master Render: 0.0189741

2.8 Render: 0.0247734
2.8 Render: 0.0229599
Start plot 2.8 gtype: isofill
Start plot 2.8 gtype: text Render time: 0.000007
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: text Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000000
Start plot 2.8 gtype: text Render time: 0.000000
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000001
Start plot 2.8 gtype: fillarea Render time: 0.000001
Start plot 2.8 gtype: line Render time: 0.000000
Start plot 2.8 gtype: text Render time: 0.000000
Create pipeline: 0.206618
2.8 Render: 0.959998
Render time: 0.960048
1.17007303238
2.8 Render: 0.921597
2.8 Render: 0.903757
2.8 Render: 0.0189879

2.8 Render: 0.0263435
2.8 Render: 0.0260806
Start plot master gtype: isofill
Start plot master gtype: text Render time: 0.000003
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000000
Start plot master gtype: text Render time: 0.000000
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: fillarea Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: text Render time: 0.000000
Create pipeline: 0.256525
2.8 Render: 0.973659
Render time: 0.973718
1.23442101479
2.8 Render: 0.961294
2.8 Render: 0.946593
2.8 Render: 0.0201555

Master Render: 0.0249178
Master Render: 0.0248235
Start plot master gtype: isofill
Start plot master gtype: text Render time: 0.000003
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000002
Start plot master gtype: text Render time: 0.000003
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000000
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: fillarea Render time: 0.000001
Start plot master gtype: line Render time: 0.000000
Start plot master gtype: text Render time: 0.000001
Create pipeline: 0.254846
Master Render: 0.93995
Render time: 0.939997
1.19984102249
Master Render: 0.917993
Master Render: 0.919241
Master Render: 0.0201933
This is what is used by VTK: "VMware, Inc. 3.0 Mesa 17.0.3 Gallium 0.4 on softpipe". I think we want llvmpipe instead of softpipe: https://www.mesa3d.org/osmesa.html
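As a side note: with Gallium-based mesa builds, the software rasterizer can usually be chosen at run time through mesa's GALLIUM_DRIVER environment variable, provided it is set before the first render window is created. A minimal sketch (the variable name is mesa's convention, not a VTK or vcs API):

```python
import os

# Ask mesa's Gallium state tracker for llvmpipe instead of the much
# slower softpipe fallback. This must happen before the rendering
# library initializes, i.e. before vcs.init() creates a render window.
os.environ["GALLIUM_DRIVER"] = "llvmpipe"

print(os.environ["GALLIUM_DRIVER"])  # -> llvmpipe
```

Whether llvmpipe is actually available still depends on how mesa was compiled; if it is not compiled in, mesa falls back to another driver, which is why the renderer string reported at startup is worth checking.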
Here are the results with mesa compiled with llvmpipe:
VMware, Inc. 3.0 Mesa 17.0.3 Gallium 0.4 on llvmpipe (LLVM 3.3, 256 bits)
Master Render: 0.0087492
VMware, Inc. 3.0 Mesa 17.0.3 Gallium 0.4 on llvmpipe (LLVM 3.3, 256 bits)
Master Render: 0.00377673
Start plot master gtype: isofill
Start plot master gtype: text Render time: 0.000005
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000002
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: text Render time: 0.000000
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: line Render time: 0.000000
Start plot master gtype: line Render time: 0.000000
Start plot master gtype: fillarea Render time: 0.000001
Start plot master gtype: line Render time: 0.000001
Start plot master gtype: text Render time: 0.000001
Create pipeline: 0.259305
Master Render: 0.271823
Render time: 0.271868
0.535116910934
Master Render: 0.0628871
Master Render: 0.0485962
Master Render: 0.00186366
Looking great to me!
@aashish24 Note that we are not yet back to where the old mesa was: 0.275944304466 vs 0.535116910934
@aashish24 @doutriaux1 Removing the printouts and running the test that computes the average, I get: Avg time: 0.287473192215, which is approximately what we got for 2.8.
This is great. A difference of 2%, I would think, is within the noise. @doutriaux1 what do you think?
@danlipsa I will try to update the official mesa build on conda-forge and see the difference it makes.
copy/pasting script test for easier re-use:
import vcs
import time
import numpy
import cdms2

f = cdms2.open(vcs.sample_data + "/clt.nc")
s = f("clt")
x = vcs.init()
x.open()
x.geometry(1600, 800)

def isSlowingDown(gmtype):
    n = 100
    print "Testing slow down for:", gmtype
    maxpct = 100.
    fastest = 100000.
    avg = 0.
    times = []
    for i in range(n):
        start = time.time()
        gm = vcs.creategraphicsmethod(gmtype)
        # gm.datawc_x1 = 0.
        # gm.datawc_x2 = 2.
        # gm.datawc_y1 = 0.
        # gm.datawc_y2 = 2.
        x.plot(gm, s)
        end = time.time()
        elapsed = end - start
        if elapsed < fastest:
            fastest = elapsed
        pct = (elapsed / fastest * 100.)
        times.append(elapsed)
        if i > 5 and pct > maxpct:  # skip the first 5 times to make sure system gets in groovy mode
            maxpct = pct
        avg += elapsed
        x.clear()
        x.removeobject(gm)
        print elapsed
    a, b = numpy.polyfit(numpy.arange(n - 5), numpy.array(times[5:]) / fastest, 1)
    print "\tMax percentage", maxpct
    print "\tFastest time:", fastest
    print "\tAvg time:", avg / n
    print "\tFit coeff:", a, b
    if abs(a) > 2.E-3:
        return True
    return False

isSlowingDown("isofill")
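The script above flags a slowdown by fitting a line to the per-plot times normalized by the fastest run; a fit slope above the threshold means successive plots keep getting slower. The same idea on synthetic timings (a standalone sketch, not the vcs test itself; function and variable names are mine):

```python
import numpy

def is_slowing_down(times, skip=5, threshold=2.e-3):
    # Normalize by the fastest observed time and fit a line; a slope |a|
    # above the threshold indicates a per-iteration drift in plot time.
    times = numpy.asarray(times, dtype=float)
    fastest = times.min()
    a, b = numpy.polyfit(numpy.arange(len(times) - skip),
                         times[skip:] / fastest, 1)
    return abs(a) > threshold

# Stable timings: small alternating jitter, no drift.
stable = [0.20 + 0.001 * (i % 2) for i in range(100)]
# Drifting timings: each plot 1 ms slower than the last.
drifting = [0.20 + 0.001 * i for i in range(100)]

print(is_slowing_down(stable))    # -> False
print(is_slowing_down(drifting))  # -> True
```

Normalizing by the fastest time makes the threshold dimensionless, so the same cutoff works whether a plot takes 0.2 s or 1.2 s.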
@doutriaux1 what do you mean by groovy mode? You wait till you think that there is a good chance that things are cached?
@jypeter yes, so that the file system has cached everything, memory is allocated by the process, etc.
@danlipsa I uploaded nightly vtk built in Release mode rather than Debug.

With a pure 2.10 env:
0.242112874985
0.23365187645
Max percentage 141.900919912
Fastest time: 0.23365187645
Avg time: 0.252162251472
Fit coeff: -0.000105017047031 1.07336193973

With the exact same env but using the Release vtk:
0.210036039352
Max percentage 110.823273177
Fastest time: 0.19441318512
Avg time: 0.209133522511
Fit coeff: -0.000138462622074 1.07028311157

With pure 2.8 I get:
0.204612970352
Max percentage 167.436051597
Fastest time: 0.184792995453
Avg time: 0.227097587585
Fit coeff: -0.00171937779642 1.29636997825

So the 2.10 avg time is actually faster!
Interestingly, with mesalib 17.1.4 (or 17.0.3) and vtk-cdat in Release I get:
1.19244384766
Max percentage 115.738468471
Fastest time: 1.15388703346
Avg time: 1.18699487686
Fit coeff: -4.48654672987e-05 1.0289543097
Are we expecting mesa to be much slower?
@williams13 maybe the mesa slowness is what makes vcdat so slow.
Mesa (rendering in software) is much slower than rendering in hardware.
ok so it makes sense. Thanks @danlipsa
@zshaheen for the amwg_diags then do not use mesa
@doutriaux1 If we don't use mesa, then will it work on a headless machine? The non-nox build used to give an OpenGL error.
Just to throw a cat amongst the pigeons: is Mesa the only way?
https://www.phoronix.com/scan.php?page=news_item&px=Intel-OpenSWR-Mesa
@danlipsa have you heard or tried this, or whatever it may have morphed into by now?
@durack1 @doutriaux1 @aashish24 I think OpenSWR works only with on-screen mesa. We are already using that for ParaView. https://blog.kitware.com/messing-with-mesa-for-paraview-5-0vtk-7-0/
An (much faster) option for off-screen mesa is EGL. This does not need X, but it does need a recent (three years old) NVidia graphics card and recent Linux. It works fine on my Ubuntu 16.04. https://blog.kitware.com/off-screen-rendering-through-the-native-platform-interface-egl/
thanks @danlipsa.
@danlipsa also it appears that it as already been ported back into mesa: https://github.com/OpenSWR/openswr-mesa and then: https://cgit.freedesktop.org/mesa/mesa/
17.1.4, which we are using (with swrast), has been out 10 days. @danlipsa should I revive my efforts to use llvmpipe? Looking at it, it appears using --disable-llvm-shared-libs might help us.
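A mesa 17.x autoconf invocation along those lines might look like the sketch below; the individual flag names come from the mesa 17.x configure script, but the exact combination is an assumption to check against the conda recipe:

```shell
# Hypothetical OSMesa build with the llvmpipe (Gallium + LLVM) rasterizer.
# --disable-llvm-shared-libs links LLVM statically, which is what the
# comment above suggests trying for a relocatable conda package.
./configure \
    --enable-gallium-osmesa \
    --enable-gallium-llvm \
    --with-gallium-drivers=swrast \
    --disable-llvm-shared-libs
make -j8 && make install
```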
@danlipsa I tried a local build. Regular mesa gives us:
Max percentage 117.257370576
Fastest time: 1.19739794731
Avg time: 1.2479244256
Fit coeff: -0.000892191236294 1.08191431583
but llvm-enabled gives us:
Max percentage 127.831469762
Fastest time: 0.22727894783
Avg time: 0.240919532776
Fit coeff: 0.000148631887712 1.03445848746
Which is very close to a non-mesa build! I'll try again a bit this morning; I'm hoping disabling shared llvm libs will help.
@doutriaux1 @sankhesh Yes, we should use llvmpipe. What is the status here? Where are you blocked?
Builds pass and conda packages are uploaded, but so far it segfaults when installed back on my mac. Trying disable-shared-llvm now to see if it helps.
Good news: these numbers are from a fresh new mesa build produced by conda-forge.
Also, I had a webex running at the same time, so clean numbers might be faster:
Max percentage 139.571263624
Fastest time: 0.280382156372
Avg time: 0.312298326492
Fit coeff: -0.000609585616416 1.12265862954
@danlipsa thx for the mesa-llvm this is HUGE!!
Nice! Thanks also to @zshaheen, who filed the bug report, to @doutriaux1, who created the performance test and made everything work on all platforms including circleci and travis, and to @sankhesh, who created the original mesa conda recipe and made mesa work with VTK on mac.
@danlipsa @doutriaux1 @zshaheen @sankhesh this is great, I wonder if it might also miraculously sort out our ongoing VNC issues?? It'll be great to give this a try..
@durack1 there was never an issue with mesa, since it is what we recommend over X11 forwarding.
From @zshaheen: In the ACME Workflow telco this Monday, June 26, 2017, we discussed issues regarding VCS 2.8 vs 2.10. In short, when using 2.10, isofills and polar projections are fairly slow. Sadly, 1D plots are also slower on 2.10 than 2.8.
Code: here's the code that we use to create the plots: https://github.com/zshaheen/acme_diags/tree/master/acme_diags/plot/vcs. set3.py contains the 1D plots, set4.py and set5.py the isofills, and set7.py the polar projections. Look for the plot() function.
Envs Also here are the envs used: 2.8 env: vcs-nox 2.8.2017.04.04.16.48.c22e1d924a0145ef526b56a8ca123484adfd660a 0 uvcdat/label/nightly vcsaddons-nox 2.8 py27_2 uvcdat vtk-cdat-nox 7.1.0.2.8.2017.03.23.16.48.73fb094d17ccdd594a6abf9ff8732149c88dc2e6 py27_0 uvcdat/label/nightly mesalib 7.6.1 0 conda-forge dv3d-nox 2.8 py27_2 uvcdat
2.10 env: vcs 2.10 2 uvcdat vtk-cdat 7.1.0.2.10 py27_mesalib_0 [mesalib] uvcdat mesalib 17.0.3 1 conda-forge dv3d 2.10 py27_0 uvcdat
Results Now we ran 10 of each plots and timed them. Below are the results: 1D plots (set3.py) is 2.256 times slower on 2.10 than 2.8. This is even with the fix. This isn't so great. Isofills (set4.py and set5.py) are 1.247 and 4.803 times slower respectively on 2.10 than 2.8. This also isn't so great. Polar projections (set7.py) is 3.236 times slower on 2.10 than 2.8. Also, this isn't so great.
Any ideas? We really appreciate your help on this.
From @doutriaux1
Guys,
I got something.
Using the attached test and the yml env at: https://github.com/UV-CDAT/uvcdat/wiki/install
NOTE: this is done w/o mesalib, and ALL 2.10 vtk-cdat builds do NOT render at the correct geometry (bug for you, Aashish and Dan).
isofill with 2.8: 0.18433380127 Max percentage 119.696913801 Fastest time: 0.180957794189 Avg time: 0.194927251339 Fit coeff: -0.00128569424834 1.12106129005
isofill 2.8 and vcs nightly Max percentage 136.229411851 Fastest time: 0.212167024612 Avg time: 0.241551113129 Fit coeff: -0.000293949286773 1.13474387474
isofill with 2.10 Max percentage 162.54471857 Fastest time: 0.217200994492 Avg time: 0.243273427486 Fit coeff: 0.000875383643704 1.06942886163
isofill 2.10 but vcs nightly and vtk-cdat from 2.10 Max percentage 109.646274636 Fastest time: 0.244299173355 Avg time: 0.255787761211 Fit coeff: -0.00021131210507 1.0437535676
isofill 2.10 vcs and vtk-cdat from nightly Max percentage 155.385235308 Fastest time: 0.256790161133 Avg time: 0.26979429245 Fit coeff: -9.85009650669e-05 1.04153697489
speed_test.py.txt