Closed GoogleCodeExporter closed 9 years ago
Yes, it would be great to have some AVX optimizations.
Stan Melax did some nice work on AVX optimized cloth:
See http://software.intel.com/en-us/articles/avx-cloth and
http://software.intel.com/en-us/articles/intel-graphics-developers-guides/
Perhaps it inspires someone to contribute?
Original comment by erwin.coumans
on 13 Jan 2011 at 9:56
Issue 404 has been merged into this issue.
Original comment by erwin.coumans
on 13 Jan 2011 at 9:56
The SSE version of
btSequentialImpulseConstraintSolver::resolveSingleConstraintRowGenericSIMD
could be improved with AVX in the following ways:
1. Replace set1 with broadcast (AVX only).
2. Replace the dot product, currently done with 4 shuffles, with vdpps (AVX only).
3. Merge the deltaVel1Dotn and deltaVel2Dotn calculations into one 256-bit operation.
4. Merge the linearComponentA and linearComponentB calculations into one 256-bit operation.
Original comment by liangma...@gmail.com
on 13 Jan 2011 at 11:23
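The four suggestions in the comment above can be sketched roughly as follows. This is only an illustration, not code from Bullet: the function names dualDot3 and applyImpulse2, the helper pack2, and the 4-float (x, y, z, padding) layout are assumptions made for the example. The AVX path is used only when the compiler defines __AVX__ (for example with -mavx or /arch:AVX); otherwise a scalar fallback with the same results is compiled.

```cpp
#include <cassert>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical helpers for illustration; not taken from Bullet's source.
// Every pointer must reference 4 floats: x, y, z, padding.

#if defined(__AVX__)
// Packs two 4-float vectors into one 256-bit register (lo half, hi half).
static inline __m256 pack2(const float* lo, const float* hi)
{
    return _mm256_insertf128_ps(_mm256_castps128_ps256(_mm_loadu_ps(lo)),
                                _mm_loadu_ps(hi), 1);
}
#endif

// Points 2 and 3: two independent 3-component dot products in one
// 256-bit vdpps instead of shuffle/add sequences on two 128-bit registers.
void dualDot3(const float* a1, const float* b1,
              const float* a2, const float* b2,
              float* out1, float* out2)
{
    assert(a1 && b1 && a2 && b2 && out1 && out2);
#if defined(__AVX__)
    // Mask 0x7F: sum the x, y, z products (0x70), write the sum to all lanes (0x0F).
    // vdpps treats each 128-bit half separately, so the two dots stay independent.
    const __m256 d = _mm256_dp_ps(pack2(a1, a2), pack2(b1, b2), 0x7F);
    *out1 = _mm_cvtss_f32(_mm256_castps256_ps128(d));
    *out2 = _mm_cvtss_f32(_mm256_extractf128_ps(d, 1));
#else
    *out1 = a1[0] * b1[0] + a1[1] * b1[1] + a1[2] * b1[2];
    *out2 = a2[0] * b2[0] + a2[1] * b2[1] + a2[2] * b2[2];
#endif
}

// Points 1 and 4: vel += comp * impulse for both bodies at once,
// with the scalar impulse spread across all lanes by vbroadcastss.
void applyImpulse2(float* velA, float* velB,
                   const float* compA, const float* compB, float impulse)
{
    assert(velA && velB && compA && compB);
#if defined(__AVX__)
    const __m256 imp = _mm256_broadcast_ss(&impulse);
    const __m256 vel = _mm256_add_ps(pack2(velA, velB),
                                     _mm256_mul_ps(pack2(compA, compB), imp));
    _mm_storeu_ps(velA, _mm256_castps256_ps128(vel));
    _mm_storeu_ps(velB, _mm256_extractf128_ps(vel, 1));
#else
    for (int i = 0; i < 4; ++i) {
        velA[i] += compA[i] * impulse;
        velB[i] += compB[i] * impulse;
    }
#endif
}
```

Whether this is actually faster than the 128-bit path depends on the cost of packing the two halves into one register, so it would need to be measured on real AVX hardware.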
Attached is the Intel AVX cloth demo recompiled using emulated AVX, so it
should run on systems that don't have AVX. The 4-wide (128-bit) SSE version
runs pretty decently.
Original comment by erwin.coumans
on 15 Jan 2011 at 9:19
Running your non-AVX version, I get a constant 60 fps on a Phenom X6 @ 3.64 GHz.
Seems a nice speed to me! : ]
Original comment by tpant...@gmail.com
on 16 Jan 2011 at 12:00
Attached is Stan Melax's AVX cloth source code with my modified AVX emulation
header file (avxintrin_mini.h), derived from Intel's original header file:
http://software.intel.com/en-us/articles/avx-emulation-header-file/
Note that the __emu_mm256_unpackhi_ps/__emu_mm256_unpacklo_ps implementation
is brute force; it could be optimized using SSE.
You can use CMake to create project files for most Visual Studio versions; it
will link everything statically (instead of using DLLs).
Original comment by erwin.coumans
on 17 Jan 2011 at 8:25
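One way the unpack emulation mentioned above might be done with SSE is sketched below. It is not the code from avxintrin_mini.h: the names emuUnpackLo256, emuUnpackHi256 and unpackHalf are made up for the example, and the emulated 256-bit value is assumed to be a plain array of 8 floats. The idea is that the 256-bit unpack works on each 128-bit half independently, so each half maps to a single _mm_unpacklo_ps or _mm_unpackhi_ps. A scalar fallback is compiled when SSE is not available.

```cpp
#include <cassert>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_USE_SSE 1
#else
#define SKETCH_USE_SSE 0
#endif

// Hypothetical stand-in layout: a, b and out each hold 8 floats
// (one emulated 256-bit value). out must not alias a or b.

// Interleaves one pair of elements from a and b within a single 128-bit half.
// offset = 0 gives unpacklo (elements 0, 1); offset = 2 gives unpackhi (elements 2, 3).
static void unpackHalf(const float* a, const float* b, float* out, int offset)
{
#if SKETCH_USE_SSE
    const __m128 va = _mm_loadu_ps(a);
    const __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, offset == 0 ? _mm_unpacklo_ps(va, vb)
                                   : _mm_unpackhi_ps(va, vb));
#else
    out[0] = a[offset];
    out[1] = b[offset];
    out[2] = a[offset + 1];
    out[3] = b[offset + 1];
#endif
}

// Emulates the 256-bit unpacklo: apply the 128-bit unpacklo to each half.
void emuUnpackLo256(const float* a, const float* b, float* out)
{
    assert(a && b && out);
    unpackHalf(a, b, out, 0);
    unpackHalf(a + 4, b + 4, out + 4, 0);
}

// Emulates the 256-bit unpackhi: apply the 128-bit unpackhi to each half.
void emuUnpackHi256(const float* a, const float* b, float* out)
{
    assert(a && b && out);
    unpackHalf(a, b, out, 2);
    unpackHalf(a + 4, b + 4, out + 4, 2);
}
```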
[deleted comment]
[deleted comment]
Attached is a newly compiled version of the Intel non-AVX cloth demo for
Windows, to avoid the missing MSVC100.dll error.
Original comment by erwin.coumans
on 4 Feb 2011 at 12:03
I set up a platform: i7-2600K, Windows 7 SP1 32-bit, Visual C++ 2010 Express.
This platform supports AVX for both development and run time. Could anyone give
some suggestions on how to use the profiling infrastructure to estimate the
performance?
Original comment by liangma...@gmail.com
on 30 Sep 2011 at 9:35
See https://github.com/bulletphysics/bullet3/issues/135
Original comment by erwin.coumans
on 30 Mar 2014 at 7:40
Original issue reported on code.google.com by
liangma...@gmail.com
on 13 Jan 2011 at 3:52