Closed horaceho closed 9 years ago
Hi Horace,
Thank you. I didn't notice that I had left this unportable line in, sorry.
I will try to avoid vector instructions as much as possible. However, if I use some of them in the future (e.g. for merging liberties), I believe that something equivalent will be available for ARM. That said, the modification you propose does not seem very general: on my Mac with gcc the original version works and will be useful for SSE instructions (and the __APPLE__ macro is defined there). I found something on the web (http://stackoverflow.com/questions/11228855/header-files-for-simd-intrinsics) that seems to cover everything we need and more. Does it work for you?
/* Microsoft C/C++-compatible compiler */
#include <intrin.h>
/* GCC-compatible compiler, targeting x86/x86-64 */
#include <x86intrin.h>
/* GCC-compatible compiler, targeting ARM with NEON */
#include <arm_neon.h>
/* GCC-compatible compiler, targeting ARM with WMMX */
#include <mmintrin.h>
/* XLC or GCC-compatible compiler, targeting PowerPC with VMX/VSX */
#include <altivec.h>
/* GCC-compatible compiler, targeting PowerPC with SPE */
#include <spe.h>
Best, Denis
Sorry, the __xxxx__ identifiers in the previous post have been replaced by the formatted string XXXX. Please go to the original page to retrieve the source code. Denis
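For readers who cannot reach the linked page, here is a minimal sketch of what such a conditional include chain could look like. The guard conditions are a reconstruction and should be checked against the original answer; only the ARM NEON condition is quoted verbatim in this thread. The names SIMD_HEADER and simd_header are hypothetical and exist only so the selected branch can be inspected.

```c
#include <stdio.h>

/* Select the SIMD intrinsics header from compiler and target macros.
 * The guard conditions are a reconstruction, not michi's actual code.
 * SIMD_HEADER is a hypothetical helper macro recording the branch taken. */
#if defined(_MSC_VER)
    /* Microsoft C/C++-compatible compiler */
    #include <intrin.h>
    #define SIMD_HEADER "intrin.h"
#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* GCC-compatible compiler, targeting x86/x86-64 */
    #include <x86intrin.h>
    #define SIMD_HEADER "x86intrin.h"
#elif defined(__GNUC__) && defined(__ARM_NEON__)
    /* GCC-compatible compiler, targeting ARM with NEON */
    #include <arm_neon.h>
    #define SIMD_HEADER "arm_neon.h"
#elif defined(__GNUC__) && defined(__IWMMXT__)
    /* GCC-compatible compiler, targeting ARM with WMMX */
    #include <mmintrin.h>
    #define SIMD_HEADER "mmintrin.h"
#elif (defined(__GNUC__) || defined(__xlC__)) && \
      (defined(__VEC__) || defined(__ALTIVEC__))
    /* XLC or GCC-compatible compiler, targeting PowerPC with VMX/VSX */
    #include <altivec.h>
    #define SIMD_HEADER "altivec.h"
#elif defined(__GNUC__) && defined(__SPE__)
    /* GCC-compatible compiler, targeting PowerPC with SPE */
    #include <spe.h>
    #define SIMD_HEADER "spe.h"
#else
    /* No known SIMD header for this target: include nothing */
    #define SIMD_HEADER "none"
#endif

/* Hypothetical helper: report which header the chain selected. */
const char *simd_header(void) {
    return SIMD_HEADER;
}
```

Because each branch tests the target architecture rather than the operating system, the same source would pick x86intrin.h in the iOS Simulator or on a Mac and skip it on an ARM device.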
The main change in my pull request is to remove the line #include <x86intrin.h> for iOS.
__GNUC__ is defined by default in the iOS environment. #include <x86intrin.h> did work when the iOS app ran in the Simulator (which runs inside an x86 MacBook) but did not work on real devices (iPhones, which are ARM-based).
Removing #include <x86intrin.h> lets the app run OK both in the Simulator and on real devices.
Since I am not sure whether it is OK to skip #include <x86intrin.h> in general __GNUC__ environments (e.g. Linux, Windows), I isolated a section for the Apple-specific environment.
IMHO, #elif defined(__GNUC__) && defined(__ARM_NEON__) may not be the best fit for the iOS environment, as I am not sure whether __ARM_NEON__ covers Apple CPUs (A6, A7, A8) well enough...
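One way to settle the doubt about __ARM_NEON__ would be to build a small probe for each target and look at which macros the compiler actually defines. The sketch below is only an illustration: the function names are hypothetical, and checking the ACLE spelling __ARM_NEON alongside the older __ARM_NEON__ is an assumption about what a given toolchain provides, not something confirmed in this thread.

```c
#include <stdio.h>

/* Returns 1 if the compiler advertises NEON for this target.
 * Both spellings are checked; which one a toolchain defines is an assumption. */
int has_neon_macro(void) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the target is x86 or x86-64 (e.g. the iOS Simulator on a Mac). */
int targets_x86(void) {
#if defined(__x86_64__) || defined(__i386__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if __APPLE__ is defined (true for both iOS and Mac OS X builds). */
int is_apple(void) {
#if defined(__APPLE__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: print the findings, e.g. from the app's startup code. */
void report_target(void) {
    printf("__APPLE__ defined: %d\n", is_apple());
    printf("x86 target:        %d\n", targets_x86());
    printf("NEON macro:        %d\n", has_neon_macro());
}
```

Running report_target() once in the Simulator and once on a device would show whether an architecture test alone is enough to decide when x86intrin.h may be included.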
Dear Horace,
OK. The main point of my comment is that I believe it also removes the line for Mac OS X.
But OK. Let's do that and we will revisit the question if we need to.
Best Denis
Hi Horace,
Have you used the different versions of michi-c2 I put on GitHub? Are they working properly for you?
Best, Denis
Hi Denis,
I've pulled the new update and tested it on my iPhone. The program works in general as before when N=13. However, there is a crash when N=19; I suspect it is caused by the limited stack memory on the iPhone.
Thanks horace
Hi Horace,
Thank you for your reply.
Yes, I believe that the stack size could be a problem with the growth of the Position structure induced by tracking blocks and liberties.
On my system with a stack size of 8192 kbytes, I have no problem running 19x19 games against gnugo.
I will try to remedy this problem (hopefully without slowing down the engine). Do you know the size of the stack that is available on the iPhone?
Best, Denis
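As a rough way to answer the stack-size question on a POSIX system, the soft limit can be read with getrlimit. This is a sketch only: the function name stack_limit_kbytes is hypothetical, and whether this limit reflects the stack actually given to the thread that runs the engine on iOS is not established in this thread.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Returns the soft stack limit in kbytes, 0 if it is unlimited,
 * or -1 if the limit cannot be read. (Hypothetical helper name.) */
long stack_limit_kbytes(void) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;                      /* query failed */
    if (rl.rlim_cur == RLIM_INFINITY)
        return 0;                       /* no limit configured */
    return (long)(rl.rlim_cur / 1024);  /* bytes -> kbytes */
}
```

Printing the returned value at startup gives a figure that can be compared directly with the 8192 kbytes mentioned above.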
Hi Denis,
I briefly tested and it looks like the stack is 512 kbytes on my iPhone 6 Plus (http://stackoverflow.com/questions/30929305/is-the-stack-size-of-iphone-fixed).
Thanks horace
Hi Horace,
Thanks for the info and the link. 512 kbytes is not big. I will try to make michi work (hopefully without slowdown) within this limitation (I note that this explanation of the crash is still hypothetical).
I am curious about michi's behavior on the iPhone. Is it really playable? Do you know how many playouts/sec you obtain on this hardware? (Time michi mcbenchmark, or play with a fixed time limit and look in michi.log.)
Best, Denis
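One common way to live with a small stack is to keep large working structures off it. The sketch below is not michi's code: DemoPosition, BOARD_CELLS and playout_on_heap are hypothetical stand-ins, and the field layout is invented purely to make the structure large. It shows a working copy allocated on the heap instead of being declared as a local variable.

```c
#include <stdlib.h>
#include <string.h>

#define BOARD_CELLS (21 * 22)  /* hypothetical padded 19x19 board */

/* Hypothetical stand-in for a large position structure. */
typedef struct {
    unsigned char color[BOARD_CELLS];
    int block_of[BOARD_CELLS];
    int liberties[BOARD_CELLS];
} DemoPosition;

/* Work on a heap-allocated copy of the position rather than a stack copy.
 * Returns 0 on success, -1 if the allocation fails. */
int playout_on_heap(const DemoPosition *root) {
    DemoPosition *work = malloc(sizeof *work);
    if (work == NULL)
        return -1;

    memcpy(work, root, sizeof *work);  /* root itself is left untouched */
    work->color[0] = 1;                /* placeholder for the real playout */

    free(work);
    return 0;
}
```

Allocating on every playout has a cost of its own, so a buffer allocated once and reused may be closer to the goal of not slowing the engine down.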
Hi Denis,
Here is the benchmark result (roughly 4.3 seconds on my iPhone 6 Plus):
Start time: 2015-06-22 12:17:06.116
End time: 2015-06-22 12:17:10.450
Details:
I 0/000 Loading pattern probs ...
I 0/000 Loading pattern spatial dictionary ...
I 0/000 read 1064481 patterns
I 0/000 idmax = 1064481
I 0/000 pattern length max = 141 (found at 1064481)
I 0/000 =========== Hashtable initialization synthesis ==========
I 0/000 hashtable entries: 8483962 (fill ratio: 25.3 )
I 0/000 8515848 searches, 31886 success (0.4 )
I 0/000 average length of searchs -- success: 0.7, failure: 1.9
2015-06-22 12:17:06.116 Mingo[1387:163009] -[Michi benchmark]
0 .......... .......... .......... .......... ..........
50 .......... .......... .......... .......... ..........
100 .......... .......... .......... .......... ..........
150 .......... .......... .......... .......... ..........
200 .......... .......... .......... .......... ..........
250 .......... .......... .......... .......... ..........
300 .......... .......... .......... .......... ..........
350 .......... .......... .......... .......... ..........
400 .......... .......... .......... .......... ..........
450 .......... .......... .......... .......... ..........
500 .......... .......... .......... .......... ..........
550 .......... .......... .......... .......... ..........
600 .......... .......... .......... .......... ..........
650 .......... .......... .......... .......... ..........
700 .......... .......... .......... .......... ..........
750 .......... .......... .......... .......... ..........
800 .......... .......... .......... .......... ..........
850 .......... .......... .......... .......... ..........
900 .......... .......... .......... .......... ..........
950 .......... .......... .......... .......... ..........
1000 .......... .......... .......... .......... ..........
1050 .......... .......... .......... .......... ..........
1100 .......... .......... .......... .......... ..........
1150 .......... .......... .......... .......... ..........
1200 .......... .......... .......... .......... ..........
1250 .......... .......... .......... .......... ..........
1300 .......... .......... .......... .......... ..........
1350 .......... .......... .......... .......... ..........
1400 .......... .......... .......... .......... ..........
1450 .......... .......... .......... .......... ..........
1500 .......... .......... .......... .......... ..........
1550 .......... .......... .......... .......... ..........
1600 .......... .......... .......... .......... ..........
1650 .......... .......... .......... .......... ..........
1700 .......... .......... .......... .......... ..........
1750 .......... .......... .......... .......... ..........
1800 .......... .......... .......... .......... ..........
1850 .......... .......... .......... .......... ..........
1900 .......... .......... .......... .......... ..........
1950 .......... .......... .......... .......... ..........
-1.801500
2015-06-22 12:17:10.450 Mingo[1387:163009] -[Michi benchmark]
X X X X X X O . O . O O O
X X . X X O O O . O O . O
X . X O O O O O O O . O O
X X X O O O . O . O O . O
X X X O X O O . O O O O .
X X X X X X O O O O . O O
O X . X X X O O . O O O .
O O X X X X O O O O O . O
O O X X X O O O O O . O O
. O O X O X X X O . O . O
O O O O O X . X X O O O O
O . O O O O X . X O O O .
O O O O . O X X X O O . O
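The log above can be turned into a playouts-per-second estimate. The sketch assumes that each dot in the progress display is one playout (40 rows of 50 dots, so 2000 playouts) and that the two timestamps bracket only the benchmark; neither assumption is confirmed in this thread, and the function name is hypothetical.

```c
/* Playouts per second from a playout count and start/end times in seconds.
 * Returns 0.0 if the interval is not positive. (Hypothetical helper.) */
double playouts_per_second(int playouts, double start_s, double end_s) {
    double elapsed = end_s - start_s;
    if (elapsed <= 0.0)
        return 0.0;
    return playouts / elapsed;
}
```

With the seconds part of the logged timestamps, playouts_per_second(2000, 6.116, 10.450) divides 2000 by 4.334 s, which is a little over 460 playouts per second under the assumptions stated.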
Hi Horace,
Thank you for the info. This is really good! I am impressed.
Regarding the stack size problem, I believe I will be able to provide an update of the 1.3 version on GitHub for you to test in the next few days. (PS. I can test it on my system by using a 512 kbytes stack size limit.)
Best Denis.
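A 512 kbytes limit can also be imitated inside a program by running the code under test in a thread created with an explicit stack size. This is a different mechanism from a system-wide stack limit and is not presented as the method used for michi; the names TEST_STACK_BYTES, worker and run_with_small_stack are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define TEST_STACK_BYTES (512 * 1024)  /* the size reported for the iPhone */

/* Stand-in for the engine: uses 64 kbytes of locals, well under the limit. */
static void *worker(void *arg) {
    volatile char buf[64 * 1024];
    buf[0] = 1;
    buf[sizeof buf - 1] = 2;
    *(int *)arg = buf[0] + buf[sizeof buf - 1];
    return NULL;
}

/* Runs worker() on a thread with a 512 kbytes stack.
 * Returns the worker's result (3), or -1 if the thread cannot be set up. */
int run_with_small_stack(void) {
    pthread_attr_t attr;
    pthread_t tid;
    int result = 0;
    int rc;

    if (pthread_attr_init(&attr) != 0)
        return -1;
    if (pthread_attr_setstacksize(&attr, TEST_STACK_BYTES) != 0) {
        pthread_attr_destroy(&attr);
        return -1;
    }
    rc = pthread_create(&tid, &attr, worker, &result);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    pthread_join(tid, NULL);
    return result;
}
```

Replacing worker() with a call into the engine would make a stack overflow show up on a desktop machine in roughly the conditions described for the device.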
#include <x86intrin.h> is not available on iOS devices (ARM). Apple-specific (iOS and Mac OS) optimisations are grouped inside __APPLE__ (instead of __GNUC__).