Closed madmann91 closed 9 years ago
on Windows there is _aligned_alloc availabe. I have a code snippt from Sierra lying around for that. We should do that. Regarding clock_time: yes std::chrono makes perfectly sense. Can you fix for std::chrono? I'll look into the aligned_alloc thing.
We have a branch that provides windows support: https://github.com/AnyDSL/thorin/tree/vs2013-win64 Some of this has been merged into master, others not.
There is a std::high_resolution_clock, that should be usually enough for benchmarking, I guess. For the aligned allocator, I think using the _aligned_malloc function could be an option (also works on MinGW). Something like :
#if defined(_WIN32) || defined(__CYGWIN__)
#include <malloc.h>
void* thorin_alloc(size_t size) {
return _aligned_malloc(size, 64);
}
/* ... */
#else
/* ... */
#endif
I use this in sierra:
#if (_POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600)
void* aligned_malloc(size_t size, size_t alignment) { return ::aligned_alloc(alignment, size); }
#elif defined _MSC_VER
void* aligned_malloc(size_t size, size_t alignment) { return ::_aligned_malloc(size, alignment); }
#else
#error "don't know how to retrieve aligned memory"
#endif
C11's aligned_alloc is not an option. It took MS more than a decade to come up with some C99 support... No one knows when C11 is supported by MSVC
Yes, hence I suggest using the _aligned_malloc for windows, in my above comment. But you need to check for the correct symbols, otherwise that won't work MinGW/Cygwin :
#if defined(_WIN32) || defined(__CYGWIN__)
error "don't know how to retrieve aligned memory"
Quick note about this, since i wrote some related code few days ago: Limitations:
inline void *alignPointer(void *ptr, size_t align)
{
if(!align)
return ptr;
--align;
uintptr_t p = uintptr_t(ptr);
return (void*)((p + align) & ~uintptr_t(align));
}
inline void *aligned_alloc(size_t size, size_t alignment)
{
unsigned char *p = (unsigned char*)MALLOC(size+alignment); // up to
(alignment-1) bytes are needed as padding. one byte extra is for the offset
if(!p)
return NULL;
unsigned char *a = (unsigned char*)alignPointer(p+1, alignment); //
always make sure at least 1 byte is free, that will hold the offset
a[-1] = (unsigned char)(a - p);
return a;
}
inline void aligned_free(void *ptr)
{
if(!ptr)
return;
unsigned char *a = (unsigned char*)ptr;
unsigned off = a[-1]; // always >= 1
void *p = a - off;
Free(p);
}
PS: Greetings from Graz/Austria :) You don't mind me spying on the repo, right? I'm still receiving issue emails.
According to this thread, it seems std::high_resolution_clock is not that precise under windows. I think it is better to go with the QueryPerformanceCounter* functions from WinAPI just for Windows.
@leissa : Are you taking care of the aligned memory issue ? @maxmalek : That is also another option ;)
@maxmalek: thanks for the comment. I have similar code in my diploma thesis :)
Anyway, I think we should go for aligned_alloc and that's it. The pointer-fiddling trick is a cool trick but do we seriously consider building for any system which does not support aligned_alloc? We would just add more code to our codebase which do not even test.
@madmann91: Yes, I will fix this.
apparently, you can't assign multiple people for one issue...
ok, I'm a bit confused. What is the correct way to check for aligned_alloc? This is what the man page says:
Feature Test Macro Requirements for glibc (see feature_test_macros(7)): posix_memalign(): _POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600 aligned_alloc(): _ISOC11_SOURCE
But apparently, I have aligned_alloc available although _ISOC11_SOURCE is not set.
Maybe we should use posix_memalign for POSIX systems. Seems to be more standard.
I would do it that way :
#ifdef __STDC_VERSION__
#if__STDC_VERSION__ == 201112
/* ... */
#endif
#endif
But anyway, in my opinion, we should use :
when compiling with C++ __STDC_VERSION__
is not set.
Yep, hence the #ifdef. I added it after testing the above snippet.
cuda/cu_runtime.cpp
and opencl/cl_runtime.cpp
also use posix_memalign
. I'd suggest the following:
we add a function thorin::aligned_malloc
:
namespace thorin {
//...
#ifdef _ISOC11_SOURCE
void* aligned_malloc(size_t size, size_t alignment) { return ::aligned_alloc(alignment, size); }
#elif (_POSIX_C_SOURCE >= 200112L || _XOPEN_SOURCE >= 600)
void* aligned_malloc(size_t size, size_t alignment) {
void* p;
posix_memalign(&p, alignment, size);
return p;
}
#elif defined(_WIN32) || defined(__CYGWIN__)
void* aligned_malloc(size_t size, size_t alignment) { return ::_aligned_malloc(size, alignment); }
#else
#error "don't know how to retrieve aligned memory"
#endif
}
@madmann91: does this look ok?
Is thorin/runtime/common/thorin_utils.cpp
linked to all runtimes? Then, I'd put it there.
This sounds great. You may need to include malloc.h on windows though.
I tried to compile an imbatracer on Windows, and I found the following issues in the runtime :
I would suggest using the following alternatives :
With those minor changes the code seems to work, at least on the CPU.