MatteoLacki / IsoSpec

Libraries for fine isotopic structure calculator.
Other
35 stars 10 forks source link

Compile msvs #5

Closed hroest closed 5 years ago

hroest commented 5 years ago

Several fixes to compile with MSVS

michalsta commented 5 years ago

Hi,

mmap and calloc are not exactly equivalent here: calloc allocates the memory immediately, while mmap cleverly uses the MMU/page protection mechanisms to defer the actual allocation of memory until it is actually used. That way, if we replaced mmap with calloc IsoSpec would always allocate that memory immediately upon loading, while with mmap - it remains mostly free (and available for other processes) until the user starts calculating some really gargantuan structures. So I really wouldn't want to replace it with calloc unless absolutely necessary.

But, I think that the implementation of mmap that we use for mingw does actually work with MSVC, so all it would take to port it is to replace:

#ifdef __MINGW32__

with (something like)

#if  defined(__MINGW32__) || defined(_MSC_VER)

and hopefully it'll work. Sadly I don't have MSVC to try it... If it doesn't work, then it would be better to use calloc on MSVC only, and mmap wherever available, rather than replacing it on all platforms.

The unistd.h header can of course be dropped, that include is a forgotten relic of ages past ;)

hroest commented 5 years ago

I did some research and it seems that most implementations of calloc actually use mmap underneath for larger allocations (e.g. above a certain threshold). So in those cases my intuition is that calling calloc and calling mmap yourself should be equivalent. I looked a git more into glibc and this seems indeed to be the case:

See the documentation https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L866 and note that the default value for using mmap is 128 kb https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L596 and that the code in glibc also uses the optimization of not clearing memory that is mmapped https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L3445 Note that if you keep reading, the code now tries to dynamically decide whether to use mmap or brk - but my overall takeaway is that i) the standard lib knows what its doing and the authors have thought about this hard and ii) it will be hard to do better and iii) it will probably do the right thing anyway. Using calloc could actually reduce the amount of code, allowing you to remove the file with mmap in it for Windows and make the code more portable beyond unix systems.

Some stackoverflow answers:

michalsta commented 5 years ago

The problem is that the C standard does not require the C library implementation to use mmap (or equivalent), so we cannot really depend on that behaviour, even if glibc does in this case do the right thing. That still leaves open the issue of what will the libc's of the various BSDs do, what will Apple's do, what will MSVCRT do, or even what will glibc do in future. So, I'd prefer to explicitly request the right thing whenever possible, rather than depend on the various libc implementations rightly second-guessing what I'm trying to do (in this case, allocate a large chunk of zeroed memory that 99% of users will only need the first few bytes of).

On the other hand, there is certainly a case to be made for the code to be (at least in theory) portable to any C platform, instead of just Unices and Windows (for which we have the mman.c implementation). So, let's go with using mmap whenever we have it, and falling back to calloc otherwise, sth like:

#if defined(__unix__) || defined(__unix) || \
        (defined(__APPLE__) && defined(__MACH__))
#include <sys/mman.h>
#define GOT_MMAP 1
#elif defined(__MINGW32__) || defined(_WIN32)
#include "mman.h"
#define GOT_MMAP 1
#else
#include <stdlib.h>
#endif

and then...

# if defined(GOT_MMAP)
double* g_lfact_table = reinterpret_cast<double*>(mmap(NULL, sizeof(double)*G_FACT_TABLE_SIZE, PROT_READ | PROT_WRITE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0));
#else
double* g_lfact_table = reinterpret_cast<double*>(calloc(G_FACT_TABLE_SIZE, sizeof(double)));
#endif

Yeah, ugly, but that's what being cross-platform while doing a bit of OS-level programming usually looks like :/

Could you please check that it works as intended with MSVC, and I'll merge it?

hroest commented 5 years ago

I checked and it works. I am not sure yet how to turn on the MANN_LIBRARY pre-processor directive on Windows, but that is not really my concern.