Closed. fvaccari closed this issue 1 year ago.
did you as yet figure out the critical size for your array that leads to link failure (just drop a few zeros until it works)?
No, I'll do some tests...
Ok, I've found a program ready for use among those prepared by one of my former colleagues:
program max_size_array_double
  implicit none
  integer(kind=8) :: n, m
  real(kind=8), allocatable, dimension(:) :: vec
  n = 1
  do
    m = 2**n
    allocate(vec(m))
    deallocate(vec)
    print *, 'm=', m
    n = n + 1
  end do
end program
The execution outcome is this:
> ./max_size_array_double
m= 2
m= 4
m= 8
m= 16
m= 32
m= 64
m= 128
m= 256
m= 512
m= 1024
m= 2048
m= 4096
m= 8192
m= 16384
m= 32768
m= 65536
m= 131072
m= 262144
m= 524288
m= 1048576
m= 2097152
m= 4194304
m= 8388608
m= 16777216
m= 33554432
m= 67108864
m= 134217728
m= 268435456
m= 536870912
m= 1073741824
m= 2147483648
m= 4294967296
m= 8589934592
m= 17179869184
m= 34359738368
m= 68719476736
m= 137438953472
m= 274877906944
m= 549755813888
m= 1099511627776
m= 2199023255552
m= 4398046511104
m= 8796093022208
In file 'max_size_array_double.f90', around line 11: Error allocating 140737488355328 bytes: Cannot allocate memory
Error termination. Backtrace:
#0 0x1030c2cef
#1 0x1030c3783
#2 0x1030c39d3
#3 0x102d0fc4b
#4 0x102d0fd47
No error about libSystem.B.dylib, possibly because of the dynamic allocation. So I'll explore more with static allocation (and possibly use 'allocate' in the original troubled program...)
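As an aside, the run-time abort in the doubling test can be avoided by passing a status variable to allocate, so the loop reports the failing size and exits cleanly instead of crashing. A minimal sketch (a hypothetical variant of the colleague's program, not the original code):

```fortran
program max_size_array_double_stat
  implicit none
  integer(kind=8) :: n, m
  integer :: ierr
  real(kind=8), allocatable, dimension(:) :: vec
  n = 1
  do
    m = 2**n
    ! stat= suppresses error termination; ierr is nonzero on failure
    allocate(vec(m), stat=ierr)
    if (ierr /= 0) then
      print *, 'allocation of', m, 'elements failed'
      exit
    end if
    deallocate(vec)
    print *, 'm=', m
    n = n + 1
  end do
end program
```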
Indeed, I think the static allocation of the large array is exactly the issue that leads to running out of link address room, but I’m too dumb to really know for sure and we await the guru.
your colleague’s program is something completely different, I believe… you’re just running out of memory there.
> gfortran-mp-devel -mcmodel=large prog.f90
At this time, we do not support -mcmodel=large
(actually, not on x86_64 either) .. probably it would be kinder to emit an error message than the failed relocations....
... the right course of action (as you have determined later in this thread) is to use dynamic allocation - that should only be limited by the memory on your system.
We will look into implementing the large model at some point (but there are higher priorities for the main port correctness at present).
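The suggested conversion is mechanical: the fixed-size declaration becomes an allocatable that is sized at run time, so the array lives on the heap rather than in static storage. A hedged sketch (the size here is arbitrary, just for illustration):

```fortran
program dynamic_version
  implicit none
  integer(kind=8), parameter :: vec_size = 67108864_8  ! example size only
  complex(kind=16), allocatable :: vec(:)
  integer :: ierr
  ! the static declaration  complex(kind=16) :: vec(vec_size)
  ! is replaced by a run-time allocation from the heap
  allocate(vec(vec_size), stat=ierr)
  if (ierr /= 0) stop 'allocation failed'
  vec = cmplx(0.0e0, 0.0e0, kind=16)
  print *, "OK", vec_size
  deallocate(vec)
end program
```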
did you as yet figure out the critical size for your array that leads to link failure (just drop a few zeros until it works)?
So, I've set up a shell script test.sh
#!/bin/bash
set -e
SIZE=1
for ((i=1; i<=40; i++))
do
    echo $i
    # SIZE is a literal string like "1*2*2*...": once sed substitutes it
    # for DIM, Fortran evaluates it as a constant expression.
    SIZE=$SIZE*2
    sed s/DIM/$SIZE/ origin > static_complex.f90
    gfortran-mp-devel static_complex.f90 -ostatic_complex
    ./static_complex
done
where origin is
program max_size_array_complex_static
  implicit none
  integer(kind=8), parameter :: vec_size = DIM
  complex(kind=16) :: vec(vec_size)
  vec = cmplx(0.0e0, 0.0e0)
  print *, "OK", vec_size
end program
The outcome is:
./test.sh
1
OK 2
2
OK 4
3
OK 8
...
...
24
OK 16777216
25
OK 33554432
26
dyld[4691]: dyld cache '/System/Library/dyld/dyld_shared_cache_arm64e' not loaded: syscall to map cache into shared region failed
dyld[4691]: Library not loaded: /usr/lib/libSystem.B.dylib
Referenced from: /Volumes/xHD/ndsha/Shared/Test/Rpath/static_complex
Reason: tried: '/usr/lib/libSystem.B.dylib' (no such file), '/usr/local/lib/libSystem.B.dylib' (no such file)
./test.sh: line 16: 4691 Abort trap: 6 ./static_complex
The error does not occur at the allocate statement, but at
vec=cmplx(0.0e0,0.0e0)
when the initialisation is made. If I comment that, the loop continues. Maybe obvious, but I'm learning while experimenting...
OK, so now you know the maximum size that can be allocated the static way, and that it will not be made any larger any time soon, so you will need to rewrite it with dynamic allocation if you need larger!
All done!
Check back in three to five years and perhaps the static allocation will be made larger by then, if anyone cares about that by that point...
program prog
  complex :: w40(300000000)
  w40 = cmplx(0.0e0, 0.0e0)
  print *, "OK"
  stop
end program
What I get on the arm64 Mac is:
> gfortran-mp-devel prog.f90
> ./a.out
dyld[88306]: dyld cache '/System/Library/dyld/dyld_shared_cache_arm64e' not loaded: syscall to map cache into shared region failed
dyld[88306]: Library not loaded: /usr/lib/libSystem.B.dylib
Referenced from: /Volumes/xHD/ndsha/Shared/Test/Rpath/a.out
Reason: tried: '/usr/lib/libSystem.B.dylib' (no such file), '/usr/local/lib/libSystem.B.dylib' (no such file)
[1] 88306 abort ./a.out
That's a bit unfortunate - one might have hoped that the linker would have complained about this (rather than a run-time crash) ... my guess is that because of the way in which the shared libraries cache is implemented on aarch64, the large memory allocation is causing an overlap between regions (but that's based on reading this text here - nothing more).
I would imagine that as you reduce the size of the complex array you'd reach a point that it works .. .. again dynamic allocation is a short-term fix, I guess.
for static allocation(s) - I would guess that the critical size is hard to determine for a general Fortran source, since the limitation could well be on the sum of the various static objects in the program (plus, quite probably, the program code itself) - all would need to fit into the address space reachable with the 'medium' (default) mcmodel.
.. which is a long-winded way of saying that if you determine that 2 GB is the largest array you can allocate statically, that does not mean you can have two of them :) .. you have to share the available space ...
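For illustration only (the sizes here are hypothetical, not measured limits): a program where either array alone might link and run, but the loader has to place their combined static footprint, and it is the sum that must fit in the reachable address space.

```fortran
program two_statics
  implicit none
  ! each array is about 1.6 GB of static data; either alone may be fine,
  ! but the loader must place their combined ~3.2 GB
  real(kind=8) :: a(200000000)
  real(kind=8) :: b(200000000)
  a = 0.0d0
  b = 0.0d0
  print *, 'OK'
end program
```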
Understanding why the static allocation works on Intel with 16 GB RAM and not on arm64 with 64 GB is beyond my capabilities. I just understand that this porting to arm64 must be an incredibly huge task, and I'm so happy that you have already come so far...
I'm just curious to know if finding the equivalent of /usr/lib/libSystem.B.dylib on arm64/Monterey would magically solve the issue...
Now I'll start looking inside the original program that presented the error and try to go the dynamic way...
not finding /usr/lib/libSystem.B.dylib
is not actually the error.
It doesn't load because it can't do the relocations because it's out of memory. But dyld never thought of that possibility, so it says it can't find the library instead, as that is the error the dyld programmer DID expect.
So it's the wrong error, for the wrong issue, for something completely different.
:>
Understanding why the static allocation works on Intel with 16 GB RAM and not on arm64 with 64 GB is beyond my capabilities.
The way in which the system libraries are delivered is different on arm64 (iOS and macOS) from x86 (and even powerpc if one goes back that far) - the underlying issue, as @kencu says, is running out of usable heap (not running out of available RAM).
I'm just curious to know if finding the equivalent of /usr/lib/libSystem.B.dylib on arm64/Monterey would magically solve the issue...
Nope, I do not think it is actually even possible - libSystem is part of the system fixed cache.
Now I'll start looking inside the original program that presented the error and try to go the dynamic way...
sorry, that's the best course of action right now.
Ok, I'll follow that course of action, and thanks to @iains and @kencu for the help. Much appreciated!
Happy to report that after converting to dynamic allocation the largest arrays of the original program, execution proceeds smoothly till the end.
Tested on an arm64 Mac (10.12.1, 64 GB RAM), Intel Macs (10.6.8, 96 GB RAM; 10.14.6, 16 GB RAM), and a CentOS 6 VM hosted on an Intel Mac (VirtualBox, 8 GB RAM assigned).
Thanks to all who contributed!
I am going to close this, but we now have issue #100 which is asking for support for the large code model; perhaps that would also be useful in this case.
I've installed gcc-mp-devel with MacPorts
GNU Fortran (MacPorts gcc-devel 12-20220320_0+enable_stdlib_flag) 12.0.1 20220319 (experimental)
on an M1 Mac (64 GB RAM) running macOS 12.2.1. I could compile and execute all the Fortran programs I'm working with but one, which has some large array definitions.
Actually I've reproduced the issue with a very simple program:
I can compile/run this on an Intel Mac with 16 GB RAM (running Mojave and an old gfortran, v. 7.5, but I succeeded with several versions from 4.x to 10.x on Intel Macs with OS ranging from 10.6.8 to 10.14.6), and I have no issues on a Linux VM either (CentOS 6 with gfortran 4.4.7 and just 8 GB of RAM assigned).
What I get on the arm64 Mac is:
The original program (full version) that fails to execute on my new arm64 Mac fails on the 8 GB RAM Linux VM as well, but can be compiled/executed there provided that I add the option
-mcmodel=large
at compile time. Unfortunately that option leads to errors at compile time on any Mac (Intel or Apple Silicon) I've tried it on.
I made some more (blind...) experiments on the arm64 Mac, and described the outcome here
(https://trac.macports.org/ticket/64896#comment:5)
Not sure if they could be of any help, but thanks for your very appreciated efforts on this whole project!
Franco