Closed rouson closed 4 years ago
@gutmann I'm tagging you so you'll get updates on this issue. Notice that this issue occurs with all versions tested, including 6.4.0. Either the error crept in after 6.3.0, or we luckily circumvented it, or we had an undetected, silent failure.
@scrasmussen The send-get/alloc_comp_multidim_shape.F90 unit test provides a more comprehensive test of the feature required to close this issue.
Just an update on where I am: I think `shape` isn't working because there's an issue with the array indexing; `get_data` in `mpi_caf.c` is probably fetching the wrong memory (it seems to be just slightly off). With `shape` and indexing in the following example, it worked with `bar%x` but breaks with `bar%x(:,:)[i]`. In the example I tried to show the three different behaviors I was getting: non-deterministic numbers returned from indexing, a seg fault, and infinite printing, all of which point to array indexing going out of bounds, probably off by one.
@vehre were you seeing any strange array indexing behavior with your fixes?
Anyway I'll work on the indexing but wanted to give an update because this bug might pop up in other issues.
OpenCoarrays Version: 3d485eafdeffe35e99717211feb0441ad033b312
Fortran Compiler: GCC with gcc-8-branch
version: 8.1.1 20180507
MPI library being used: MPICH 3.3b1
Compiled and ran the following program with:

```
caf -g -O0 index-bug.F90 -o runMe.exe
cafrun -np 1 ./runMe.exe
```
```fortran
program main
  implicit none

  type foo
    integer, allocatable :: x(:,:)[:]
  end type

  integer, allocatable :: air(:,:)
  type(foo) :: bar
  logical :: infinite_print, seg_fault

  allocate(bar%x(2,1)[*])
  allocate(air(2,1))

  if (this_image() == 1) then
    bar%x(1,1)[1] = 4
    bar%x(2,1)[1] = 7
  end if
  sync all

  infinite_print = .FALSE. ! .TRUE.
  seg_fault = .FALSE. ! .TRUE.
  if (infinite_print) then
    print *, "==========="
    print *, this_image(), "has", bar%x(:,:)[1]
  else if (seg_fault) then
    !! set seg_fault to .TRUE. and infinite_print to .FALSE., then uncomment to get the seg fault
    !! NEXT TWO LINES COMMENTED OUT TO GET INFINITE LOOP
    ! air = bar%x(:,:)[1]
    ! print *, this_image(), "has", bar%x(:,:)[1]
  else ! NON DETERMINISTIC VALUE IN bar%x(2,1)
    air = bar%x(:,:)[1]
    print *, this_image(), "has", air(1,1), air(2,1)
  end if
end program
```
@scrasmussen Are you building using OpenCoarrays from https://github.com/sourceryinstitute/OpenCoarrays/pull/528/commits/3d485eafdeffe35e99717211feb0441ad033b312 or from https://github.com/sourceryinstitute/OpenCoarrays/pull/531/commits/7d6d24ff30d0a3fa20ccc30e105865ca99d949dd ? Master will not work with GCC >= 8 (at least not until we merge Andre's PR, but we need to clean it up to work with GFortran 7.1 - 7.3 first).
@zbeekman yeah sorry about the misleading info, I'm using the 3d485eafdeffe35e99717211feb0441ad033b312 commit, I just put the wrong one in my previous message
This [alloc_comp_multidim_shape] test now passes with a patched GCC 8.1.0. Wow! Great work, @scrasmussen. I'm closing this issue.
@gutmann This fixes one issue that was blocking Coarray ICAR, but my tests with a patched GCC 8.1.0 lead to a runtime error in the OpenCoarrays `send_by_ref` function, so we should attempt to isolate the remaining issue. I'll tag you when I reopen a related issue that I just closed. There have been a number of improvements to `send_by_ref` lately, so hopefully the issue is not too difficult to find and fix.
We may also have @neok-m4700 to thank in PR #531. I know he has been troubleshooting a lot of cobounds/codim issues recently, for which we are very grateful! Props to @scrasmussen too for all of his great work!
Credit where credit is due: @vehre's changes fixed the [alloc_comp_multidim_shape] test. Thanks for that!
I'm reopening this issue since for any `allocate(bar%x(N,1)[*])` such that `N > 1`, it gives the wrong answer. It has a problem whenever there is a dimension of size 1 and any preceding dimension has size greater than 1. I'll continue to look into this issue.
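The failing pattern described above can be probed with a small sweep over `N`. This is only a sketch: the type and variable names follow the reproducer earlier in the thread, while `fetched`, `max_n`, and the fill values are arbitrary choices made for illustration. It assumes a coarray-capable compiler, and compilers that hit the errors reported in this thread may not build or run it correctly.

```fortran
program size_one_dim_check
  ! Sketch only: foo/bar follow the reproducer in this thread;
  ! fetched, max_n and the fill values are illustrative assumptions.
  implicit none
  type foo
    integer, allocatable :: x(:,:)[:]
  end type
  type(foo) :: bar
  integer, allocatable :: fetched(:,:)
  integer, parameter :: max_n = 4
  integer :: n, i

  do n = 1, max_n
    allocate(bar%x(n,1)[*])          ! implicit synchronization across images
    bar%x(:,1) = [(10*i, i = 1, n)]  ! same known values on every image
    sync all
    fetched = bar%x(:,:)[1]          ! coindexed get from image 1
    if (any(shape(fetched) /= [n, 1])) then
      print *, "N =", n, ": wrong shape", shape(fetched)
    else if (any(fetched /= bar%x)) then
      print *, "N =", n, ": wrong values", fetched
    else
      print *, "N =", n, ": ok"
    end if
    deallocate(bar%x)                ! implicit synchronization across images
  end do
end program
```

If the behavior reported here still applies, every `N > 1` case would be expected to report a wrong shape or wrong values, while `N = 1` would pass.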
@afanfa I just put code online here demonstrating what we ultimately need to work once this issue gets fixed. If the code executes correctly, it prints "Test passed." Currently, the Intel 18 compiler compiles the code correctly.
```
$ ifort -coarray=shared -coarray-num-images=8 intel-18-works.f90
$ ./a.out
Test passed
$ ifort -V
Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.3.222 Build 20180410
Copyright (C) 1985-2018 Intel Corporation. All rights reserved.
```
Sadly, this code generates an internal compiler error in gfortran 9.2.0, which means there's definitely a compiler bug. However, I can probably write a version that's not too different that at least compiles.
The PR made by @neok-m4700 is not enough to fix this problem. In fact, the test code provided by @rouson generates an internal compiler error with the current gcc-trunk (10.0.1).
A future pull request will add a unit test that exposes this bug in a more complete way than the small reproducer below.
Defect/Bug Report
When compiled with GCC 6.4, 7.3, and 8.0.1, OpenCoarrays returns the incorrect shape of a coindexed variable even in single-image execution.
OpenCoarrays Version: 2.0.0-rc1-14-g3af39fa
Installation method: install.sh
Output of uname -a: Linux sourcery-VirtualBox 4.13.0-16-generic #19-Ubuntu SMP Wed Oct 11 18:35:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Observed Behavior
Expected Behavior
Steps to Reproduce
To reproduce this problem, the `shape` argument must [...]

This appears to be an OpenCoarrays bug and is unrelated to the compiler's `shape` intrinsic function. Any reference to a coindexed variable yields an array with incorrect extents. For example, if an array that meets the above criteria is assigned to a non-coarray allocatable array, the latter array acquires the wrong shape through automatic (re)allocation.
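The two symptoms described above (a wrong result from `shape` on a coindexed reference, and a wrong shape acquired through automatic allocation) can be checked side by side. The following is a hedged sketch, not the unit test mentioned in this report: the names `foo`, `bar`, and `air` are borrowed from the reproducer posted in this thread, and the extent `n = 2` and the error messages are assumptions chosen for illustration.

```fortran
program shape_check
  ! Sketch under stated assumptions: foo/bar/air follow the thread's
  ! reproducer; n and the error messages are illustrative only.
  implicit none
  type foo
    integer, allocatable :: x(:,:)[:]
  end type
  type(foo) :: bar
  integer, allocatable :: air(:,:)
  integer, parameter :: n = 2

  allocate(bar%x(n,1)[*])
  bar%x = this_image()
  sync all

  ! Automatic (re)allocation: air should acquire shape [n, 1].
  air = bar%x(:,:)[1]

  if (any(shape(bar%x(:,:)[1]) /= [n, 1])) error stop "wrong shape from coindexed reference"
  if (any(shape(air) /= [n, 1])) error stop "wrong shape after automatic allocation"
  if (any(air /= 1)) error stop "wrong values fetched from image 1"

  sync all
  if (this_image() == 1) print *, "Test passed."
end program
```

Checking `shape` on the coindexed reference directly, separately from the assigned array, helps distinguish a problem in the returned extents from one in the automatic allocation on assignment.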