Closed sebk26 closed 1 year ago
Hi @sebk26
Seems like I've wrongly treated a pointer which has only been declared. Apparently the standard says that the pointer is undefined (rather than unassociated). My bad.
The check here was mainly to prevent the unnoticed memory leaks, when the pointer that was allocated elsewhere is passed to to_array
function. It's probably an overkill to check that, I'll just have to remove this check
Hi @alexeedm
But if I understood the way fortran is handling pointers right, your check is correct.
If a pointer is associated with a target and then reassigned by e.g. the to_array
func but the target is not deallocated, this could lead to a memory leak indeed. I just don't know what Best-Practise is in terms of handling declared-but-not-assigned pointers.
I would suggest just fixing the examples and calling nullify
after declaration of output
.
@sebk26 I thought exactly the same, but seems we're wrong. At least up to Fortran 2008 the standard says:
16.5.2.2 Pointer association status
A pointer may have a pointer association status of associated, disassociated, or undefined. Its association status
may change during execution of a program. Unless a pointer is initialized (explicitly or by default), it has an
initial association status of undefined. A pointer may be initialized to have an association status of disassociated
or associated
And then:
13.7.16 ASSOCIATED (POINTER [, TARGET])
...
3 Arguments
POINTER shall be a pointer. It may be of any type or may be a procedure pointer. Its pointer association
status shall not be undefined.
So that's my mistake actually
But this means, that inserting a nullify is the proposed fix?
Correct, I just don't want to force users to do so. I'm removing the check instead
It's ok, I'll do it myself, I need to patch the issue with target
anyways. Thanks for your help!
I had a second pull request with that ready as well but I guess this is obsolete now.
Closing this issue as soon as your changes are pushed.
I've just updated the repo, please check it out. Your issues should be fixed now
Checked out and can confirm that #5 and this issue are fixed now. Thank you @alexeedm !
Thank you @alexeedm for this nice repository.
I've build you're code with GCC 11.3.0 without container using the
make_gnu.sh
script on an AMD EPYC CPU 7352.I did not used conda but pip but this should'nt be an issue here, i guess.
Environment
GCC 11.3.0 CUDA 11.8.0 NCCL 2.15.5 Python 3.10.4 PyTorch 2.0
Build script modifications to run with pip
Known Issue with gnu compiler
As mentioned by @ch21d012 in #5 adding the proposed attribute target to
value
fixes the error. (I also tried using pointer attribute instead, but this fails to compile) So I edited this in the template file./src/f90_bindings/torch_ftn.f90.templ
Error
Apparently, I get a similar error as @ch21d012 mentioned in #6.
This happens for the resnet_forward, polynomial and python_training example every time an array pointer is passed to
out_tensor%to_array(output)
.My (probably dirty) workaround
Calling a
nullify(output)
somewhere beforecall out_tensor%to_array(output)
solves the problem and the tensor is able to be converted to an array and printed.I justed wanted to leave this here, in case someone else is struggling. Let me know, if there are other fixes that I missed and with which I can also compile the pytroch-fortran bindings with the GNU GCC compiler.