analyze() breaks when called twice on the same variable

GoogleCodeExporter commented 9 years ago

When working with a set of Fortran files, each containing a module, where the
modules USE each other, it is quite possible that a variable in a module is
parsed and analyzed more than once.

That happens at least when one module is used by several different modules.

When that happens, analyze raises an exception. I have traced the problem back
to Variable.update(self, *attrs), line 304 (file base_classes.py). The assert on
this line fails if the variable was parsed and analyzed before. What I find a
bit strange is that this assertion only exists for the dimension modifier in
the type declaration of a variable, not for the other ones.

I suggest one of the following two options: either remove this assertion (I
don't see its use), or do not reanalyze a variable if it already exists in the
attribute dictionary of its container. However, I don't quite understand yet
how those data structures are managed, and hence I am not sure if this is
really a good idea. At least for me, removing the assertion seems to fix the
problem.
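
A minimal sketch of the second option, to make the idea concrete. The `Variable` and `Container` classes here are simplified, hypothetical stand-ins and not the actual classes in base_classes.py; the assertion only imitates the kind of check described above.

```python
class Variable:
    """Hypothetical, simplified variable; not the class in base_classes.py."""

    def __init__(self, name):
        self.name = name
        self.dimension = None
        self.other_attrs = []

    def update(self, *attrs):
        for attr in attrs:
            if attr.startswith("dimension"):
                # Imitates the assertion described above: the dimension
                # may only be set once, so a second update() call fails.
                assert self.dimension is None, f"{self.name}: dimension already set"
                self.dimension = attr
            else:
                self.other_attrs.append(attr)


class Container:
    """Hypothetical owner of variables, e.g. a module or a derived type."""

    def __init__(self):
        self.variables = {}  # name -> Variable

    def analyze_variable(self, name, *attrs):
        # Guard: a variable that is already registered is not analyzed again.
        if name in self.variables:
            return self.variables[name]
        var = Variable(name)
        var.update(*attrs)
        self.variables[name] = var
        return var


module = Container()
first = module.analyze_variable("a", "dimension(:,:)", "pointer")
second = module.analyze_variable("a", "dimension(:,:)", "pointer")
```

With the guard in place the second call returns the existing object, so `update` runs only once and the assertion can stay.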

thanks,

omar

Original issue reported on code.google.com by omar.aw...@gmail.com on 18 Apr 2010 at 2:50

GoogleCodeExporter commented 9 years ago

I think it would be much better not to reanalyze existing entities in the parse
tree(s), because that would be much more efficient.

Original comment by omar.aw...@gmail.com on 18 Apr 2010 at 2:51

GoogleCodeExporter commented 9 years ago

Yes, modules should be analyzed only once.
Could you provide an example code that illustrates this issue?

Original comment by merit.pe...@gmail.com on 18 Apr 2010 at 3:26

GoogleCodeExporter commented 9 years ago

Oh, hold on, I figured it out. The assertion error actually happens not because
of multiple USEs but only because the array is defined in a derived type
definition.
Here is a little example:

-----
MODULE test

TYPE t
    integer, dimension(:,:), pointer :: a
END TYPE t

END MODULE test
-----

I have fixed the bug, see the diff in the block_statements.py file. The
original code first calls BeginStatement.analyze(self) on the Type instance,
which in turn calls the analyze function on all elements of content in this
instance. But after analyzing the elements of spec, the code iterates over
content again, which causes the assertion failure. This assertion is only
checked when the field in the derived type is an array (i.e. has the dimension
keyword). So we either have to remove this assertion or move it to another
place in the update routine where it would be checked for all variables, not
only arrays.
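
The double pass can be illustrated with a small sketch. All class names below are hypothetical and heavily simplified; this is not the code in block_statements.py, only the pattern described above, where a base-class analyze already visits every item and the subclass then visits them again.

```python
class Field:
    """Hypothetical derived-type component that counts analyze() calls."""

    def __init__(self, name):
        self.name = name
        self.times_analyzed = 0

    def analyze(self):
        self.times_analyzed += 1


class BlockBase:
    """Hypothetical stand-in for a block statement base class."""

    def __init__(self, content):
        self.content = content

    def analyze(self):
        for item in self.content:
            item.analyze()


class TypeBlockBuggy(BlockBase):
    def analyze(self):
        BlockBase.analyze(self)    # already analyzes every item in content
        for item in self.content:  # second pass: every item is analyzed again
            item.analyze()


class TypeBlockFixed(BlockBase):
    def analyze(self):
        BlockBase.analyze(self)    # a single pass over content


buggy_field = Field("a")
TypeBlockBuggy([buggy_field]).analyze()

fixed_field = Field("a")
TypeBlockFixed([fixed_field]).analyze()
```

In the buggy variant each field is analyzed twice, which is exactly the situation in which a "set only once" assertion on the dimension would fire.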

I attached a diff with a couple more changes; I caught 1 or 2 other things that
I think are bugs.

After applying those patches I still see one possible problem. Even though the
parser seems to be handling it correctly, I still think that if a module A is
used in modules B and C, and both B and C are included in D, then A is loaded
twice. That could be avoided, right? I've attached an example. If you read,
parse and analyze testd.f90 you'll see what I mean.

Thanks!

Original comment by omar.aw...@gmail.com on 18 Apr 2010 at 8:50

GoogleCodeExporter commented 9 years ago

The attached test.zip file seems to be empty.
I have applied your patch to svn, except the readfortran.py that probably
contained only debugging code.

Original comment by merit.pe...@gmail.com on 20 Apr 2010 at 4:26

GoogleCodeExporter commented 9 years ago

Previous comment is mine, not my wife's :)

Original comment by pearu.peterson on 20 Apr 2010 at 4:28

GoogleCodeExporter commented 9 years ago

Ok :) I assumed so..

The diff in readfortran.py is actually strange. What I did is add
source_only = None in the init function of the base class (FortranReaderBase),
because find_module_source_file uses source_only; without that I got an
exception when this function was called.
Yep, the test.zip was broken. I tried to pack those files again.

Original comment by omar.aw...@gmail.com on 20 Apr 2010 at 5:47

Changed state: Started

Attachments:

test.zip

GoogleCodeExporter commented 9 years ago

When parsing testd.f90, I get warnings that entities a (defined in testa)
and x (defined in testb) are already defined in module c.

This happens when processing statement `use testc`.
After analyzing the code, here is how I understand the situation:
The entities a and x are added to testd as a result of processing
the statement `use testb`. Note that at the same time testc provides
y (irrelevant here) in addition to a and x. So, when processing
`use testc`, the analyzer tries to add the content of testc to
testd, that is, add a,x,t to testd. Now a,x are already there
due to `use testb` and that triggers the warnings.

I am working on a fix at the moment. The fix should not give
a warning when adding equivalent entities from different modules,
and should warn only if different modules define different entities
with the same name.
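
As a rough sketch of that rule, assuming that "equivalent" means "the very same object" and that used entities are kept in a plain name-to-entity dictionary. The function `add_used_entities` and the module name `other` are hypothetical and are not taken from the analyzer code.

```python
import warnings


def add_used_entities(target, provided, module_name):
    """Merge provided (name -> entity) into target, warning only on real clashes."""
    for name, entity in provided.items():
        if name not in target:
            target[name] = entity
        elif target[name] is not entity:
            # A different entity under the same name: this deserves a warning.
            warnings.warn(f"{name!r} from {module_name} clashes with a different entity")
        # Otherwise the same entity arrived via another use path: nothing to do.


a, x, y = object(), object(), object()
testb = {"a": a, "x": x}
testc = {"a": a, "x": x, "y": y}

testd = {}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    add_used_entities(testd, testb, "testb")  # adds a and x
    add_used_entities(testd, testc, "testc")  # a and x are the same objects: silent
    add_used_entities(testd, {"a": object()}, "other")  # different a: warns
```

Only the last call produces a warning, because only there does a name map to a different object than the one already present.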

Original comment by pearu.peterson on 20 Apr 2010 at 7:06

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

I just committed a fix to svn. 

Though there was a warning indicating
duplicate loading of a, it was not actually loaded twice.
The warning code just did not take into account that the
same entity could be loaded via several use paths.

Original comment by pearu.peterson on 20 Apr 2010 at 7:15

GoogleCodeExporter commented 9 years ago

Great!
One more thing..
One change I made that you did not include in the patch is adding '.f' to the
list of module_file_extensions. Would it be a bug to have .f in that list? The
reason I added it is that when the reader searched for module files
(get_module_file) it didn't consider files ending with .f; however, the Fortran
project I am working on is in F90 but still uses .f extensions for all Fortran
files. So adding .f was a quick fix for that, but if that would cause problems
elsewhere then maybe we need a better solution...
Thanks!

Original comment by omar.aw...@gmail.com on 20 Apr 2010 at 10:55

GoogleCodeExporter commented 9 years ago

It seems that the Fortran file name extension is
not a very reliable way to determine the standard used and the
format of Fortran source code, unless it is explicitly
.f90, .f95, .f03, .f08, or .f77. As it seems, .f
is becoming more popular for f90 and newer codes, though
many programs still consider .f files to be Fortran 77 files.

I think it is safe to add .f to module_file_extensions;
I must have forgotten it.
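
For illustration, here is a minimal sketch of an extension-based search for candidate module files. The contents of the list and the helper `candidate_module_files` are assumptions made for this example; the actual list and the lookup in get_module_file are not shown in this thread.

```python
import tempfile
from pathlib import Path

# Assumed contents, for illustration only; the real list is not shown here.
module_file_extensions = [".f90", ".f95", ".f03", ".f08", ".f"]


def candidate_module_files(directory, extensions=module_file_extensions):
    """Return the files in directory whose suffix is listed in extensions."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )


# Usage: create three empty files and see which ones are picked up.
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("testa.f", "testb.f90", "notes.txt"):
        (Path(tmp) / filename).write_text("")
    found = [path.name for path in candidate_module_files(tmp)]
```

Without ".f" in the list, testa.f would be skipped, which matches the behaviour reported above. Note that the extension alone says nothing about fixed versus free source form.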

Original comment by pearu.peterson on 21 Apr 2010 at 4:45

GoogleCodeExporter commented 9 years ago

Done!

I will close this issue..

Original comment by omar.aw...@gmail.com on 21 Apr 2010 at 8:30

Changed state: Fixed

travellhyne / f2py

analyze() breaks when called twice on the same variable #20