SLICOT / SLICOT-Reference

SLICOT - A Fortran subroutines library for systems and control
BSD 3-Clause "New" or "Revised" License
47 stars 22 forks source link

fix argument mismatch error in gcc10 #1

Closed bnavigator closed 3 years ago

bnavigator commented 3 years ago

Hello,

over at Slycot we have been maintaining a fork of the SLICOT 5.0 GPL sources. Over time, we implemented some small fixes to build SLICOT with updated versions of gcc and the various BLAS/LAPACK implementations.

The individual fixes would have to be submitted to this BSD licensed repository by the respective authors, because Slycot is currently stuck on GPL, but I can start with my own contribution:

gfortran10 started to throw errors because of a rank mismatch when calling MB03OD (see also https://gcc.gnu.org/gcc-10/porting_to.html):

[  9%] Building Fortran object slycot/CMakeFiles/_wrapper.dir/src/SLICOT-Reference/src/AB13ID.f.o
/home/ben/src/Slycot/slycot/src/SLICOT-Reference/src/AB13ID.f:640:19:

  514 |             CALL MB03OD( 'QR Decomposition', N, N, E, LDE, IWORK, TOL,
      |                                                                  2
......
  640 |      $             TOLDEF, SVLMAX, DWORK, RANKE, DWORK( NR+1 ),
      |                   1
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (rank-1 and scalar)
/home/ben/src/Slycot/slycot/src/SLICOT-Reference/src/AB13ID.f:787:39:

  514 |             CALL MB03OD( 'QR Decomposition', N, N, E, LDE, IWORK, TOL,
      |                                                                  2
......
  787 |      $                   IWORK( IWS ), TOLDEF, SVLMAX, DWORK( ITAU ),
      |                                       1
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (rank-1 and scalar)
/home/ben/src/Slycot/slycot/src/SLICOT-Reference/src/AB13ID.f:799:25:

  514 |             CALL MB03OD( 'QR Decomposition', N, N, E, LDE, IWORK, TOL,
      |                                                                  2
......
  799 |      $                   TOLDEF, SVLMAX, DWORK, RANKA, DWORK( ISV ),
      |                         1
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (rank-1 and scalar)

This PR fixes the issue.

andreasvarga commented 3 years ago

A simpler fix would be to call with TOLDEF instead with RCOND, because the call in line 514 is merely for computing the necessary workspace, so the value of TOL plays no role in this call.

bnavigator commented 3 years ago

Yes I tried that. Somehow the compiler then complains with a mismatch between REAL(8) and REAL(4) for later use of TOLDEF. So I opted to create a new variable RCOND.

andreasvarga commented 3 years ago

All variables are DOUBLE PRECISION, so I cannot understand what is the issue. In the call in line 514 TOL should be simply a dummy double precion variable to be transfered by reference and not changed during the call.

bnavigator commented 3 years ago

That's what I would think, too. Yet here is the compiler error:

cmake /home/ben/src/Slycot -G 'Unix Makefiles' -DCMAKE_INSTALL_PREFIX:PATH=/home/ben/src/Slycot/_skbuild/linux-x86_64-3.8/cmake-install -DPYTHON_EXECUTABLE:FILEPATH=/usr/bin/python3 -DPYTHON_VERSION_STRING:STRING=3.8.7 -DPYTHON_INCLUDE_DIR:PATH=/usr/include/python3.8 -DPYTHON_LIBRARY:FILEPATH=/usr/lib64/libpython3.8.so -DSKBUILD:BOOL=TRUE -DCMAKE_MODULE_PATH:PATH=/usr/lib/python3.8/site-packages/skbuild/resources/cmake -DSLYCOT_VERSION:STRING=0.4.0.38 -DGIT_REVISION:STRING=e6a94ccc3572ccbe19dae6ed70c7b349f9add6d8 -DISRELEASE:STRING=False -DFULL_VERSION=0.4.0.38.gite6a94cc -DCMAKE_BUILD_TYPE:STRING=Release
...

[  0%] Building Fortran object slycot/CMakeFiles/_wrapper.dir/src/SLICOT-Reference/src/AB13ID.f.o
/home/ben/src/Slycot/slycot/src/SLICOT-Reference/src/AB13ID.f:640:19:

  514 |             CALL MB03OD( 'QR Decomposition', N, N, E, LDE, IWORK, TOLDEF,
      |                                                                  2
......
  640 |      $             TOLDEF, SVLMAX, DWORK, RANKE, DWORK( NR+1 ),
      |                   1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/REAL(4)).
/home/ben/src/Slycot/slycot/src/SLICOT-Reference/src/AB13ID.f:787:39:

  514 |             CALL MB03OD( 'QR Decomposition', N, N, E, LDE, IWORK, TOLDEF,
      |                                                                  2
......
  787 |      $                   IWORK( IWS ), TOLDEF, SVLMAX, DWORK( ITAU ),
      |                                       1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/REAL(4)).
/home/ben/src/Slycot/slycot/src/SLICOT-Reference/src/AB13ID.f:799:25:

  514 |             CALL MB03OD( 'QR Decomposition', N, N, E, LDE, IWORK, TOLDEF,
      |                                                                  2
......
  799 |      $                   TOLDEF, SVLMAX, DWORK, RANKA, DWORK( ISV ),
      |                         1
Error: Type mismatch between actual argument at (1) and actual argument at (2) (REAL(8)/REAL(4)).
gmake[2]: *** [slycot/CMakeFiles/_wrapper.dir/build.make:809: slycot/CMakeFiles/_wrapper.dir/src/SLICOT-Reference/src/AB13ID.f.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:164: slycot/CMakeFiles/_wrapper.dir/all] Error 2
gmake: *** [Makefile:149: all] Error 2
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/skbuild/setuptools_wrap.py", line 589, in setup
    cmkr.make(make_args, env=env)
  File "/usr/lib/python3.8/site-packages/skbuild/cmaker.py", line 496, in make
    raise SKBuildError(

An error occurred while building with CMake.
  Command:
    cmake --build . --target install --config Release --
  Source directory:
    /home/ben/src/Slycot
  Working directory:
    /home/ben/src/Slycot/_skbuild/linux-x86_64-3.8/cmake-build
Please see CMake's output for more information.
[ben@skylab:~/src/Slycot]% gfortran --version                                                                                                     [1]
GNU Fortran (SUSE Linux) 10.2.1 20201202 [revision e563687cf9d3d1278f45aaebd03e0f66531076c9]
ilayn commented 3 years ago

The new gcc probably expects -fallow-argument-mismatch flag to ignore and convert them to warnings and otherwise not forgiving for F77 compilations. More about a similar project and slightly not-nice discussion around it https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91731

And the parent discussion https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91556

ilayn commented 3 years ago

Just looked around a bit more and it seems like almost every major project with Fortran parts hit by this. From the gcc10 release notes

  • use_device_addr of version 5.0 of the OpenMP specification is now supported. Note that otherwise OpenMP 4.5 is partially supported in the Fortran compiler; the largest missing item is structure element mapping.
  • The default buffer size for I/O using unformatted files has been increased to 1048576. The buffer size for can now be set at runtime via the environment variables GFORTRAN_FORMATTED_BUFFER_SIZE and GFORTRAN_UNFORMATTED_BUFFER_SIZE for formatted and unformatted files, respectively.
  • Mismatches between actual and dummy argument lists in a single file are now rejected with an error. Use the new option -fallow-argument-mismatch to turn these errors into warnings; this option is implied with -std=legacy. -Wargument-mismatch has been removed.
  • The handling of a BOZ literal constant has been reworked to provide better conformance to the Fortran 2008 and 2018 standards. In these Fortran standards, a BOZ literal constant is a typeless and kindless entity. As a part of the rework, documented and undocumented extensions to the Fortran standard now emit errors during compilation. Some of these extensions are permitted with the -fallow-invalid-boz option, which degrades the error to a warning and the code is compiled as with older gfortran.
  • At any optimization level except-Os, gfortran now uses inline packing for arguments instead of calling a library routine. If the source contains a large number of arguments that need to be repacked, code size or time for compilation can become excessive. If that is the case, -fno-inline-arg-packing can be used to disable inline argument packing.
  • Legacy extensions:
    • For formatted input/output, if the explicit widths after the data-edit descriptors I, F and G have been omitted, default widths are used.
    • A blank format item at the end of a format specification, i.e. nothing following the final comma, is allowed. Use the option -fdec-blank-format-item; this option is implied with -fdec.
    • The existing support for AUTOMATIC and STATIC attributes has been extended to allow variables with the AUTOMATIC attribute to be used in EQUIVALENCE statements. Use -fdec-static; this option is implied by -fdec.
    • Allow character literals in assignments and DATA statements for numeric (INTEGER, REAL, or COMPLEX) or LOGICAL variables. Use the option -fdec-char-conversions; this option is implied with -fdec.
    • DEC comparisons, i.e. allow Hollerith constants to be used in comparisons with INTEGER, REAL, COMPLEX and CHARACTER expressions. Use the option -fdec.
  • Character type names in errors and warnings now include len in addition to kind; * is used for assumed length. The kind is omitted if it is the default kind. Examples: CHARACTER(12), CHARACTER(6,4).
  • CO_BROADCAST now supports derived type variables including objects with allocatable components. In this case, the optional arguments STAT= and ERRMSG= are currently ignored.
  • The handling of module and submodule names has been reworked to allow the full 63-character length mandated by the standard. Previously symbol names were truncated if the combined length of module, submodule, and function name exceeded 126 characters. This change therefore breaks the ABI, but only for cases where this 126 character limit was exceeded.

As can be seen from the third item, indeed the only options we have are either adding the flag or fixing all mismatched types

andreasvarga commented 3 years ago

Are you sure the mismatch is due to TOLDEF, because this seams to be a nonsese: in all calls the same argument is passed.

bnavigator commented 3 years ago

Are you sure the mismatch is due to TOLDEF

If it is not, then the error message from gfortran is wrong.

There is an assignment to TOLDEF in between. But even when I move the TOLDEF = TOL( 1 ) above line 514, the compiler error appears (essentially RCOND in this PR replaced with existing TOLDEF)

This is the only compiler error of this kind in SLICOT. All other routines compile fine. (There is https://github.com/python-control/Slycot/issues/111, and newer LAPACK implementations have removed DLATZM, ZLATZM, DGEGS. See the branch slicot-changes.)

andreasvarga commented 3 years ago

Vasile Sima corrected the basic issue regarding the elimination of the call with an array instead of a variable. We decided for this solution, to remain independent of any particular compiler. I hope you will manage to compile this routine using suitable options. Many thanks for sharing this aspect with us.

I am closing this PR, because the correction has been already performed.

bnavigator commented 3 years ago

Indeed, I can now build the SLICOT part of Slycot with the fix in 7b96b64.

I also found out, why my attempt to replace TOL with TOLDEF in https://github.com/SLICOT/SLICOT-Reference/pull/1#issuecomment-774787932 did not succeed: I did essentially the same as in 7b96b64 except that the changed line exceeds the FORTRAN77 line length limit. That seems to trigger the gfortran error message. I have to admit, in a very obsure way. I guess I am too young for this. :wink:

Thank you for correcting it! And many thanks for providing this library to the community!

andreasvarga commented 3 years ago

It was a refreshing experience for me. Last time I wrote Fortran programs was about 10 years ago (some mex interfaces to MATLAB based on SLICOT).

I was curious to learn a little bit about your SLYCOT project. Very interesting indeed. Especially, I like the CI feature on so many machines. This led me to an idea, which I would like to share with you.

We decided a few months ago to make SLICOT free (I pushed very strongly this decison) and finally we agreed upon to make it free and use Github as platform for the SLICOT repository.We uploaded the main sources for routines, test programs and data, so, in principle, the library can be used. However, just to perform the provided minimalistic tests,there is a long way and requires to bind LAPACK and BLAS. As I was able to understand, you automated this process for the purpose of integrating SLICOT in SLYCOT (or I am wrong?). We would like to benefit of your expertise in this area, at least by advising us how to provide more support for potential users of SLICOT. I list here two aspects which could be of potential interest:

  1. It would be great if we could provide a testing facility for SLICOT for the purpose of CI. Such facility exists also for LAPACK, but we lack experience in this area.

  2. I see one of the main usages for SLICOT as a computational kernel for control related computations in conjunction with environments like Matlab, Python, Julia, Octave. Bulding interfaces to these tools (wrappers, mex-files, etc.) involves (probably) the generation of appropriate shared libraries for different architectures and compilers. It would be usefull for many potential users to have this process largely automated.

Some information of possible interest to you and your colleagues.

SLICOT is accompanied also by a collection of mex-functions to interface with Matlab. Many served for testing purposes and some tools have unique functionality, which is not present in any other control tools. For example, we implemented for my recent book on fault detection, a collection of mex-files to serve for the implementation of some advanced control related computations based on descriptor systems. We are thinking to make these interfaces freely available too (at least those implemented by me will certainly be). And I intend to make some wrappers to Julia too. The descriptor systems tools for Matlab are freely available (presently on Bitbucket) and served for several graduate courses on model based fault diagnosis (the next 21 hours online course starts on March 8, 2021; I know this is an advertising, but possibly of interest since the main application domain is fault diagnosis in flight control).

We would appreciate if you could advise us in addressing the above two aspects. Any other help would be also very welcome.

We look forward to hear your opinion on these issues.

Andreas and Vasile

Am Mo., 8. Feb. 2021 um 21:18 Uhr schrieb Ben Greiner < notifications@github.com>:

Indeed, I can now build the SLICOT part of Slycot with the fix in 7b96b64 https://github.com/SLICOT/SLICOT-Reference/commit/7b96b6470ee0eaf75519a612d15d5e3e2857407d .

I also found out, why my attempt to replace TOL with TOLDEF in #1 (comment) https://github.com/SLICOT/SLICOT-Reference/pull/1#issuecomment-774787932 did not succeed: I did essentially the same as in 7b96b64 https://github.com/SLICOT/SLICOT-Reference/commit/7b96b6470ee0eaf75519a612d15d5e3e2857407d except that the changed line exceeds the FORTRAN77 line length limit. That seems to trigger the gfortran error message. I have to admit, in a very obsure way. I guess I am too young for this. 😉

Thank you for correcting it! And many thanks for providing this library to the community!

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/SLICOT/SLICOT-Reference/pull/1#issuecomment-775418363, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALJDHECIHCD2JUBX72C2NPDS6BBHRANCNFSM4XH3ZYKQ .

ilayn commented 3 years ago

Well I, for one, am very happy to see you moved to GitHub so welcome on board and also I am grateful that you changed to a more permissive license. I have been waiting for this for a long time and over the years we had many discussions around SLICOT here and there. I think @pmli would remember who actually broke the news for me. Not to mention that I've been reading your articles and other materials for a very long time too.

I am not involved with the Slycot project mostly because it was GPL-licensed but interested in control libraries in Python in general. I have witnessed the same evolution with LAPACK devs too who were also a bit outside the GitHub workflows. But now an active community surrounds the codebase. I am positive that the same will happen as it becomes more visible.

I and surely many others would be willing to help setting up basic machinery if you could guide us with how the tests are structured or even if they exist in the first place.

It would be great if we could provide a testing facility for SLICOT for the purpose of CI. Such facility exists also for LAPACK, but we lack experience in this area.

Just to give a very basic introduction to the CI tools. Currently a few major vendors provide free CI/CD tools to open source codebases. GitHub itself, Azure Pipelines, CircleCI and AppVeyor are the most common ones. TravisCI was the most prominent player but lately they removed their free support so in case you saw their name that's why.

The main workflow is the Issue reports and Pull Requests and this is the annoying part that any change to be made should better be first committed as a pull request. The reason for this is that CI tools and GitHub are integrated each other hence when a new pull request is opened, CI tools receives a webhook signal and starts incorporating the changes to the build and run all the existing tests. This is very helpful to catch the initial tirivial problems should there be any and to see any existing behavior is broken. Once every test passes the CI check then codebase owners, in this case you both, can decide whether a piece of code should go in or not and click the merge button.

Obviously, the more tests there are the better it is. Hence all kinds of trivial input tests wrong strings, wrong options, wrong calls etc. can be tested and get out of the way.

Another advantage of these CI tools is the possibility of running builds against different machine architectures Linux, Win, OSX are very typical choices and some other esoteric options such as ARM and Raspbian etc. are also available.

  1. I see one of the main usages for SLICOT as a computational kernel for control related computations in conjunction with environments like Matlab, Python, Julia, Octave. Bulding interfaces to these tools (wrappers, mex-files, etc.) involves (probably) the generation of appropriate shared libraries for different architectures and compilers. It would be useful for many potential users to have this process largely automated.

Yes indeed one of the nice byproducts of these CI/CD tools is that you can save the so-called artifacts, the resulting files after the build and use them as releasing a new version. This is a bit trickier so I'll skip the details for now.

bnavigator commented 3 years ago

This is great. I opened a new Github issue [#2] for further discussion, so that it is more visible. Closed PRs are not easily found behind the "closed" link on the PR page.

andreasvarga commented 3 years ago

I thank you very much for these ample explanations. One aspect you mentioned is the constraint to use CI/CD triggered by commit and PR. I am using myself CI in my Julia projects based on Github actions, where the triggering of tests occurs at every push action, avoiding thus PRs. Could you imagine this to be feasible also for maintaining/developing the SLICOT repository?

ilayn commented 3 years ago

That is in fact the first immediate thing we all thought about. However when there are conflicting changes in different git branches it becomes very difficult to arrange what goes in first or how the commit history would look like in case you would be searching for a change that broke something 6 months ago. Not to mention the typical "Oh I forgot to include such and such" commits too.

It does feel like extra work and I think we all agree but the pitfalls it prevents justify the use of PRs greatly in my opinion. However this is just a suggestion and not really a rule. So if you feel like SLICOT doesn't need then so be it. After all, you would be the deciding body.

andreasvarga commented 3 years ago

Thanks for your comments. I would like to express some personal opinions regarding the future of SLICOT. We expect no significant development activities before us. However, we will try to correct all errors. Therefore, we would prefer using "issues" instead PRs, and all modifications of source code will be only performed by a single person, i.e. by Vasile Sima. Of course, we will wellcome any PR, if this will save maintainance efforts, but we are convinced that the other way is also viable.

Ilhan Polat notifications@github.com schrieb am Mi., 10. Feb. 2021, 19:56:

That is in fact the first immediate thing we all thought about. However when there are conflicting changes in different git branches it becomes very difficult to arrange what goes in first or how the commit history would look like in case you would be searching for a change that broke something 6 months ago. Not to mention the typical "Oh I forgot to include such and such" commits too.

It does feel like extra work and I think we all agree but the pitfalls it prevents justify the use of PRs greatly in my opinion. However this is just a suggestion and not really a rule. So if you feel like SLICOT doesn't need then so be it. After all, you would be the deciding body.

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/SLICOT/SLICOT-Reference/pull/1#issuecomment-776934972, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALJDHEFOYYEAETVV3V3AUOTS6LJGHANCNFSM4XH3ZYKQ .

ilayn commented 3 years ago

Understood. Thank you for the clarification