swihart / repeated

Non-normal repeated measurements models
GNU General Public License v2.0
0 stars 0 forks source link

Let my people ... LTO #15

Closed swihart closed 5 years ago

swihart commented 5 years ago

Prof Ripley emailed me 2019-06-01 saying fix the LTO warnings. He linked to a readme.txt:

Compilation logs for CRAN packages using x86_64 Fedora 30 Linux built with configure --enable-lto and config.site:

CFLAGS="-g -O2 -Wall -pedantic -mtune=native" FFLAGS="-g -O2 -mtune=native -Wall -pedantic" CXXFLAGS="-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-parentheses" JAVA_HOME=/usr/lib/jvm/jre-11 AR=gcc-ar RANLIB=gcc-ranlib

Look for [-Wlto-type-mismatch] warnings. In some cases these involve Fortran CHARACTER arguments where the length is passed as a 'hidden' argument at the end, giving mismatches such as

sblas.f:3951:14: note: type ‘long int’ should match type ‘void’

To work around these, define USE_FC_LEN_T and include Rconfig.h (perhaps via R.h) before including BLAS.h or Lapack.h or your own C proptypes for Fortran functions. Then amend the actual calls to include character length arguments: see the example of src/library/stats/src/rWishart.c in the R sources.

So I ctrl+f [-Wlto-type-mismatch] and find the following 2 and only 2 instances:

gausscop.f:157:8: warning: type of 'rs' does not match original declaration [-Wlto-type-mismatch] 157 | call rs(nld,nobs,v,tmp1,0,0,tmp2,tmp3,ierr) | ^ eigen.f:3313:17: note: 'rs' was previously declared here 3313 | SUBROUTINE RS(NM,N,A,W,MATZ,Z,FV1,FV2,IERR) | ^ eigen.f:3313:17: note: code may be misoptimized unless '-fno-strict-aliasing' is used

hidden.f:378:8: warning: type of 'dqrdc2' does not match original declaration [-Wlto-type-mismatch] 378 | call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work) | ^ chidden.f:461:12: note: 'dqrdc2' was previously declared here 461 | call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work) | ^ chidden.f:461:12: note: code may be misoptimized unless '-fno-strict-aliasing' is used

swihart commented 5 years ago

So after a little dinkin', I think I've concluded that my case is not the passing a character case and the hidden length issue that Prof Ripley put in the README.txt. Instead, my issue is that I'm mismatching REAL with INTEGER. I have a hunch after looking at the code that one of the 0's in the gausscop.f call needs to be 0.0 so that it is a real and not an integer. But first, I need to be able to reproduce the LTO warnings. I just sent the package to win_builder and will see if that will suffice. If not, I'll have to play around with .travis.yml here on github to set the environment to flag these LTO mofos.

swihart commented 5 years ago

Keep in mind, RTFM: suggests putting all .f files into all.f and compiling and then you get more specific error messages:

https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Using-Link_002dtime-Optimization

swihart commented 5 years ago

It's been 43 minutes since submitting to win builder. Let's "tickle travis"... ok did that and was not able to reproduce LTOs with my current, vanilla, .travis.yml. Will need to investigate that further... hopefully win builder just does it for me.

But you know what -- it probably won't. So look at one of the other packages travis and tickle again and see if you can get gcc-9 invoked.

All right, going to borrow this fella from event and change all those 6s to 9s.

# Sample .travis.yml for R projects

language: r
warnings_are_errors: true
sudo: required

matrix:
  include:
    - os: linux
      compiler: gcc
      r: release
      r_check_args: --as-cran --use-valgrind
      addons:
        apt:
          sources: ['ubuntu-toolchain-r-test']
          packages:
          - valgrind
          - ['g++-6']
      env:
        - COMPILER=g++-6
        - CC=gcc=6
        - CXX=g++-6

env:
 global:
   - CRAN: http://cran.rstudio.com

notifications:
  email:
    on_success: change
    on_failure: change

that didn't do it. So try again adding the flags that Prof Ripley put in the README.txt:

# Sample .travis.yml for R projects

language: r
warnings_are_errors: true
sudo: required

matrix:
  include:
    - os: linux
      compiler: gcc
      r: release
      r_check_args: --as-cran --use-valgrind
      addons:
        apt:
          sources: ['ubuntu-toolchain-r-test']
          packages:
          - valgrind
          - ['g++-9']
      env:
        - COMPILER=g++-9
        - CC=gcc=9
        - CXX=g++-9
        - CFLAGS="-g -O2 -Wall -pedantic -mtune=native"
        - FFLAGS="-g -O2 -mtune=native -Wall -pedantic"
        - CXXFLAGS="-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-parentheses"
        - JAVA_HOME=/usr/lib/jvm/jre-11
        - AR=gcc-ar
        - RANLIB=gcc-ranlib

env:
 global:
   - CRAN: http://cran.rstudio.com

notifications:
  email:
    on_success: change
    on_failure: change

... and that didn't work either. I can't reproduce the output then how do I fix it...?!?.

Here's crazy idea ... do a local CMD+SHFT+E and see what happens.

swihart commented 5 years ago

Here's crazy idea ... do a local CMD+SHFT+E and see what happens. -- oops, says I need a higher version of Roxygen ... but this makes me remembrer that 6.1.1 had problems for one of my other packages. Ugh. In the mean time winbuilder came in and it did not reproduce the LTO warning. Double ugh.

swihart commented 5 years ago

okay. update roxygen locally.

swihart commented 5 years ago

And then do the cmd+shft+E locally. Remember to set up options so the .Rcheck folder remains and then open file:///Users/swihartbj/github/repeated.Rcheck/00install.out in chrome.

https://stackoverflow.com/a/28308090/2727349

swihart commented 5 years ago

Looks like I will need to find a way to incoprorate those flags in teh devtools::check call

https://stackoverflow.com/a/50513702/2727349

swihart commented 5 years ago

HOld up, friend:

https://r-hub.github.io/rhub/articles/rhub.html

swihart commented 5 years ago

Agha. Only has gcc < 9.

Then the lightbulb goes off: go to the essence. Note in Ripley's out put the line:

/usr/local/gcc9/bin/gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c binnest.f -o binnest.o

Well, it looks like that on biowulf, I can "module load" gcc9.1:

module load gcc/9.1.0

and then run:

gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c binnest.f -o binnest.o

(caveat: I tried it on the event/src at 11PM and things ran. Tomorrow I'll copy/clone repeated to the biowulf cluster. then I will change to repeated/src. Then I will module load gcc9.1 and then run the gfortran lines as in Ripleys output and see if I can reproduce the LTO errors.) . Good night for now (GNFN).

swihart commented 5 years ago

Good morning!

  1. git clone repeated into the appropriate place on biowulf.
  2. cd repeated/src
  3. module load gcc/9.1.0
  4. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c binnest.f -o binnest.o
  5. (skip the gcc c code compilations)
  6. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c chidden.f -o chidden.o
  7. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c cphidden.f -o cphidden.o
  8. skip next two gcc calls
  9. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c eigen.f -o eigen.o
    1. skip 2 more gcc
    2. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c gausscop.f -o gausscop.o
    3. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c hidden.f -o hidden.o
    4. skip 2 more gcc
    5. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c logitord.f -o logitord.o
    6. skip two more gcc
    7. realize that the actual calll that generates the LTO is when we put all the .o files together... maybe I can modify and just put together the .o correpsonding to the gfortran commands I ran since I skipped gcc ...
    8. gcc -shared -g -O2 -Wall -pedantic -mtune=native -flto -fpic -L/usr/local/gcc9/lib64 -L/usr/local/lib64 -o repeated.so binnest.o chidden.o cphidden.o eigen.o gausscop.o hidden.o logitord.o -lgfortran -lm -lquadmath

AND OH MY GOD IT WORKED:

gausscop.f:157: warning: type of ‘rs’ does not match original declaration [-Wlto-type-mismatch]
  157 |          call rs(nld,nobs,v,tmp1,0,0,tmp2,tmp3,ierr)
      | 
eigen.f:3313: note: ‘rs’ was previously declared here
 3313 |       SUBROUTINE RS(NM,N,A,W,MATZ,Z,FV1,FV2,IERR)
      | 
eigen.f:3313: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
hidden.f:378: warning: type of ‘dqrdc2’ does not match original declaration [-Wlto-type-mismatch]
  378 |       call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work)
      | 
chidden.f:461: note: ‘dqrdc2’ was previously declared here
  461 |       call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work)
      | 
chidden.f:461: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
swihart commented 5 years ago

Houston, Texans, we have successfully reproduced the error.

On biowulf. with gcc9.

swihart commented 5 years ago

Okay.

So let's try to eliminate the first [-Wlto-type-mismatch] which is happening in call rs().

Looking closely at the corresponding lines/files in gausscop.f and eigen.f, it appears that Z (the sixth argument in eigen.f::RS() is declared as real BUT in the gausscop.f::rs() the sixth argument is 0. My plan is to change it to 0.0 and then rerun step 11 and step 17 from above.

Argh! That didn't resolve the issue. I even tried repeating step 9 and then step 11 and step 17 and ended up with same result.

Next plan:

Okay that just gave the same exact error. Let's make an all.f. Easy from the prompt:

cat binnest.f chidden.f cphidden.f eigen.f gausscop.f hidden.f logitord.f > all.f

And now do a simple compile on them.

https://stackoverflow.com/questions/2297536/how-do-i-capture-all-of-my-compilers-output-to-a-file

gfortran -fPIC -Wall -g -O2 -c all.f -o all.o &> all_output.txt

then open up in emacs

emacs -nw all_output.txt

and ctrl+s for RS( will yield nothing. But ctrl+s for rs( yields:

 6790 |          call rs(nld,nobs,v,tmp1,0,0.0,tmp2,tmp3,ierr)                                                      
      |                                                                        1                                    
Warning: Type mismatch in argument ‘z’ at (1); passed REAL(4) to REAL(8) [-Wargument-mismatch]   

So as it turns out my intuition was correct about Z needing to be 0.0. But this just made is a single precision real number. To make a double precision I need to type:

0.0D0

(source: http://www.math.hawaii.edu/~hile/fortran/fort3.htm#real

A double precision real number in GNU Fortran can be represented by up to 17 digits before truncation occurs. Double precision numbers are written in scientific notation but with D usurping the role of E. Some various ways of writing the number 12.345 as a double precision real number are 1.2345D1 , .12345D2 , .012345D3 , 12.345D0 , 12345D-3 .)

And then repeat step 11 and step 17 (with -flto). Step 17 shows I've elimined the [-Wlto-type-mismatch warning] for gausscop!

Yes!

Now, inside all_output.txt do a ctrl+S for dqrdc2(

And it is not listed. Yikes. Is it possible they are each subroutines defined (not called!) in hidden.f and chidden.f... no, no.

It is weird because it is called twice but not declared anywhere. When I run

grep -rnw . -e 'dqrdc2'

I only get the two calls. I tried uppercase as well.

Hey, didn't you remove a file a long time ago...no, that's an r file: cmcre.R.

Is it possible it is defined somewhere else, an artifact from when you originally inherited maintainer position and the Lindsey suite of packages were interconnected (?)

No!

I did search across all of GITHUB:

https://github.com/search?utf8=%E2%9C%93&q=%22SUBROUTINE+DQRDC2%28%22&type=Code

And came up with that dqrdc2 is from LINPACK. So let's look at one of them and see

https://github.com/DJJ88/R-3.3.1/blob/fb6aad195256b4ca672d15cc9401cb60a1f3a62d/src/appl/dqrdc2.f

All the types seem to match up but note that work is listed as

c work double precision(p,2). c work is a work array. c

Whereas in hidden.f says

double precision work(2*m)

maybe change it to work(2*m, 2)

swihart commented 5 years ago

So I edited hidden.f and chidden.f adding ,2 for work and work2 in the declarations and repeated the following steps:

   6. gfortran -fno-optimize-sibling-calls  -fpic  -g -O2 -mtune=native -Wall -pedantic -flto -c chidden.f -o chidden.o
 12.  gfortran -fno-optimize-sibling-calls  -fpic  -g -O2 -mtune=native -Wall -pedantic -flto -c hidden.f -o hidden.o
17.  gcc -shared -g -O2 -Wall -pedantic -mtune=native -flto -fpic -L/usr/local/gcc9/lib64 -L/usr/local/lib64 -o repeated.so binnest.o  chidden.o cphidden.o eigen.o gausscop.o hidden.o logitord.o -lgfortran -lm -lquadmath

Step 6 lead to errors. Need to put ,2 thoughout the file...? Answer: Yes. That got rid of errors.

But step 17 still showed the same -Wlto-mismatch. Darn.

What if...

What if I take LINPACK version and put into all.f and then recompiled...?

BOOMSHAKALAKAKAKAKAKAKAKAKAKALALALLAKAKALAKAKLAKLA BALKY!

YOU DID IT, SON! ICE UP, SON. ICE UP.

all.f:2961:72:                                                                                                      

 2961 |       call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work)                                                   
      |                                                                        1                                    
Warning: Type mismatch in argument ‘k’ at (1); passed REAL(8) to INTEGER(4) [-Wargument-mismatch]       

Okay so that's the 6th argument where k=rank. It is saying that rank is being seen as a real(8) and being passed as an integer(4). okay, okay. rank is integer in hidden.f but was double precision / real(8) in chidden.f

Mystery solved. Change that to integer. Repeat:

6. gfortran -fno-optimize-sibling-calls  -fpic  -g -O2 -mtune=native -Wall -pedantic -flto -c chidden.f -o chidden.o
12.  gfortran -fno-optimize-sibling-calls  -fpic  -g -O2 -mtune=native -Wall -pedantic -flto -c hidden.f -o hidden.o
17.  gcc -shared -g -O2 -Wall -pedantic -mtune=native -flto -fpic -L/usr/local/gcc9/lib64 -L/usr/local/lib64 -o repeated.so binnest.o  chidden.o cphidden.o eigen.o gausscop.o hidden.o logitord.o -lgfortran -lm -lquadmath

AND SUCCESS! No more LTOs!

but wait! was the ,2 even necessary???? Okay, undo those and then redo the immediately 3 above steps.

ANSWER: The ,2 weren't necessary. So leave them undo (original state) and do a git diff and git status to be displayed in next comment below --

swihart commented 5 years ago

SUMMARY:

  1. In gausscop.f, line 157: make 0 -> 0.0D0 for 6th argument in rs()
  2. In chidden.f, line 436: comment out real precision rank and make rank integer in line above (435)

BAM. Done. After these two edits, do the 17 steps to double check -Wlto-mismatches.

And we did it. Push to github from biowulf and then send on to CRAN from mac.

Good job, Team.

swihart commented 5 years ago

Another LTO -- the HIDDEN LTO. So CRAN submission went well but another LTO popped up and they notified me of it:

chidden.f:462:12: warning: type of ‘dqrcf’ does not match original 
declaration [-Wlto-type-mismatch]
   462 |       call dqrcf(gmod,m,rank,qraux,work3,m,invec,info,1)
       |            ^
/data/gannet/ripley/R/svn/R-devel/src/appl/dqrutl.f:29:7: note: type 
mismatch in parameter 9
    29 |       subroutine dqrcf(x, n, k, qraux, y, ny, b, info)
       |       ^
/data/gannet/ripley/R/svn/R-devel/src/appl/dqrutl.f:29:7: note: ‘dqrcf’ 
was previously declared here

What's interesting to me is that 1) This dqrcf LTO didn't show up in the original check that got repeated flag before JULY 1st; and 2) I can't reproduce it on my machine. So I feel like I'm flying blind, but will do my best to patch it up.

What's very interesting is that the call has 9 parameters (the 8th one being info and the 9th one being 1) while the declared subroutine has only 8 parameters (the 8th one being info). Yikes. What to do? Was info suppose to be set to 1? Should I delete the 9th parameter in the call and make the 8th and final parameter info= 1 ?

To answer this question, after looking at some dqrutl.f files on the internet, it seems I need to look at dqrsl.f to see how integer is being used:

info   integer.
c               info is zero unless the computation of b has
c               been requested and r is exactly singular.  in
c               this case, info is the index of the first zero
c               diagonal element of r and b is left unaltered.

Ok. Good to know. I need more info. Let's look at chidden.f call in repeated/src (here). My take is that info should stay info. I think the extra 1 in the 9th parameter was probably just ignored in the past. So Let's do that.

In summary:

swihart commented 5 years ago

So the warning is a type mismatch in parameter 9, which means that the passed 1 in the call is not matching up with how info is declared in the subroutine.

So I found a dqrutl.f (as of 2019-07-12) here: https://github.com/microsoft/microsoft-r-open/blob/master/source/src/appl/dqrutl.f

and below in case it changes:

c dqr Utilities:  Interface to the different "switches" of  dqrsl().
c
      subroutine dqrqty(x, n, k, qraux, y, ny, qty)

      integer n, k, ny
      double precision x(n,k), qraux(k), y(n,ny), qty(n,ny)
      integer info, j
      double precision dummy(1)
      do 10 j = 1,ny
          call dqrsl(x, n, n, k, qraux, y(1,j), dummy, qty(1,j),
     &               dummy, dummy, dummy, 1000, info)
   10 continue
      return
      end
c
      subroutine dqrqy(x, n, k, qraux, y, ny, qy)

      integer n, k, ny
      double precision x(n,k), qraux(k), y(n,ny), qy(n,ny)
      integer info, j
      double precision dummy(1)
      do 10 j = 1,ny
          call dqrsl(x, n, n, k, qraux, y(1,j), qy(1,j),
     &               dummy,  dummy, dummy, dummy, 10000, info)
   10 continue
      return
      end
c
      subroutine dqrcf(x, n, k, qraux, y, ny, b, info)

      integer n, k, ny, info
      double precision x(n,k), qraux(k), y(n,ny), b(k,ny)
      integer j
      double precision dummy(1)
      do 10 j = 1,ny
          call dqrsl(x, n, n, k, qraux, y(1,j), dummy,
     &               y(1,j), b(1,j), dummy, dummy, 100, info)
   10 continue
      return
      end
c
      subroutine dqrrsd(x, n, k, qraux, y, ny, rsd)

      integer n, k, ny
      double precision x(n,k), qraux(k), y(n,ny), rsd(n,ny)
      integer info, j
      double precision dummy(1)
      do 10 j = 1,ny
          call dqrsl(x, n, n, k, qraux, y(1,j), dummy,
     &               y(1,j), dummy, rsd(1,j), dummy, 10, info)
   10 continue
      return
      end
c
      subroutine dqrxb(x, n, k, qraux, y, ny, xb)

      integer n, k, ny
      double precision x(n,k), qraux(k), y(n,ny), xb(n,ny)
      integer info, j
      double precision dummy(1)
      do 10 j = 1,ny
          call dqrsl(x, n, n, k, qraux, y(1,j), dummy,
     &               y(1,j), dummy, dummy, xb(1,j), 1, info)
   10 continue
      return
      end