Closed swihart closed 5 years ago
So after a little dinkin', I think I've concluded that my case is not the passing a character case and the hidden length issue that Prof Ripley put in the README.txt. Instead, my issue is that I'm mismatching REAL
with INTEGER
. I have a hunch after looking at the code that one of the 0's in the gausscop.f call needs to be 0.0 so that it is a real and not an integer. But first, I need to be able to reproduce the LTO warnings. I just sent the package to win_builder and will see if that will suffice. If not, I'll have to play around with .travis.yml here on github to set the environment to flag these LTO mofos.
Keep in mind, RTFM: suggests putting all .f files into all.f and compiling and then you get more specific error messages:
https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Using-Link_002dtime-Optimization
It's been 43 minutes since submitting to win builder. Let's "tickle travis"... ok did that and was not able to reproduce LTOs with my current, vanilla, .travis.yml. Will need to investigate that further... hopefully win builder just does it for me.
But you know what -- it probably won't. So look at one of the other packages travis and tickle again and see if you can get gcc-9 invoked.
All right, going to borrow this fella from event
and change all those 6s to 9s.
# Sample .travis.yml for R projects
language: r
warnings_are_errors: true
sudo: required
matrix:
include:
- os: linux
compiler: gcc
r: release
r_check_args: --as-cran --use-valgrind
addons:
apt:
sources: ['ubuntu-toolchain-r-test']
packages:
- valgrind
- ['g++-6']
env:
- COMPILER=g++-6
- CC=gcc=6
- CXX=g++-6
env:
global:
- CRAN: http://cran.rstudio.com
notifications:
email:
on_success: change
on_failure: change
that didn't do it. So try again adding the flags that Prof Ripley put in the README.txt:
# Sample .travis.yml for R projects
language: r
warnings_are_errors: true
sudo: required
matrix:
include:
- os: linux
compiler: gcc
r: release
r_check_args: --as-cran --use-valgrind
addons:
apt:
sources: ['ubuntu-toolchain-r-test']
packages:
- valgrind
- ['g++-9']
env:
- COMPILER=g++-9
- CC=gcc=9
- CXX=g++-9
- CFLAGS="-g -O2 -Wall -pedantic -mtune=native"
- FFLAGS="-g -O2 -mtune=native -Wall -pedantic"
- CXXFLAGS="-g -O2 -Wall -pedantic -mtune=native -Wno-ignored-attributes -Wno-deprecated-declarations -Wno-parentheses"
- JAVA_HOME=/usr/lib/jvm/jre-11
- AR=gcc-ar
- RANLIB=gcc-ranlib
env:
global:
- CRAN: http://cran.rstudio.com
notifications:
email:
on_success: change
on_failure: change
... and that didn't work either. I can't reproduce the output then how do I fix it...?!?.
Here's crazy idea ... do a local CMD+SHFT+E and see what happens.
Here's crazy idea ... do a local CMD+SHFT+E and see what happens. -- oops, says I need a higher version of Roxygen ... but this makes me remembrer that 6.1.1 had problems for one of my other packages. Ugh. In the mean time winbuilder came in and it did not reproduce the LTO warning. Double ugh.
okay. update roxygen locally.
And then do the cmd+shft+E locally. Remember to set up options so the .Rcheck folder remains and then open file:///Users/swihartbj/github/repeated.Rcheck/00install.out
in chrome.
Looks like I will need to find a way to incoprorate those flags in teh devtools::check call
HOld up, friend:
Agha. Only has gcc < 9.
Then the lightbulb goes off: go to the essence. Note in Ripley's out put the line:
/usr/local/gcc9/bin/gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c binnest.f -o binnest.o
Well, it looks like that on biowulf, I can "module load" gcc9.1:
module load gcc/9.1.0
and then run:
gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c binnest.f -o binnest.o
(caveat: I tried it on the event/src at 11PM and things ran. Tomorrow I'll copy/clone repeated to the biowulf cluster. then I will change to repeated/src. Then I will module load gcc9.1 and then run the gfortran lines as in Ripleys output and see if I can reproduce the LTO errors.) . Good night for now (GNFN).
Good morning!
AND OH MY GOD IT WORKED:
gausscop.f:157: warning: type of ‘rs’ does not match original declaration [-Wlto-type-mismatch]
157 | call rs(nld,nobs,v,tmp1,0,0,tmp2,tmp3,ierr)
|
eigen.f:3313: note: ‘rs’ was previously declared here
3313 | SUBROUTINE RS(NM,N,A,W,MATZ,Z,FV1,FV2,IERR)
|
eigen.f:3313: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
hidden.f:378: warning: type of ‘dqrdc2’ does not match original declaration [-Wlto-type-mismatch]
378 | call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work)
|
chidden.f:461: note: ‘dqrdc2’ was previously declared here
461 | call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work)
|
chidden.f:461: note: code may be misoptimized unless ‘-fno-strict-aliasing’ is used
Houston, Texans, we have successfully reproduced the error.
On biowulf. with gcc9.
Okay.
So let's try to eliminate the first [-Wlto-type-mismatch]
which is happening in call rs()
.
Looking closely at the corresponding lines/files in gausscop.f and eigen.f, it appears that Z
(the sixth argument in eigen.f::RS() is declared as real BUT in the gausscop.f::rs() the sixth argument is 0
. My plan is to change it to 0.0
and then rerun step 11 and step 17 from above.
Argh! That didn't resolve the issue. I even tried repeating step 9 and then step 11 and step 17 and ended up with same result.
Next plan:
Okay that just gave the same exact error. Let's make an all.f. Easy from the prompt:
cat binnest.f chidden.f cphidden.f eigen.f gausscop.f hidden.f logitord.f > all.f
And now do a simple compile on them.
https://stackoverflow.com/questions/2297536/how-do-i-capture-all-of-my-compilers-output-to-a-file
gfortran -fPIC -Wall -g -O2 -c all.f -o all.o &> all_output.txt
then open up in emacs
emacs -nw all_output.txt
and ctrl+s for RS( will yield nothing. But ctrl+s for rs( yields:
6790 | call rs(nld,nobs,v,tmp1,0,0.0,tmp2,tmp3,ierr)
| 1
Warning: Type mismatch in argument ‘z’ at (1); passed REAL(4) to REAL(8) [-Wargument-mismatch]
So as it turns out my intuition was correct about Z needing to be 0.0. But this just made is a single precision real number. To make a double precision I need to type:
0.0D0
(source: http://www.math.hawaii.edu/~hile/fortran/fort3.htm#real
A double precision real number in GNU Fortran can be represented by up to 17 digits before truncation occurs. Double precision numbers are written in scientific notation but with D usurping the role of E. Some various ways of writing the number 12.345 as a double precision real number are 1.2345D1 , .12345D2 , .012345D3 , 12.345D0 , 12345D-3 .)
And then repeat step 11 and step 17 (with -flto). Step 17 shows I've elimined the [-Wlto-type-mismatch warning] for gausscop!
Yes!
Now, inside all_output.txt do a ctrl+S for dqrdc2(
And it is not listed. Yikes. Is it possible they are each subroutines defined (not called!) in hidden.f and chidden.f... no, no.
It is weird because it is called twice but not declared anywhere. When I run
grep -rnw . -e 'dqrdc2'
I only get the two calls. I tried uppercase as well.
Hey, didn't you remove a file a long time ago...no, that's an r file: cmcre.R.
Is it possible it is defined somewhere else, an artifact from when you originally inherited maintainer position and the Lindsey suite of packages were interconnected (?)
No!
I did search across all of GITHUB:
https://github.com/search?utf8=%E2%9C%93&q=%22SUBROUTINE+DQRDC2%28%22&type=Code
And came up with that dqrdc2 is from LINPACK. So let's look at one of them and see
https://github.com/DJJ88/R-3.3.1/blob/fb6aad195256b4ca672d15cc9401cb60a1f3a62d/src/appl/dqrdc2.f
All the types seem to match up but note that work is listed as
c work double precision(p,2). c work is a work array. c
Whereas in hidden.f says
double precision work(2*m)
maybe change it to work(2*m, 2)
So I edited hidden.f and chidden.f adding ,2
for work and work2 in the declarations and repeated the following steps:
6. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c chidden.f -o chidden.o
12. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c hidden.f -o hidden.o
17. gcc -shared -g -O2 -Wall -pedantic -mtune=native -flto -fpic -L/usr/local/gcc9/lib64 -L/usr/local/lib64 -o repeated.so binnest.o chidden.o cphidden.o eigen.o gausscop.o hidden.o logitord.o -lgfortran -lm -lquadmath
Step 6 lead to errors. Need to put ,2 thoughout the file...? Answer: Yes. That got rid of errors.
But step 17 still showed the same -Wlto-mismatch. Darn.
What if...
What if I take LINPACK version and put into all.f and then recompiled...?
BOOMSHAKALAKAKAKAKAKAKAKAKAKALALALLAKAKALAKAKLAKLA BALKY!
YOU DID IT, SON! ICE UP, SON. ICE UP.
all.f:2961:72:
2961 | call dqrdc2(gmod,m,m,m,1d-07,rank,qraux,pivot,work)
| 1
Warning: Type mismatch in argument ‘k’ at (1); passed REAL(8) to INTEGER(4) [-Wargument-mismatch]
Okay so that's the 6th argument where k=rank. It is saying that rank is being seen as a real(8) and being passed as an integer(4). okay, okay. rank is integer in hidden.f but was double precision / real(8) in chidden.f
Mystery solved. Change that to integer. Repeat:
6. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c chidden.f -o chidden.o
12. gfortran -fno-optimize-sibling-calls -fpic -g -O2 -mtune=native -Wall -pedantic -flto -c hidden.f -o hidden.o
17. gcc -shared -g -O2 -Wall -pedantic -mtune=native -flto -fpic -L/usr/local/gcc9/lib64 -L/usr/local/lib64 -o repeated.so binnest.o chidden.o cphidden.o eigen.o gausscop.o hidden.o logitord.o -lgfortran -lm -lquadmath
AND SUCCESS! No more LTOs!
but wait! was the ,2 even necessary???? Okay, undo those and then redo the immediately 3 above steps.
ANSWER: The ,2 weren't necessary. So leave them undo (original state) and do a git diff and git status to be displayed in next comment below --
SUMMARY:
BAM. Done. After these two edits, do the 17 steps to double check -Wlto-mismatches.
And we did it. Push to github from biowulf and then send on to CRAN from mac.
Good job, Team.
Another LTO -- the HIDDEN LTO. So CRAN submission went well but another LTO popped up and they notified me of it:
chidden.f:462:12: warning: type of ‘dqrcf’ does not match original
declaration [-Wlto-type-mismatch]
462 | call dqrcf(gmod,m,rank,qraux,work3,m,invec,info,1)
| ^
/data/gannet/ripley/R/svn/R-devel/src/appl/dqrutl.f:29:7: note: type
mismatch in parameter 9
29 | subroutine dqrcf(x, n, k, qraux, y, ny, b, info)
| ^
/data/gannet/ripley/R/svn/R-devel/src/appl/dqrutl.f:29:7: note: ‘dqrcf’
was previously declared here
What's interesting to me is that 1) This dqrcf
LTO didn't show up in the original check that got repeated
flag before JULY 1st; and 2) I can't reproduce it on my machine. So I feel like I'm flying blind, but will do my best to patch it up.
What's very interesting is that the call has 9 parameters (the 8th one being info
and the 9th one being 1
) while the declared subroutine has only 8 parameters (the 8th one being info
). Yikes. What to do? Was info suppose to be set to 1? Should I delete the 9th parameter in the call and make the 8th and final parameter info= 1
?
To answer this question, after looking at some dqrutl.f files on the internet, it seems I need to look at dqrsl.f to see how integer is being used:
info integer.
c info is zero unless the computation of b has
c been requested and r is exactly singular. in
c this case, info is the index of the first zero
c diagonal element of r and b is left unaltered.
Ok. Good to know. I need more info. Let's look at chidden.f call in repeated/src
(here). My take is that info should stay info. I think the extra 1 in the 9th parameter was probably just ignored in the past. So Let's do that.
In summary:
dqrcf(gmod,m,rank,qraux,work3,m,invec,info,1)
-> dqrcf(gmod,m,rank,qraux,work3,m,invec,info)
So the warning is a type mismatch in parameter 9, which means that the passed 1
in the call is not matching up with how info
is declared in the subroutine.
So I found a dqrutl.f (as of 2019-07-12) here: https://github.com/microsoft/microsoft-r-open/blob/master/source/src/appl/dqrutl.f
and below in case it changes:
c dqr Utilities: Interface to the different "switches" of dqrsl().
c
subroutine dqrqty(x, n, k, qraux, y, ny, qty)
integer n, k, ny
double precision x(n,k), qraux(k), y(n,ny), qty(n,ny)
integer info, j
double precision dummy(1)
do 10 j = 1,ny
call dqrsl(x, n, n, k, qraux, y(1,j), dummy, qty(1,j),
& dummy, dummy, dummy, 1000, info)
10 continue
return
end
c
subroutine dqrqy(x, n, k, qraux, y, ny, qy)
integer n, k, ny
double precision x(n,k), qraux(k), y(n,ny), qy(n,ny)
integer info, j
double precision dummy(1)
do 10 j = 1,ny
call dqrsl(x, n, n, k, qraux, y(1,j), qy(1,j),
& dummy, dummy, dummy, dummy, 10000, info)
10 continue
return
end
c
subroutine dqrcf(x, n, k, qraux, y, ny, b, info)
integer n, k, ny, info
double precision x(n,k), qraux(k), y(n,ny), b(k,ny)
integer j
double precision dummy(1)
do 10 j = 1,ny
call dqrsl(x, n, n, k, qraux, y(1,j), dummy,
& y(1,j), b(1,j), dummy, dummy, 100, info)
10 continue
return
end
c
subroutine dqrrsd(x, n, k, qraux, y, ny, rsd)
integer n, k, ny
double precision x(n,k), qraux(k), y(n,ny), rsd(n,ny)
integer info, j
double precision dummy(1)
do 10 j = 1,ny
call dqrsl(x, n, n, k, qraux, y(1,j), dummy,
& y(1,j), dummy, rsd(1,j), dummy, 10, info)
10 continue
return
end
c
subroutine dqrxb(x, n, k, qraux, y, ny, xb)
integer n, k, ny
double precision x(n,k), qraux(k), y(n,ny), xb(n,ny)
integer info, j
double precision dummy(1)
do 10 j = 1,ny
call dqrsl(x, n, n, k, qraux, y(1,j), dummy,
& y(1,j), dummy, dummy, xb(1,j), 1, info)
10 continue
return
end
Prof Ripley emailed me 2019-06-01 saying fix the LTO warnings. He linked to a readme.txt:
So I ctrl+f
[-Wlto-type-mismatch]
and find the following 2 and only 2 instances: