Closed asreimer closed 6 years ago
@asreimer - Thanks for the detailed description of the issues you've found. Do you think you could provide a list of RST installation dependencies for the version of Fedora you're using (similar to the readme or .travis.yml
files) to help us with the debugging? That way we could also add them to the readme to help future users on Fedora systems.
Just to be clear on your second bug, in the rst/lib
directory you're seeing files like libaacgm.x.x.a
and libaacgm.x.x.so
? Or were you simply using x.x
as generic placeholders? The screenshot below is a partial view of the lib directory on an Ubuntu installation of RST:
Are the version.h
files storing the correct MAJOR_VERSION
and MINOR_VERSION
values for each library/binary? Or do those also say X
and X
?
Of course now that I look at my own screenshot, something screwy happened with the libfitacf...
files.
@egthomas is your screenshot from after running with a fresh clone of the git repo? rst/lib
and rst/include
are not cleaned up by make.code
before building. Additionally, existing version.h
are not cleaned up either. Can you try deleting/renaming a version.h
for one of the libs and see if you still get the correct libname.x.x.so
that you expect?
Here's a screenshot of what I get:
As you can see, I was not using x.x as placeholders. I literally get nothing but x.x for version numbers... This is because the version.h
files are all storing X
and X
for the major and minor numbers. The version.h
files are not being generated correctly.
I see you have both astalg1 and astalg2 in the list of files you posted. I'm curious to see if you delete rst/lib
and then run make.code
if they both show up or not. I suspect they will not. This will confirm whether or not this issue is Fedora specific or not. I might beat you to is by spinning up an Ubuntu VM...
I don't have a fresh install of Fedora running right now so I don't have a full list of dependencies. I was working on my machine where I also have RSTLite installed, so most of the dependencies were already present. All I have is:
1) ncurses-devel
2) netcdf-devel
3) and of course the latest copy of the cdf library installed in /usr/local/cdf
I can work on getting a full list, but that's very low priority for me right now with the SuperDARN workshop coming up.
Ah, I wasn't clear earlier: I'm using the master branch.
edit: develop yields the same results on Fedora.
I did some more reading this morning. Case sensitivity of header files is pretty much a Linux wide phenomenon due to the fact that most Linux OSes use case sensitive file systems... So this should happen on Ubuntu too.
But it doesn't. I just tried installing rst on my Ubuntu 14.04 machine and it correctly figures out the version numbers for libraries. It even correctly determines that astalg.2.0
is the most recent astalg and only tries to compile that one.
So there's something different about Fedora 26 and Ubuntu 14.04 (@egthomas was it 16.04 you tried?). I'm pretty sure that something different is central to the makeall.c
code.
Maybe this is related to a change in gcc
?
This works with gcc 4.8.4-2ubuntu1~14.04.3
, but not with gcc 7.2.1 20170915 (Red Hat 7.2.1-2)
.
What gcc
are you using @egthomas
@asreimer I'm using Ubuntu 16.04 and gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
.
On a fresh install of the master
branch, I changed the name of the rprmidl.1.6
library to rprmidl.y.z
and found the following (expected) library files in rst/lib
: librprmidl.y.z.a
, librprmidl.y.z.so
, librprmidl.1.a
, and librprmidl.1.so
(the latter two of which are just symbolic links to the first two files).
Can you try deleting lines 65-82 of get.version? That portion of the code shouldn't be used anymore, but at the moment that's the only place I'm finding the x.x
version numbers being set as a default.
Ok, this is hilarious:
So it is failing because the get.version
from my RSTLite installation is being used. I guess I didn't clean up all of the environment variables that I thought I did...
You're also sure you don't have any relics from RSTLite in your environment variables / settings / paths, right?
edit: just saw your comment!
It's interesting that we both converged on the exact same realization at the almost the same time.
Indeed! So does that clear up your compilation issues? Or are some problems still remaining?
About the installation dependencies, that's nothing urgent, just something that would be useful to include in the readme at some point.
So this is interesting. Despite removing all references to RSTLite
in my .bashrc
and .bash_profile
(contrary to opinion, I'm not a complete idiot, I did this long before trying to install rst
) and yet RSTLite
is still on my PATH
...
It is safe to say that this is not a bug with rst
nor does it have anything to do with Fedora vs. Ubuntu.
Thanks @egthomas for responding. Your comments helped me figure out what was going on.
When I get some time, I'll get the Fedora dependencies sorted and make a pull request for the README.md file.
edit: I'm using the master branch. edit2: Same problem on both master and develop branches. edit3: See comment: https://github.com/SuperDARN/rst/issues/155#issuecomment-388658524 and onward.
Summary
I have found 2 bugs that affect compiling rst on systems that care about case sensitivity in header file names: 1) A very old bug in
makeall.c
that prevents the code from finding the latest version of a library for some directory structures. This resulted in the header for astalg.1.5 to be copied intorst/include
and for astalg.1.5 to be compiled instead of astalg.2.0... Since Fedora 26 is case sensitive on header files names and astalg.1.5 header isAstAlg.h
compared to 2.0 which hasastalg.h
, a compilation error was thrown. 2) A bug introduced by @sshepherd when they tried to remove.rst
directories for passing version numbers to the build system. Something else, beyond the changes toget.version
, seem to need to be done because every library inrst/lib
has a x.x version number...These bugs are sort of intertwined, so I've reported them together. I can make separate issues for each one, but at this point, I don't know the exact cause of both of them and need some help debugging.
You likely won't be able to reproduce the first bug on your own system if you don't start from a fresh clone of rst. You might not even be able to reproduce this outside of Fedora. You should be able to reproduce bug 2 though.
Symptoms
On Fedora 26, the
make.code
installation step seems to fail due to case sensitivity for header file names in therst/codebase/analysis/src.lib/igrf_v2/igrf.1.0/src/igrflib.c
file. I say seems here for reasons that will be explained.Here's the error message I received when the compilation fails:
Case sensitivity is dependent on system configuration (see: https://stackoverflow.com/questions/1951951/when-including-header-files-is-the-path-case-sensitive/37228442#37228442).
If you just stopped here, you would think that the issue is just that we need to rename the
rst/include/analysis/AstAlg.h
file toastalg.h
or something like that. But of course the wholerst/include
directories are generated bymake.code
(specifically, when it callsmake.hdr
), so this begs the question, "why didmake.code
not find the right astalg header?"We must go deeper
There's actually no meaningful difference between the astalg.1.5 and astalg.2.0 header files except for some spaces... Here's the diff of the two:
So this might explain why no one else has encountered the installation bug until now. The
AstAlg.h
is from version 1.5 of the astalg library andastalg.h
is from version 2.0, but there is no meaningful different between the files so a case-insensitive compile won't throw an error.Potential Cause
I think the problem here is a very old bug in the
makeall.c
code. I found that by creating an astalg_v2 directory and putting the astalg.2.0 directory into it, themake.code
(which callsmakeall
) was able to find it and compile it just fine. Digging in to this, I think it's a long-standing bug inmakeall.c
. Specifically something to do with howmakeall.c
detects libraries and determines the latest version to compile. For example, the first few lines ofmakeall -vb -t hdr ~/rst/codebase/analysis/src.lib
:You can see that it incorrectly finds that astalg.1.5 is the more recent version of astalg, when 2.0 > 1.5...
Next you can try running
makeall -a -vb -t hdr ~/rst/codebase/analysis/src.lib
and you'll see something like:which is funny. It's detecting astalg.1.5 and astalg.2.0 but returning 1.5 twice... Thanks
makeall
. Great job.Proposed Solutions
From what I can see, there are 3 ways to solve the problem with
AstAlg.h
vsastalg.h
A) Figure out what the underlying issue in the build system is where it cannot properly find both astalg.1.5 and astalg.2.0... I think this amounts to fixingmakeall
.B) Leave the bug in
makeall
and instead just remove astalg1.5 entirely. I did a simple grep searching forAstAlg.h
and didn't find any code expect astalg1.5 that requires it. This doesn't mean there isn't software that links to the libastalg1 library though. Is there? Can we just use astalg.2.0?C) Leave the bug in
makeall
and instead do the same thing for astalg as was done for mlt, aacgm, igrf, etc, specifically: 1) make a rst/codebase/analysis/src.lib/astalg_v2 2) move rst/codebase/analysis/src.lib/astalg/astalg.2.0 into rst/codebase/analysis/src.lib/astalg_v2Just to get rst working for me for now, I've implemented C) locally. I would vote for option A) as a proper fix though.
Another bug I found along the way...
After implementing solution C), as I said above, I encountered another problem.
@sshepherd made a change to the
rst/build/scripts/get.version
file in this commit. The intent of this commit was to remove the need for the.rst/version
files that the make scripts all use to determine the version of the library that is being make. However, the changes made don't go far enough because the make file only makes a library with version 'x.x', unless I add the.rst/version
file with contentsversion.2.0
in it into therst/codebase/analysis/src.lib/astalg_v2/
directory.If I don't add the
.rst/version
file, then I get the following compile error: