alekseyzimin / masurca

GNU General Public License v3.0
242 stars 35 forks source link

Flye is not installed when DEST is set to a different prefix from `pwd` #175

Open eburgueno opened 4 years ago

eburgueno commented 4 years ago

The install.sh script allows you to externally define the DEST variable to specify a different installation root path for masurca than wherever the source files were uncompressed:

#!/bin/sh

set -xe

ROOT=`pwd -P`
[ -z "$DEST" ] && DEST="$ROOT"

However, Flye is never installed in $DEST, it is left compiled under $ROOT. The example below is on a CentOS 7 container:

# yum -y install bzip2 bzip2-devel gcc gcc-c++ make perl-Env perl-devel which wget zlib zlib-devel
# cd /tmp
# wget https://github.com/alekseyzimin/masurca/releases/download/v3.4.1/MaSuRCA-3.4.1.tar.gz
# tar -xzf MaSuRCA-3.4.1.tar.gz
# cd MaSuRCA-3.4.1
# export BOOST_ROOT=install
# export DEST=/opt/MaSuRCA-3.4.1
# ./install.sh

# ls /opt/MaSuRCA-3.4.1/
CA8  bin  include  lib  libexec  share
# find /opt/MaSuRCA-3.4.1 -name flye
#
# pwd
/tmp/MaSuRCA-3.4.1
# find . -name flye
./Flye/bin/flye
./Flye/flye

Perhaps a simple fix would be to replace this line with something like:

grep Flye install.sh || echo "(cd Flye && make && cp -a ../Flye $DEST);" >> install.sh

This however will also copy the source code to the final install destination, which is unnecessary. A better fix would be to add an install target on the Flye Makefile, and then doing a make install.

alekseyzimin commented 3 years ago

It is better to install MaSuRCA at the destination without using the DEST variable. MaSuRCA can be installed anywhere and it does not require root access to install. You can even install a new version of MaSuRCA for each new genome assembly, this will ensure ability to re-run the assembly with additional data.

eburgueno commented 3 years ago

So are you saying that you prefer for each installation to also include the source code and other build-time files?

alekseyzimin commented 3 years ago

Sure, why not. These files do not take much space. And this way you will know exactly which version was used for the assembly.

eburgueno commented 3 years ago

It's not a matter of disk space (though in the context of containers, the smaller the better). Think about anyone wanting to package up your software for distribution in any other format (eg: rpm, deb, anaconda, etc). It's not only untidy, it's forces the package maintainer to hack around your install script. Besides, why provide the ability to specify a prefix at all if it's not going to be used?

In the particular case of Flye, the situation is made worse because not only you're providing your own patched version of it, you're also silently leaving it out the install prefix. Something a package maintainer will almost definitely miss, until an end user reports that Flye was not found.

If versioning is a concern for reproducibility, adopting a versioning scheme such as SemVer is a much better option.