marbl / metAMOS

A metagenomic and isolate assembly and analysis pipeline built with AMOS
http://marbl.github.io/metAMOS

packaging and installation in a multi user environment #100

Open caseydunham opened 10 years ago

caseydunham commented 10 years ago

I have been working on getting metAMOS 1.3 installed on a Linux server on our supercomputer.

I am running into many problems getting this up and running due to the current way that metAMOS is looking for dependencies.

For instance, I want to install a single instance of metAMOS into /opt/metAMOS and place the binaries on our users' PATH.

However, when running initPipeline and runPipeline this way I run into problems with missing python dependencies.

For instance:

[user@metamos ~]$ runPipeline [options]
ImportError: No module named check_install

I am willing to help refactor and build packages for facilitating multi user installations such as this from a system operations perspective, but I am not too familiar with the use cases of the tool.

Ideally, this could be built into a deb or rpm package.

skoren commented 10 years ago

metAMOS should work in a multi-user environment with the metAMOS directory added to the path. Both runPipeline/initPipeline look for dependencies relative to their location and the check_install.py package must be in the same directory as runPipeline. My guess is this file is missing from the /opt/metAMOS directory or lacks permissions. metAMOS will install other missing python dependencies relative to its location as well. If a user is missing a package the installing account had, metAMOS would not install the dependency and the user's run would fail.

I would recommend running python INSTALL.py within the final directory you would like to run metAMOS from (e.g., /opt/metAMOS), and making sure the installer's environment and available python/perl packages are the same as the users'.

There are some tools metAMOS relies on that may require write access to the Utilities/DB directory so you may need to adjust permissions for that directory.
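A minimal sketch of how an admin might check those two points (files sitting next to runPipeline, and a writable Utilities/DB) before pointing users at a shared install. The function name check_shared_install and the default file list are assumptions for illustration; adjust the names to whatever your checkout actually contains. Run it as an ordinary user, not as root, since root passes every permission check.

```python
import os


def check_shared_install(install_dir,
                         required_files=("runPipeline", "initPipeline", "check_install.py"),
                         writable_subdir=os.path.join("Utilities", "DB")):
    """Return a list of human-readable problems found under install_dir.

    required_files and writable_subdir are assumptions taken from this
    thread, not an official list shipped with metAMOS.
    """
    problems = []

    # The scripts look for helpers relative to their own location,
    # so each file must exist in install_dir and be readable.
    for name in required_files:
        path = os.path.join(install_dir, name)
        if not os.path.isfile(path):
            problems.append("missing: " + path)
        elif not os.access(path, os.R_OK):
            problems.append("not readable: " + path)

    # Some tools may need to write into Utilities/DB.
    db_dir = os.path.join(install_dir, writable_subdir)
    if not os.path.isdir(db_dir):
        problems.append("missing directory: " + db_dir)
    elif not os.access(db_dir, os.W_OK | os.X_OK):
        problems.append("not writable: " + db_dir)

    return problems
```

For example, printing each entry of check_shared_install("/opt/metAMOS") from a normal user account lists anything that account cannot see or write; an empty list means none of these particular checks failed.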

caseydunham commented 10 years ago

Maybe this is a 1.3 issue with the frozen binary, but when I retrieved that (from http://www.cbcb.umd.edu/confcour/temp/metAMOS_binary.tar.gz), it did not have the INSTALL.py script, the check_install scripts, or any of the dependency packages. I may have wrongly assumed that the binary package was built in a way that made these unnecessary.

I can definitely troubleshoot this more later this week. It may be more of a documentation issue than anything else.

I am working on a fully updated CentOS 6.4 installation.

skoren commented 10 years ago

I did not realize you were using the frozen binary. That file should be self-contained and extracts required dependencies into a temporary working directory before running. It should still work with read-only permissions for the user. Is it possible that the system is out of space or the user does not have permissions to create temporary folders?

caseydunham commented 10 years ago

Sorry, I thought I mentioned the original frozen binary issue.

Where does the frozen binary attempt to create the temporary directories?

The machine currently has 100GB of disk space. None of the users have sudo access to this box (and won't). If I need to create a new group for this that's fine. I can work that out.

If instead of using the frozen binary, I just checked out the relevant source, what would the best way be to correctly install that in this kind of environment? I was running into some of the same issues with that and putting it in for instance /opt/metAMOS.

skoren commented 10 years ago

The temporary directory used is /tmp, and the directories created there have the prefix _MEI. They should be cleaned up after a run. If they are not being cleaned up, it is possible that /tmp has filled up, so you can check for leftover _MEI directories and remove them.
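A small sketch of that check, assuming the only thing known about the leftovers is the _MEI name prefix. The helper name find_mei_dirs is hypothetical. It only lists candidates and deletes nothing, since a directory belonging to a run that is still in progress must be left alone.

```python
import os
import tempfile


def find_mei_dirs(tmp_root=None):
    """Return sorted paths of directories named _MEI* directly under tmp_root.

    If tmp_root is None, use the directory Python's tempfile module
    would pick for the current user (usually /tmp on Linux).
    """
    if tmp_root is None:
        tmp_root = tempfile.gettempdir()
    found = []
    for name in os.listdir(tmp_root):
        path = os.path.join(tmp_root, name)
        # Only directories count; plain files with the same prefix are ignored.
        if name.startswith("_MEI") and os.path.isdir(path):
            found.append(path)
    return sorted(found)
```

Calling find_mei_dirs("/tmp") while no metAMOS job is running should return an empty list if cleanup is working; anything it does return is a candidate for manual removal.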

I've updated the code so that check_install is no longer imported in the common run cases, so the latest code does not have that dependency. If you are installing in /opt/metAMOS, I would suggest checking out the code from git directly into /opt/metAMOS and running python INSTALL.py from there. It is best to run the installation on the system you plan to run metAMOS on, and with the same environment. metAMOS checks your path for perl/python packages and programs, so if a user runs with an environment that is missing packages which were available when INSTALL.py was run, metAMOS will fail. Do you have details on what error you had when you installed it in /opt/metAMOS?
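One way to compare the installer's environment with a user's is to run the same small check from both accounts and diff the results. This is only a sketch for Python 3: the function name missing_dependencies is hypothetical, and the program and module names you pass in are yours to choose, since the thread does not list which ones metAMOS actually looks for.

```python
import importlib.util
import shutil


def missing_dependencies(programs, modules):
    """Return (missing_programs, missing_modules) for the current environment.

    programs: executable names expected on PATH.
    modules:  top-level Python module names expected to be importable.
    Both lists are supplied by the caller; nothing here is metAMOS-specific.
    """
    # shutil.which returns None when the executable is not found on PATH.
    missing_programs = [p for p in programs if shutil.which(p) is None]
    # find_spec returns None when a top-level module cannot be located.
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    return missing_programs, missing_modules
```

If the installing account gets two empty lists and a user account does not, the difference points at what that user's PATH or Python environment is missing.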

treangen commented 10 years ago

hi Casey,

sorry for jumping in late here to help, my responses below (which mostly echo Serge's recent email):

Where does the frozen binary attempt to create the temporary directories?

this should be in most cases /tmp, but more specifically (from http://docs.python.org/2/library/tempfile.html):

If tempdir is unset or None at any call to any of the above functions, Python searches a standard list of directories and sets tempdir to the first one which the calling user can create files in. The list is:

1. The directory named by the TMPDIR environment variable.
2. The directory named by the TEMP environment variable.
3. The directory named by the TMP environment variable.
4. A platform-specific location:
   - On RiscOS, the directory named by the Wimp$ScrapDir environment variable.
   - On Windows, the directories C:\TEMP, C:\TMP, \TEMP, and \TMP, in that order.
   - On all other platforms, the directories /tmp, /var/tmp, and /usr/tmp, in that order.
5. As a last resort, the current working directory.

The extracted folders all have a prefix of _MEI, but should be cleaned up upon completion of the run.
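To see which directory those rules resolve to for a given account, something like the following should work. It is a sketch that assumes the frozen binary follows the tempfile search order quoted above; the helper name resolve_tempdir is hypothetical. Note that it changes TMPDIR for the current process when an override is given.

```python
import os
import tempfile


def resolve_tempdir(tmpdir_override=None):
    """Return the directory Python's tempfile module selects.

    If tmpdir_override is given, TMPDIR is set to it first (a side
    effect on this process's environment) to show how the search
    order reacts.
    """
    if tmpdir_override is not None:
        os.environ["TMPDIR"] = tmpdir_override
    # tempfile caches its choice; clearing it forces the search to rerun.
    tempfile.tempdir = None
    return tempfile.gettempdir()
```

Running resolve_tempdir() as an affected user shows where temporary files would land. If /tmp is full or not writable for that user, pointing TMPDIR at a writable directory with enough space is the first item in the search order and so takes precedence.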

The machine currently has 100GB of disk space.

for a multi-user environment that is definitely going to be tight, especially since MetAMOS saves all intermediary output in addition to the final output in Postprocess/out. I'll add to our list the need for a conservative disk usage mode, which would judiciously clean up used space while running and keep only the files contained in Postprocess/out. Currently we prefer to keep all intermediary output so that users have full access to everything generated by the run.
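Until such a mode exists, an admin can at least measure how much of a finished run is intermediary output. A rough sketch, assuming only what is stated here, namely that final results live in Postprocess/out inside the run directory; the function names are hypothetical.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def intermediate_bytes(run_dir, final_subdir=os.path.join("Postprocess", "out")):
    """Bytes used by everything in run_dir except the final output directory."""
    final_dir = os.path.join(run_dir, final_subdir)
    final = dir_size_bytes(final_dir) if os.path.isdir(final_dir) else 0
    return dir_size_bytes(run_dir) - final
```

Comparing intermediate_bytes(run_dir) across a few typical runs gives a per-run estimate to weigh against the 100GB available.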

None of the users have sudo access to this box (and won't). If I need to create a new group for this that's fine. I can work that out.

that should be fine, just ensure the initPipeline & runPipeline binaries are in their path.

If instead of using the frozen binary, I just checked out the relevant source, what would the best way be to correctly install that in this kind of environment?

so long as you are running python INSTALL.py from the desired install directory it should be good to go, but please see the README for more details on dependencies etc.

I was running into some of the same issues with that and putting it in for instance /opt/metAMOS.

hmm, could you please be a bit more specific? what kind of issues/errors did you encounter? did you run install in a temp dir then move to /opt/metAMOS ?

best,

Todd


Todd J. Treangen, Ph.D. Sr. Bioinformatics Scientist Battelle National Biodefense Institute (BNBI/NBACC) 110 Thomas Johnson Dr Frederick, MD 21702