statisticalbiotechnology / quandenser

QUANtification by Distillation for ENhanced Signals with Error Regulation
Apache License 2.0
Running on HPC as local user #9

Open andrewjmc opened 4 years ago

andrewjmc commented 4 years ago

You'll get tired of my issues soon! I need to run on HPC (CentOS) because of the large numbers of files.

However, I cannot install the rpm without root access. I tried some trickery to extract it from the rpm but it is clearly hardwired to look for things in the /usr directory.

I have instead tried to compile, following your advice for a local installation. I hit snags in the compilation of dinosaur which seem to be due to dinosaur and other se.lth.immun dependencies referencing an out-of-date Scala repository (scala-tools.org, see https://stackoverflow.com/questions/41754260/issues-with-scala-tools-org-and-goodstuff-im). Although I can edit this in Dinosaur's pom.xml directly, it is less feasible to change it in the dependencies which are pulled from a repository.

I have lodged the issue with Dinosaur directly (https://github.com/fickludd/dinosaur/issues/12) but I find it peculiar as this should have been an issue for your own compilation.

Part of me therefore wonders whether I'm doing something wrong in the local compilation.

Is it possible to: (1) plug a ready-compiled dinosaur jar (https://github.com/fickludd/dinosaur/releases) into the quandenser source and compile without compiling dinosaur? or (2) make releases suitable for extraction into a local directory without installation through yum on CentOS? or (3) release through anaconda?

Thanks again!

Andrew

MatthewThe commented 4 years ago

No worries, the more feedback we get the better!

(1) that should be possible, just make sure to set the CMake variable JAR_PATH to the correct directory. (2) I know there is an option in CMake to make an RPM package relocatable, but I think we did not manage to get it working for another software package we are maintaining. I can give this a try though and share the RPM with you for you to try. (3) I have not tried this before, but it is something worth considering. Would a docker image also be an option on your HPC environment?
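
For option (1), a configure step might look roughly like the sketch below. Only the JAR_PATH variable comes from this thread; CMAKE_INSTALL_PREFIX is the standard CMake install location variable, and all three directories are hypothetical placeholders. The function only prints the command so it can be reviewed before running it; whether the resulting build is fully usable as a local install is untested here.

```shell
# Print the CMake configure command for a user-local build.
# $1 = source tree, $2 = directory holding the prebuilt Dinosaur jar,
# $3 = install prefix that does not require root access.
configure_command() {
  printf 'cmake %s -DJAR_PATH=%s -DCMAKE_INSTALL_PREFIX=%s\n' "$1" "$2" "$3"
}

# Hypothetical locations; adjust to your own layout.
configure_command "$HOME/src/quandenser" "$HOME/opt/dinosaur" "$HOME/opt/quandenser"
```

Paths containing spaces would need quoting before the printed command is pasted into a shell.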

percolator commented 4 years ago

Hi Andrew,

Have you tried to run quandenser through a container? There is one implementation of quandenser for singularity (which seems to be okay with most HPC clusters) available here: https://github.com/statisticalbiotechnology/quandenser-pipeline

Thanks --Lukas

andrewjmc commented 4 years ago

Thanks @MatthewThe and @percolator - I generally have to use Singularity rather than Docker. I will try first @percolator's suggestion of the singularity container. Failing that I will try option (1). I'd be very happy to test a relocatable RPM.

Many other packages can be downloaded as simple tar.gz which run wherever they are unpacked. Would this be possible for quandenser?

Best wishes,

Andrew

andrewjmc commented 4 years ago

Well... hit an immediate issue with quandenser-pipeline due to /tmp directory (https://github.com/statisticalbiotechnology/quandenser-pipeline/issues/25)

I'm trying number 1 above, but don't really know how to go about it, and the more I look in the commands, it appears the build script is only going to result in an RPM, which I can't install!

I'll dig through the quandenser-pipeline code and see if I can find out where to change the /tmp directory. In the meantime, if you can produce a relocatable RPM, or a runnable tarball, I'll gladly try!

Thanks,

Andrew

percolator commented 4 years ago

That looks strange. You seem not to be able to write to your /tmp directory. What happens if you issue a "mkdir /tmp/hello" command? Do you get a Permission denied error? Is this on the CentOS cluster? --Lukas

percolator commented 4 years ago

It seems like you can configure singularity to use a different tmp directory by setting the environment variable SINGULARITY_TMPDIR to whatever location your HPC cluster wants you to use. https://singularity.lbl.gov/build-environment#temporary-folders
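
A minimal sketch of that idea, assuming the cluster exposes a writable scratch location through TMPDIR: the helper name first_writable_dir and the fallback order are illustrative choices, not something singularity prescribes. Only SINGULARITY_TMPDIR itself comes from the documentation linked above.

```shell
# Return the first candidate directory that exists and is writable.
first_writable_dir() {
  for dir in "$@"; do
    if [ -n "$dir" ] && [ -d "$dir" ] && [ -w "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Prefer the cluster-provided TMPDIR, then /tmp, then the home directory.
if scratch=$(first_writable_dir "${TMPDIR:-}" /tmp "$HOME"); then
  export SINGULARITY_TMPDIR="$scratch"
  echo "SINGULARITY_TMPDIR=$SINGULARITY_TMPDIR"
else
  echo "no writable scratch directory found" >&2
fi
```

The variable has to be exported in the same shell (or job script) that later invokes singularity.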

andrewjmc commented 4 years ago

Yes, this is on the CentOS cluster. I get a permission denied error as /tmp does not exist. The TMPDIR on the login nodes is elsewhere (/rds/general/ephemeral/user/username/ephemeral/).

I tried changing the environment variable but this made no difference, and I note that in https://github.com/statisticalbiotechnology/quandenser-pipeline/blob/master/build_image.sh there is the line:

sudo SINGULARITY_TMPDIR=/tmp singularity build SingulQuand.SIF Singularity

This suggests to me that the built image might be hardcoded to use /tmp. But I don't understand singularity well enough to know. Although, perhaps that just means the /tmp was used on the machine used to build the image.

MatthewThe commented 4 years ago

I tried creating a relocatable package: quandenser-v0-02-linux-amd64.zip

You should then be able to install it as:

rpm -i --prefix=<some_directory> quandenser-v0-02-linux-amd64.rpm

Let me know if it works, otherwise I can indeed try to just make a tar.gz archive.

andrewjmc commented 4 years ago

Unfortunately as I'm not root I cannot use yum or rpm to install. I can only use rpm2cpio and cpio to extract the contents. However, wherever I extract, quandenser still looks for dependencies like dinosaur in /usr/share. If I grep the binary, I find it hard-coded internally.
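
For reference, the extraction and the check for hard-coded paths can be sketched as follows. rpm2cpio, cpio and grep are standard tools; the function names are made up for illustration, and the `-a`/`-o` grep options assume GNU or BSD grep.

```shell
# Unpack an RPM into a directory without root access.
# $1 = absolute path to the .rpm file, $2 = destination directory.
extract_rpm() {
  mkdir -p "$2" && ( cd "$2" && rpm2cpio "$1" | cpio -idm )
}

# List the distinct /usr/... paths embedded in a (binary) file.
list_usr_paths() {
  LC_ALL=C grep -a -o '/usr/[A-Za-z0-9_./-]*' "$1" | sort -u
}

# Example usage (hypothetical paths):
#   extract_rpm "$PWD/quandenser-v0-02-linux-amd64.rpm" "$HOME/opt/quandenser"
#   list_usr_paths "$HOME/opt/quandenser/usr/bin/quandenser"
```

Any path that list_usr_paths reports was compiled into the binary, so moving the extracted files elsewhere will not change where the program looks.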

I'd be very grateful to try a .tar.gz -- and I think this might be accessible to many who use HPC.

percolator commented 4 years ago

What happens if you edit the file build_image and set the variable SINGULARITY_TMPDIR to your preferred location?

--Lukas

andrewjmc commented 4 years ago

The build image file is not part of the download, I think it is used by singularity hub to build the image which Quandenser_pipeline.sh downloads. And I can't build singularity containers on HPC, as you need root access for that.

TimothyOlsson commented 4 years ago

Hi Andrew,

I think I know what is causing the problem, and it should be fixable. This is also an answer to the issue on the quandenser-pipeline repo.

First of all, the problematic part that is causing the crashes comes from here, which executes this file, where I hard coded the /tmp folder. In short, the reason for this addition is to incorporate MSconvert into the pipeline and allow for multi-users Wine, which is not what it was initially designed for. More about the file can be found in this link, at the section "link_wine.sh".

Now, the hard-coded /tmp directory is not coupled to the SINGULARITY_TMPDIR variable, and it does not necessarily have to be that specific folder. However, it needs to be established during container initialization, i.e. some global environment variables need to be set to make everything work. We need to create the files for every container creation, since we don't really know what the user wants to execute, and the path will be unique each time because the containers are mounted on "loop devices" on UNIX systems. The reason for using the /tmp directory is that the files are only usable during execution and are useless once the process has finished. The files themselves are very small; after thousands of runs, the directory is only a couple of MB. You get the error because /tmp does not exist on your CentOS cluster, which apparently crashes the image during initialization.

What you can do to make it work is either of the following:

A) Clone the repo, comment out this line and build the image on your local computer. This should disable MSconvert, but the rest should work fine.

B) Change this line from:

do ln -sf -T $WINEPREFIX/$object /tmp/wineprefix64$USER/wineprefix64$1/$object;

to

do ln -sf -T $WINEPREFIX/$object $SINGULARITY_TMPDIR/wineprefix64$USER/wineprefix64$1/$object;

and change this line from:

WINEPREFIX="/tmp/wineprefix64$USER/wineprefix64_$random_name"

to

WINEPREFIX="$SINGULARITY_TMPDIR/wineprefix64$USER/wineprefix64_$random_name"

then rebuild the image on your local computer.
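
To illustrate option B outside the container, here is a rough sketch of linking a Wine prefix into a scratch directory whose base is a parameter rather than a hard-coded /tmp. This is not the pipeline's link_wine.sh: the function name, its arguments and the directory naming are illustrative assumptions.

```shell
# Link every entry of a source prefix into a per-user scratch prefix.
# $1 = source prefix, $2 = scratch base directory (falls back to /tmp),
# $3 = name for this run. Prints the directory that was populated.
link_prefix() {
  src=$1
  base=${2:-/tmp}
  target="$base/wineprefix64_$(id -un)/wineprefix64_$3"
  mkdir -p "$target" || return 1
  for path in "$src"/*; do
    [ -e "$path" ] || continue            # skip the literal pattern if src is empty
    ln -s -f -n "$path" "$target/$(basename "$path")"
  done
  printf '%s\n' "$target"
}

# Example usage (hypothetical paths):
#   WINEPREFIX=$(link_prefix /opt/wineprefix "$SINGULARITY_TMPDIR" run42)
```

The glob does not match dotfiles, so hidden entries in the source prefix would need separate handling.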

So it is pretty much only a matter of changing a few lines of code. I do not know whether changing the variable works, but in theory it should. I can experiment with binding this path to the SINGULARITY_TMPDIR variable and send the new container to you.

Lastly, I do not know what causes this error line, but it should not affect the image initialization; I have observed it before and it has not affected anything in particular:

mkdir: cannot create directory '/tmp/runtime-a3': Permission denied

About running the pipeline: the easiest way is to use the shell script in the repo. The script does a lot of things and saves a lot of manual work, so I recommend using it. Connect to the HPC cluster; if you are connecting via ssh from a UNIX machine (Mac, Linux, Debian etc.), you need to enable X forwarding (ssh -X ….). With the image in the same directory as the shell script, run it like this:

./Quandenser_pipeline.sh /rds

This should mount the /rds directory, including the files you have in the user directory. I have included some other options you can use if things crash, such as "--disable-nvidia" for when the cluster has an NVIDIA card but your local computer does not, which can sometimes induce a crash. OpenGL can also be an issue, which you can disable with "--disable-opengl".

I will get back to you with the /tmp directory fix when I have tested it some more, and can perhaps send you a prebuilt image with the fix.

EDIT1: I found the run time problem as well. The fixes seem easy enough and I am currently building the new image.

EDIT2: There seems to be some weird behavior with the environment variables: when you execute the script, it is imported as usual, but when the nextflow node does it, the environment variable is no longer used. Continuing testing.

TimothyOlsson commented 4 years ago

Hi again Andrew,

I took quite some time to make sure everything worked as it should, in all types of cases, but I have now released a new version of the pipeline, which is currently building at Singularity Hub. When you run the shell file (./Quandenser_pipeline.sh), it should automatically download the latest version + the new shell scripts. What you can do in this release is to set a custom environmental variable WINE_TMPDIR, which should be imported into the pipeline.

Good luck with the computing, and if you have any more questions or feedback, don't hesitate to contact us!

andrewjmc commented 4 years ago

Excellent, thanks for your determined work on this.

I've tried again but now have a very bizarre error (with every singularity call) which is probably one for the HPC team (messaging now):

FATAL: container creation failed: unable to parse singularity.conf file: open /etc/singularity/singularity.conf: no such file or directory

This is very strange because the config file definitely does exist! I assume the error is from singularity itself, not from "inside" the container. If the latter, is there an issue with not mapping the /etc/singularity directory?

Singularity version is 3.4.2-1.1.el7

TimothyOlsson commented 4 years ago

Yes, that seems to be a fault with singularity at the HPC center. If you run the container to print the version number, i.e. "singularity run SingulQuand.SIF", it should display the version number of the image. If it crashes with the same error message, I would be almost certain that something is wrong at the HPC cluster, perhaps an issue with permissions or something similar.

I forgot to mention it, but since you run on an HPC cluster, you need to enable the SLURM profile or else the pipeline will run on the login node. It's easy to do: go to the "edit workflow" tab in the GUI once you get it to open and change the profile option from local to SLURM. Then add your project name in the small window, which should say "-A \". The default values for the process cores and times should be good enough if you have a small-ish data set (10-20 files), but might need to be tweaked if you run quandenser in parallel (a feature which I added to the pipeline) or if you run larger data sets.

andrewjmc commented 4 years ago

Thanks. Our HPC uses PBS, so I assume the SLURM commands will not work.

I will probably need to follow your instructions for running without the pipeline: https://github.com/statisticalbiotechnology/quandenser-pipeline#running-the-pipeline-without-the-gui

I guess I can then set up commands to be run within PBS scripts.

TimothyOlsson commented 4 years ago

Oh, that should be pretty easy to fix if I'm not mistaken. The pipeline uses nextflow as its workflow manager, which supports a lot of different workload managers; you can read about them here. I only had access to SLURM for testing, not to other HPC workload managers, so SLURM is what I added to the GUI. I experimented a little with AWS, but not with the other executors. However, in the configuration file, it should be as simple as changing the string "slurm" to "pbs".
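
As a sketch of that change, assuming the nextflow configuration contains a line of the form executor = 'slurm' (the exact layout of the pipeline's config file is not shown in this thread, and the helper name set_executor is made up), the string could be swapped like this:

```shell
# Rewrite the executor setting of a nextflow config file in place.
# $1 = config file, $2 = executor name such as pbs or pbspro.
set_executor() {
  config=$1
  executor=$2
  tmp="$config.tmp.$$"
  # Replace whatever quoted name follows "executor =" with the requested one.
  sed "s/\(executor[[:space:]]*=[[:space:]]*\)['\"][A-Za-z]*['\"]/\1'$executor'/" \
    "$config" > "$tmp" && mv "$tmp" "$config"
}

# Example usage (hypothetical file name):
#   set_executor nf.config pbs
```

Scheduler-specific options, such as the project or account flag, would still need to be adapted to PBS by hand.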

I can add the other workload managers to the GUI as "experimental" features if it would help you, as well as the PBS option. It would be very easy to do. In that case, you could try running it on PBS and see if it works. That would be very helpful for us as well to check if we can run it on other workload managers.

andrewjmc commented 4 years ago

OK, great. Very happy to try!

TimothyOlsson commented 4 years ago

Fixed in the new release v0.0831 and added all the experimental workload managers, where PBS and PBSPRO are included. Also fixed some minor GUI bugs to make it easier to select what you want :)

andrewjmc commented 4 years ago

Great, thanks. As soon as HPC people can help me with singularity, I'll give it a whirl :-)

andrewjmc commented 4 years ago

Gradually working through the issues I've had running quandenser-pipeline (https://github.com/statisticalbiotechnology/quandenser-pipeline/issues/25).

In the meantime if you manage to make a .tar.gz archive I will gladly test.

Thanks again, and best wishes,

Andrew

MatthewThe commented 4 years ago

I looked into creating a tar.gz archive, but unfortunately it does not seem completely straightforward. We certainly should be able to do it, but I haven't found a method that would be user-friendly enough to release it.

Some notes (mainly for myself, though any input is welcome):

  • The main problem is that the Quandenser executable has to find the Dinosaur jar file, regardless of which directory the command is run from.
    • One could require that the user always runs it from the Quandenser executable's directory, but that is just asking for trouble.
    • Apparently, hard-coding relative paths for dependencies in the executable is frowned upon, though not impossible.
  • A simple alternative could be to add a command line parameter where the path to the Dinosaur jar and parameter files could be specified, but this is not very user-friendly either.

percolator commented 4 years ago

Would it make sense to use a hardcoded relative path, that one can override with a command line parameter, and/or an environment variable?

andrewjmc commented 4 years ago

If it can cope with updated versions of Dinosaur I'd say let the user obtain it and give it as a command line parameter.

If it needs to be a fixed version, another method I've seen (e.g. FragPipe GUI) is keep the jar as an asset inside the executable and extract it into $TMPDIR at runtime.

Any option, as long as clearly explained, should work fine.
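
The lookup order discussed above (command line parameter, then environment variable, then a path relative to the executable) could be sketched like this. It is written as shell for illustration only, not as Quandenser's actual code; the DINOSAUR_JAR variable, the function name and the relative location are hypothetical.

```shell
# Resolve the Dinosaur jar: explicit argument first, then the DINOSAUR_JAR
# environment variable, then a path relative to the executable's directory.
# $1 = value given on the command line (may be empty), $2 = executable's directory.
resolve_jar() {
  cli_value=$1
  exe_dir=$2
  if [ -n "$cli_value" ]; then
    candidate=$cli_value
  elif [ -n "${DINOSAUR_JAR:-}" ]; then
    candidate=$DINOSAUR_JAR
  else
    candidate="$exe_dir/../share/java/Dinosaur.jar"
  fi
  if [ ! -f "$candidate" ]; then
    echo "Dinosaur jar not found: $candidate" >&2
    return 1
  fi
  printf '%s\n' "$candidate"
}

# Example usage from a wrapper script:
#   jar=$(resolve_jar "${1:-}" "$(dirname "$0")") || exit 1
```

Failing early with the path that was tried makes it clear which of the three sources was used.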

Thanks!

Andrew