EBI-Metagenomics / emg-viral-pipeline

VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
Apache License 2.0

Update PPR-Meta install #69

Closed s-meaden closed 2 years ago

s-meaden commented 2 years ago

Hi,

I'm trying to install and run the pipeline using Singularity on an HPC. The install fails during the PPR-Meta installation step with the following error:

Error executing process > 'download_pprmeta:pprmetaGet'

Caused by:
  Process `download_pprmeta:pprmetaGet` terminated with an error exit status (128)

Command executed:

  git clone https://github.com/Stormrider935/PPR-Meta.git
  mv PPR-Meta/* .  
  rm -r PPR-Meta

Command exit status:
  128

Command output:
  Cloning into 'PPR-Meta'...

Command error:
  fatal: unable to look up current user in the passwd file: no such user
  Unexpected end of command stream

A quick look at GitHub shows that the current git repository for PPR-Meta is https://github.com/mult1fractal/PPR-Meta, not Stormrider935.

Are there any possible workarounds you would suggest? Thanks a lot for the pipeline!
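git prints `fatal: unable to look up current user in the passwd file` when the numeric UID it runs under has no passwd entry, which can happen inside a container. A minimal check, assuming `getent` is available; the function name `uid_in_passwd` is made up for illustration, and the check is most informative when run inside the same container the pipeline uses:

```shell
#!/usr/bin/env bash
# Returns 0 if the given numeric UID has an entry in the passwd database.
uid_in_passwd() {
    getent passwd "$1" > /dev/null
}

uid=$(id -u)
if uid_in_passwd "$uid"; then
    echo "UID $uid resolves to a user name"
else
    echo "UID $uid has no passwd entry; git may fail to look up the current user"
fi
```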

hoelzer commented 2 years ago

Hey @s-meaden thanks for your interest in the pipeline!

And thanks for reporting this, we are mainly using the pipeline with Docker so might have missed this issue. I will have a look asap but might need some time due to travel.

And btw, the original repo of PPR-Meta is https://github.com/zhenchengfang/PPR-Meta but if I recall correctly we forked that to add some changes.

s-meaden commented 2 years ago

Hi @hoelzer thanks for checking! No rush- I'll try and get Docker up and running on our HPC in the meantime.

hoelzer commented 2 years ago

Hey @s-meaden

I changed the git URL to initially pull PPR-Meta. However, it should have also worked before... Anyway, can you please give it another try using

nextflow pull EBI-Metagenomics/emg-viral-pipeline
nextflow run EBI-Metagenomics/emg-viral-pipeline -r pprmeta-repo-fix <your parameters>

Thanks! If it does not work, please report the command you used here as well. Can you pull code from GitHub in general? That is, does

git clone https://github.com/mult1fractal/PPR-Meta.git

work?
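If a full clone is not wanted, `git ls-remote` can answer the same question without downloading anything. This is only a sketch; the helper name `repo_reachable` is hypothetical:

```shell
#!/usr/bin/env bash
# Returns 0 if the remote repository can be queried, without cloning it.
# GIT_TERMINAL_PROMPT=0 stops git from waiting for credentials on a batch node.
repo_reachable() {
    GIT_TERMINAL_PROMPT=0 git ls-remote "$1" > /dev/null 2>&1
}

# Example call (needs network access):
#   repo_reachable https://github.com/mult1fractal/PPR-Meta.git && echo "reachable"
```

A non-zero return here points at a network, proxy, or firewall problem rather than at the pipeline itself.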

hoelzer commented 2 years ago

Oh, I'm sorry. I think I forgot to push the actual changes to that branch -.- Now it's possible to test like described above @s-meaden

s-meaden commented 2 years ago

Thanks @hoelzer, I don't get that error any more.

I'm still having issues installing docker on our server and the singularity engine gives the following error:

Error executing process > 'preprocess:rename (1)'

Caused by:
  Failed to pull singularity image
  command: singularity pull  --name mhoelzer-python3_virify-0.1.img docker://mhoelzer/python3_virify:0.1 > /dev/null
  status : 255

Command I'm using: nextflow run EBI-Metagenomics/emg-viral-pipeline -r v0.2.0 --fasta "ERZ2271866_FASTA.fasta.gz" --cores 2 -profile local,singularity

I guess asking singularity to pull a docker image doesn't work- perhaps @mberacochea has a workaround?

hoelzer commented 2 years ago

Hi @s-meaden ok great! I think then we can also merge #75 @mberacochea

Yeah docker on a server will usually not work (for good reasons) so the way to go is singularity.

And luckily docker containers can be easily converted into singularity (and nextflow can do that).

Can you pull the image manually? To check if your firewall and singularity install works:

singularity pull  --name mhoelzer-python3_virify-0.1.img docker://mhoelzer/python3_virify:0.1
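The `--name` value in that command appears to follow from the Docker URI: drop the `docker://` scheme and replace `/` and `:` with `-`. The sketch below assumes that rule, which is inferred only from the single example in this thread; `image_file_name` is a hypothetical helper. It prints the pull command rather than running it:

```shell
#!/usr/bin/env bash
# Derives the local image file name from a docker:// URI by dropping the
# scheme and replacing '/' and ':' with '-'.
# ASSUMPTION: the naming rule is inferred from one example in this thread.
image_file_name() {
    local ref=${1#docker://}
    printf '%s.img\n' "$(printf '%s' "$ref" | sed 's|[/:]|-|g')"
}

uri="docker://mhoelzer/python3_virify:0.1"
name=$(image_file_name "$uri")
echo "singularity pull --name $name $uri"
```

If the resulting file is placed in the directory Nextflow uses as its Singularity cache (for example the one named by `NXF_SINGULARITY_CACHEDIR`), the pipeline may be able to reuse it instead of pulling again.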

And you could also switch to the latest release v0.4.0 btw. Then the containers are not at my private registry anymore but maintained by the EBI folk : )

Last tip: run the pipeline the first time with

--max_cores 1 --cores 1

because on a cluster or on systems with many cores there are sometimes issues pulling and converting singularity images in parallel. Once you have all the images locally, you can use more cores.

Are you actually on a high-performance cluster or a workstation? local should be fine on a workstation that does not have a job scheduler, but on an HPC with e.g. SLURM or LSF you should also switch to that profile.
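One way to tell which case applies is to look for the scheduler's submit command on the PATH. A rough sketch; `detect_profile` is a hypothetical helper, and the profile name `lsf` is an assumption (only `local` and `slurm` appear in this thread):

```shell
#!/usr/bin/env bash
# Prints a profile name based on which job-submission command is on PATH.
# ASSUMPTION: "lsf" as a profile name is a guess; check the pipeline's config.
detect_profile() {
    if command -v sbatch > /dev/null 2>&1; then
        echo "slurm"
    elif command -v bsub > /dev/null 2>&1; then
        echo "lsf"
    else
        echo "local"
    fi
}

echo "-profile $(detect_profile),singularity"
```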

s-meaden commented 2 years ago

Thanks for all the tips @hoelzer!

I can pull the image now but get an error running it. I'm pretty sure this is a local issue as I get the same error pulling and running alpine.sif as a test.

ERROR : Failed to create user namespace: user namespace disabled

I'll park this until I've figured out the issue with our HPC and singularity, but appreciate the help.
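For anyone hitting the same message: on Linux, unprivileged user namespaces are governed by kernel settings exposed under `/proc/sys`. The sketch below reads two of them; `userns_status` is a hypothetical helper, and the second file is a Debian/Ubuntu-specific switch that is absent on many systems. Changing either value normally requires the cluster administrators.

```shell
#!/usr/bin/env bash
# Prints "enabled", "disabled", or "unknown" for a sysctl-style file
# that holds a single integer (0 means the feature is switched off).
userns_status() {
    local file=$1 value
    if [ ! -r "$file" ]; then
        echo "unknown"
        return
    fi
    value=$(cat "$file")
    if [ "$value" -gt 0 ] 2>/dev/null; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

echo "max_user_namespaces: $(userns_status /proc/sys/user/max_user_namespaces)"
# Debian/Ubuntu-specific switch; "unknown" here just means the file is absent.
echo "unprivileged_userns_clone: $(userns_status /proc/sys/kernel/unprivileged_userns_clone)"
```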

As a side note, I got the following error with v0.4.0 but not with v0.2.0:

No such variable: enable

 -- Check script '/nobackup/beegfs/home/ISAD/sm758/.nextflow/assets/EBI-Metagenomics/emg-viral-pipeline/virify.nf' at line: 2 or see '.nextflow.log' file for more details

hoelzer commented 2 years ago

Yeah that sounds like a general issue with your hpc and singularity configuration.

Regarding the error with the newer version: weird, here we just enable the new DSL v2 language for nextflow. My only idea: your nextflow version is too old.
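A quick way to test the "too old" theory is to compare the installed version against a minimum. In this sketch the threshold `20.07.1` is an assumption about when `nextflow.enable.dsl` became available, not something stated in this thread, and `version_ge` is a hypothetical helper that relies on `sort -V`:

```shell
#!/usr/bin/env bash
# Returns 0 if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# ASSUMPTION: 20.07.1 as the minimum release; adjust if the docs say otherwise.
required="20.07.1"
# `nextflow -version` prints a banner; take the first x.y.z token from it.
installed=$(nextflow -version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)

if [ -z "$installed" ]; then
    echo "nextflow not found on PATH"
elif version_ge "$installed" "$required"; then
    echo "nextflow $installed is new enough"
else
    echo "nextflow $installed is older than $required; try 'nextflow self-update'"
fi
```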

mberacochea commented 2 years ago

Hey, we have merged https://github.com/EBI-Metagenomics/emg-viral-pipeline/pull/75

Can you try the "master" branch to see if that solves the PPR-Meta issue?

s-meaden commented 2 years ago

Hey @mberacochea

I tried the following but got a few errors (although not linked to PPR-Meta). I'm not sure I can really test anything due to the local issues I have with Singularity. But in case this is any help:

nextflow pull EBI-Metagenomics/emg-viral-pipeline

nextflow run EBI-Metagenomics/emg-viral-pipeline -r v0.4.0 --help 

Throws an error 'No such variable: enable'

nextflow run EBI-Metagenomics/emg-viral-pipeline -r v0.2.0 --fasta "TESTFILE.fasta.gz" --cores 1 -profile slurm,singularity

gets stuck at the following step '[c1/a60c01] process > preprocess:rename [ 0%] 0 of 1' I left it 5 hours but it was hanging at this step.

hoelzer commented 2 years ago

Hey @s-meaden !

What's your nextflow -version ? The enable variable might not be available in older nextflow versions. Also can you please try to run from the master branch?

nextflow run EBI-Metagenomics/emg-viral-pipeline -r master --help 

Can you check the contents of the work directory with ls -lah work/c1/a60c01*/ to see whether the process that hung wrote any files? If there is a .command.log or .command.err, you could check that as well.
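The same inspection can be scripted so that it reports which of the usual Nextflow task files exist and shows the tail of each. A small sketch; `show_task_logs` is a hypothetical helper, and the task hash is the one from this thread:

```shell
#!/usr/bin/env bash
# Prints the size and last lines of each Nextflow task file found in the
# given work directory, and notes which ones are missing.
show_task_logs() {
    local dir=$1 name
    for name in .command.sh .command.log .command.err .command.out .exitcode; do
        if [ -f "$dir/$name" ]; then
            echo "== $name ($(wc -c < "$dir/$name") bytes)"
            tail -n 20 "$dir/$name"
        else
            echo "== $name (missing)"
        fi
    done
}

# The glob expands the short hash to the full task directory name.
for task_dir in work/c1/a60c01*/; do
    if [ -d "$task_dir" ]; then
        show_task_logs "$task_dir"
    fi
done
```

A missing `.exitcode` alongside empty logs would suggest the task never got as far as running its script.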

s-meaden commented 2 years ago

Hi @hoelzer !

Updating my nextflow solves that issue and the following command runs fine:

nextflow run EBI-Metagenomics/emg-viral-pipeline -r master --help

When I run the pipeline on a test file I get the same error as before ("root filesystem extraction failed: extract command failed: ERROR : Failed to create user namespace: user namespace disabled").

nextflow run EBI-Metagenomics/emg-viral-pipeline -r master --fasta "ERZ2271866_FASTA.fasta.gz" --cores 1 -profile slurm,singularity

As above, this is probably a local issue with our hpc. I'm waiting for access to an alternative where I can try and run it all in a Virtual Machine. Thanks!

hoelzer commented 2 years ago

Yes, that sounds, unfortunately, like a configuration issue on your machine. I hope it runs then in the VM!

(btw, I think the pipeline cannot work with a .fasta.gz file; better to use an uncompressed .fasta as input)
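If the input does need to be uncompressed, something like the following would produce a plain `.fasta` while leaving the compressed original in place. `decompress_fasta` is a hypothetical helper, and the file name is the one used earlier in this thread:

```shell
#!/usr/bin/env bash
# Writes an uncompressed copy next to the .gz file and prints its path.
# The compressed original is left untouched.
decompress_fasta() {
    local gz=$1
    local out=${gz%.gz}
    gzip -dc "$gz" > "$out" && echo "$out"
}

# File name taken from the thread; adjust to the local input.
input="ERZ2271866_FASTA.fasta.gz"
if [ -f "$input" ]; then
    fasta=$(decompress_fasta "$input")
    echo "pass --fasta \"$fasta\" to the pipeline"
fi
```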

I will close this issue bc/ the PPR-Meta install problem seems to be solved.