NBISweden / pipelines-nextflow

A set of workflows written in Nextflow for Genome Annotation.
GNU General Public License v3.0

Installing and running pipelines questions #84

Closed Brent-Saylor-Canopy closed 1 year ago

Brent-Saylor-Canopy commented 1 year ago

Hi, I'm trying to set up your pipelines. Specifically, I'm interested in the Abinitio Training and Functional Annotation workflows.

I have had no trouble setting up Nextflow and cloning the git repository.

I installed nextflow with conda as suggested in your guide.

I'm new to Singularity and Nextflow, so I may be doing something wrong, but my understanding is that Singularity, Docker, and mamba/conda are used in your pipeline to manage environments. In my past experience with workflow managers like Snakemake, the workflow manager downloaded and installed the required packages when they were needed. Is this the case for your pipelines?

I tried running the Abinitio Training workflow with both singularity and conda, but I had to install software into the environment myself to get the pipeline to start, and it then stopped again when it needed BLAST. Do I need to set up these environments, and the databases for InterProScan and BLAST in the Functional Annotation workflow, myself, or am I doing something wrong that prevents Nextflow and the environment managers from handling this for me?

Thanks for your help,

I get the following error with the singularity profile (Nextflow installed in a fresh conda environment) using the launch command

nextflow run ../../pipelines-nextflow/ -profile singularity -params-file params.yml | tee log_singularity.txt

and the following parameters file:

subworkflow: 'abinitio_training'
genome: '/data/Maker_annotation/Inputs/Genome_A_soft_masked.fasta'
maker_evidence_gff: '/data/Maker_annotation/Genome_rnd1.maker.output/91K_rnd1.all.maker.gff'
maker_species_publishdir: '/data/miniconda3/envs/maker_env/config/species/'
species_label: 'Genome_NBIS'
codon_table: 1
outdir: '/data/Maker_annotation/NBISweden_Ab_initio_test/results'
Error executing process > 'ABINITIO_TRAINING:SPLIT_MAKER_EVIDENCE (91K_rnd1.all.maker)'

Caused by:
  Process `ABINITIO_TRAINING:SPLIT_MAKER_EVIDENCE (91K_rnd1.all.maker)` terminated with an error exit status (1)

Command executed:

  agat_sp_separate_by_record_type.pl \
      -g 91K_rnd1.all.maker.gff \
      -o maker_results_noAbinitio_clean
  if test -f maker_results_noAbinitio_clean/mrna.gff && test -f maker_results_noAbinitio_clean/transcript.gff; then
      agat_sp_merge_annotations.pl \
          --gff maker_results_noAbinitio_clean/mrna.gff \
          --gff maker_results_noAbinitio_clean/transcript.gff \
          --out merged_transcripts.gff
      mv merged_transcripts.gff maker_results_noAbinitio_clean/mrna.gff
  elif test -f maker_results_noAbinitio_clean/transcript.gff; then
      cp maker_results_noAbinitio_clean/transcript.gff maker_results_noAbinitio_clean/mrna.gff
  fi

  cat <<-END_VERSIONS > versions.yml
  "ABINITIO_TRAINING:SPLIT_MAKER_EVIDENCE":
      agat: 0.9.2
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  /bin/bash: .command.sh: No such file or directory

Work dir:
  /data/Maker_annotation/NBISweden_Ab_initio_test/work/c0/8e63a41a6026cdba481471321884e5

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
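
For example, a minimal sketch of replicating this task from the work dir reported above:

cd /data/Maker_annotation/NBISweden_Ab_initio_test/work/c0/8e63a41a6026cdba481471321884e5
bash .command.run    # re-runs the task wrapper: input staging, container launch, then .command.sh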
mahesh-panchal commented 1 year ago

In my past experience with workflow managers like Snakemake, the workflow manager downloaded and installed the required packages when they were needed. Is this the case for your pipelines?

Yes, Nextflow should automatically handle the software packaging depending on the profile: Docker images with the docker profile, Singularity images with the singularity profile, conda environments with the conda profile, and so on.
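
As a rough illustration (a minimal sketch, not copied from this repository; the process name AGAT_EXAMPLE and the container tag are made up), each module can declare both a conda package and a container image, and the profile you launch with decides which one Nextflow actually uses:

process AGAT_EXAMPLE {
    // picked up by -profile conda (or mamba)
    conda "bioconda::agat=0.9.2"
    // picked up by -profile docker or singularity (image tag is illustrative)
    container "quay.io/biocontainers/agat:0.9.2--pl5321hdfd78af_0"

    input:
    path gff

    output:
    path "records/*"

    script:
    """
    agat_sp_separate_by_record_type.pl -g $gff -o records
    """
}

So with -profile singularity the image is pulled automatically and the task runs inside it; nothing needs to be installed into your own environment.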

I tried running the Abinitio Training workflow with both singularity and conda, but I had to install software into the environment myself to get the pipeline to start, and it then stopped again when it needed BLAST. Do I need to set up these environments, and the databases for InterProScan and BLAST in the Functional Annotation workflow, myself, or am I doing something wrong that prevents Nextflow and the environment managers from handling this for me?

Something is definitely odd here. You shouldn't need to do any software setup, except perhaps for InterProScan.

The error here:

/bin/bash: .command.sh: No such file or directory

suggests that the filesystem is not being mounted correctly inside the container. Take a look at the singularity configuration options (https://www.nextflow.io/docs/latest/config.html#scope-singularity), and try adding the following to a custom config to see if it fixes things.

singularity.autoMounts = true
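
For reference, a minimal sketch of what that custom config could look like (the file name nextflow_custom.config is just an example):

// nextflow_custom.config (name is just an example)
singularity {
    autoMounts = true   // ask Nextflow to bind the host paths each task needs into the container
}

and then pass it to the run with -c:

nextflow run ../../pipelines-nextflow/ -profile singularity -c nextflow_custom.config -params-file params.yml | tee log_singularity.txt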

I'm not sure if this will fix the software packaging issue. What version of singularity are you using?

Brent-Saylor-Canopy commented 1 year ago

You were right; it was a problem with my singularity installation. I reinstalled it, added the line you suggested, and it now seems to be working. I haven't tried out the Functional Annotation pipeline yet, though. What setup is necessary for InterProScan if I'm only using the non-commercial software?

mahesh-panchal commented 1 year ago

What setup is necessary for InterProScan if I'm only using the non-commercial software?

To be honest, I don't really know. We had an alternative installation that we were using before I swapped the code over to the Bioconda module. Apparently no one here has tested it on real data since that change, so I have no feedback on what else needs to be done. When you go ahead with it, it would be great to hear what errors you encounter, or whether it just works. If you do hit errors, please open a new issue and I'll try to help as best I can.