nf-core / pathogensurveillance

Surveillance of pathogens using population genomics and sequencing
https://nf-co.re/pathogensurveillance
MIT License

Error when some assembly metadata tsv files are empty and don't contain accessions to download #94

Open masudermann opened 1 week ago

masudermann commented 1 week ago

Description of the bug

I am testing the complex_dataset_minimal profile. The run starts well, but the pipeline fails when some of the TSV files that list the accessions to download for a given family (in my case) are empty. These files are linked from path_surveil_data/assembly_metadata.

Is there a way to just proceed and ignore these empty files?
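
To narrow down which metadata files are affected, a small shell helper along these lines could list them. This is only a sketch: the function name `list_empty_tsvs` is hypothetical, and it assumes an "empty" file is either zero bytes or holds nothing but a single header line, which I have not confirmed for these files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every .tsv file in a directory that has no data rows.
# Assumption: a file counts as "empty" if it has at most one non-blank line
# (zero bytes, or a header line only).
list_empty_tsvs() {
    local dir="$1"
    local f rows
    for f in "$dir"/*.tsv; do
        # Skip broken symlinks and the literal pattern left by an unmatched glob.
        [ -e "$f" ] || continue
        # Count non-blank lines; grep -c prints 0 when nothing matches.
        rows=$(grep -c . "$f")
        if [ "$rows" -le 1 ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

It could be called as `list_empty_tsvs path_surveil_data/assembly_metadata`, adjusting the path to wherever that directory sits on your system, or pointed at the process work dir where the family .tsv files are staged.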

Command used and terminal output

(nf-core) marthasudermann@pop-os:~/pathogensurveillance$ nextflow main.nf -profile complex_dataset_minimal,docker
N E X T F L O W  ~  version 23.10.1
Launching `main.nf` [peaceful_carson] DSL2 - revision: cc83aa0c27

------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/plantpathsurveil v1.0dev
------------------------------------------------------
Core Nextflow options
  runName                   : peaceful_carson
  containerEngine           : docker
  launchDir                 : /home/marthasudermann/pathogensurveillance
  workDir                   : /home/marthasudermann/pathogensurveillance/work
  projectDir                : /home/marthasudermann/pathogensurveillance
  userName                  : marthasudermann
  profile                   : complex_dataset_minimal,docker
  configFiles               : /home/marthasudermann/pathogensurveillance/nextflow.config

Input/output options
  sample_data               : test/data/metadata/complex_dataset_minimal.csv
  out_dir                   : test/output/complex_dataset_minimal
  download_bakta_db         : true
  cache_type                : lenient

Institutional config options
  config_profile_name       : Test dataset for Minimal complex dataset
  config_profile_description: Test dataset for Minimal complex dataset

Generic options
  trace_dir                 : null/pipeline_info

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/plantpathsurveil for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
#####
ERROR ~ Error executing process > 'PATHOGENSURVEILLANCE:PREPARE_INPUT:PICK_ASSEMBLIES (SRR12888960)'

Caused by:
  Process `PATHOGENSURVEILLANCE:PREPARE_INPUT:PICK_ASSEMBLIES (SRR12888960)` terminated with an error exit status (1)

Command executed:

  pick_assemblies.R SRR12888960_families.txt SRR12888960_genera.txt SRR12888960_species.txt 30 20 10 SRR12888960.tsv Aleyrodidae.tsv Amborellaceae.tsv Aphididae.tsv Castoridae.tsv Chrysomelidae.tsv Cordycipitaceae.tsv Cricetidae.tsv Cucurbitaceae.tsv Dasypodidae.tsv Dasyuridae.tsv Fabaceae.tsv Fagaceae.tsv Formicidae.tsv Halomonadaceae.tsv Lepisosteidae.tsv Liviidae.tsv Macroscelididae.tsv Malvaceae.tsv Micrococcaceae.tsv Moraceae.tsv Nectriaceae.tsv Nitidulidae.tsv Otariidae.tsv Penaeidae.tsv Pentatomidae.tsv Phocidae.tsv Theaceae.tsv Theridiidae.tsv Tupaiidae.tsv Xanthomonadaceae.tsv

  cat <<-END_VERSIONS > versions.yml
  "PATHOGENSURVEILLANCE:PREPARE_INPUT:PICK_ASSEMBLIES":
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Error in `$<-.data.frame`(`*tmp*`, "family", value = "Dasypodidae") : 
    replacement has 1 row, data has 0
  Calls: lapply -> FUN -> $<- -> $<-.data.frame
  Execution halted

Work dir:
  /home/marthasudermann/pathogensurveillance/work/4c/daae218f1fb1874d272e6c13634773

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Desktop: System76 computer running Linux