RVanDamme / MUFFIN

hybrid assembly and differential binning workflow for metagenomics, transcriptomics and pathway analysis
https://rvandamme.github.io/MUFFIN_Documentation/#introduction
GNU General Public License v3.0

sourmash_checkm_parser TypeError: 'newline' is an invalid keyword argument for this function #22

Closed ivelsko closed 2 years ago

ivelsko commented 2 years ago

Hi, I'd like to try out MUFFIN on my data, but when I tried to run the test it gave me an error.

I ran:

nextflow run RVanDamme/MUFFIN --output results_dir  --cpus 8 --memory 32g -profile local,conda,test
N E X T F L O W  ~  version 20.10.0
Pulling RVanDamme/MUFFIN ...
downloaded from https://github.com/RVanDamme/MUFFIN.git
Launching `RVanDamme/MUFFIN` [gigantic_agnesi] - revision: 3695f30cc3 [master]
executor >  local (28)
[70/a665de] process > test                       [100%] 1 of 1 ✔
[05/c52247] process > sourmash_download_db       [100%] 1 of 1 ✔
[38/5ae572] process > checkm_download_db         [100%] 1 of 1 ✔
[76/c61ba1] process > checkm_setup_db            [100%] 1 of 1 ✔
[e3/362d8c] process > discard_short (1)          [100%] 1 of 1 ✔
[87/03d454] process > merge (1)                  [100%] 1 of 1 ✔
[87/6a780f] process > fastp                      [100%] 1 of 1 ✔
[d5/7871f7] process > spades (1)                 [100%] 1 of 1 ✔
[69/57f860] process > minimap2 (1)               [100%] 1 of 1 ✔
[d4/e19ca6] process > bwa (1)                    [100%] 1 of 1 ✔
[2c/024b0b] process > metabat2 (1)               [100%] 1 of 1 ✔
[03/886e9c] process > maxbin2 (1)                [100%] 1 of 1 ✔
[6f/ee911a] process > concoct (1)                [100%] 1 of 1 ✔
[4d/296dfb] process > refine3 (1)                [100%] 1 of 1 ✔
[1f/1cbbcf] process > checkm (1)                 [100%] 1 of 1 ✔
[ea/a128c6] process > sourmash_bins (2)          [100%] 5 of 5 ✔
[ad/0973bf] process > sourmash_checkm_parser (1) [100%] 1 of 1, failed: 1 ✘
[70/9640e3] process > eggnog_download_db         [100%] 1 of 1 ✔
[f4/99484c] process > eggnog_bin (1)             [100%] 2 of 2
[-        ] process > parser_bin                 -
[e3/0d07ee] process > readme_output              [100%] 1 of 1 ✔

And then I got this error:

Error executing process > 'sourmash_checkm_parser (1)'

Caused by:
  Process `sourmash_checkm_parser (1)` terminated with an error exit status (1)

Command executed:

  grep -v "] INFO: " summary.txt | grep -v "\-\-\-\-\-\-\-" | grep -v "Bin Id" | sed -e 's/^[ \t]*//'|sed 's/[ \t]*$//' |sed -r 's/ +/,/g'|sed '/^$/d' >checkm.csv
  for file in $(ls bin*.txt); do tail -n 1 $file | sed -e 's/.fa//' >>sourmash.csv; done
  checkm_sourmash_parser.py -c checkm.csv -s sourmash.csv

Command exit status:
  1

Command output:
  (empty)

Command error:
  Traceback (most recent call last):
    File "/home/irina_marie_velsko/.nextflow/assets/RVanDamme/MUFFIN/bin/checkm_sourmash_parser.py", line 93, in <module>
      main()
    File "/home/irina_marie_velsko/.nextflow/assets/RVanDamme/MUFFIN/bin/checkm_sourmash_parser.py", line 85, in main
      parse(args)
    File "/home/irina_marie_velsko/.nextflow/assets/RVanDamme/MUFFIN/bin/checkm_sourmash_parser.py", line 54, in parse
      out_writing(dict_checkm_sourmash)
    File "/home/irina_marie_velsko/.nextflow/assets/RVanDamme/MUFFIN/bin/checkm_sourmash_parser.py", line 31, in out_writing
      with open("classify_step_summary.csv", mode='w', newline='') as out_file:
  TypeError: 'newline' is an invalid keyword argument for this function

Work dir:
  /mnt/archgen/microbiome_calculus/cmc_deep_seq/04-analysis/assembly_muffin/work/ad/0973bffca7dc36879db236f46cad4f

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

Everything else the run should have produced seems to be there, but classify_step_summary.csv is missing. I tried again with -resume in case there was a hiccup the first time, but it gave me the same error. Can you tell me what's going on, and whether there's a way to fix it?

replikation commented 2 years ago

Hi,

Sorry for the late reply, but I can't reproduce the error:

replikation@replikation-CELSIUS-R970:~/Desktop/tmp/muffin_tests$ nextflow run ~/gits/MUFFIN/main.nf --output results_dir  --cpus 24 --memory 50g -profile local,docker,test -resume
N E X T F L O W  ~  version 21.04.0
Launching `/home/replikation/gits/MUFFIN/main.nf` [gigantic_wilson] - revision: 9aa0a392da
executor >  local (17)
[c0/2e1e11] process > test                       [100%] 1 of 1, cached: 1 ✔
[skipped  ] process > sourmash_download_db       [100%] 1 of 1, stored: 1 ✔
[8e/5306d7] process > discard_short (1)          [100%] 1 of 1, cached: 1 ✔
[74/3fba6b] process > merge (1)                  [100%] 1 of 1, cached: 1 ✔
[e3/3b88e2] process > fastp                      [100%] 1 of 1, cached: 1 ✔
[77/e597d1] process > spades (1)                 [100%] 1 of 1, cached: 1 ✔
[65/e016f8] process > minimap2 (1)               [100%] 1 of 1, cached: 1 ✔
[6f/f4f28c] process > bwa (1)                    [100%] 1 of 1, cached: 1 ✔
[37/b447f1] process > metabat2 (1)               [100%] 1 of 1 ✔
[b0/d53933] process > maxbin2 (1)                [100%] 1 of 1 ✔
[64/4741e3] process > concoct (1)                [100%] 1 of 1 ✔
[74/441cb3] process > refine3 (1)                [100%] 1 of 1 ✔
[3f/ec2739] process > checkm (1)                 [100%] 1 of 1 ✔
[0a/638850] process > sourmash_bins (2)          [100%] 5 of 5 ✔
[b7/b63789] process > sourmash_checkm_parser (1) [100%] 1 of 1 ✔
[30/a6ca4c] process > eggnog_download_db         [100%] 1 of 1, cached: 1 ✔
[97/aedc8d] process > eggnog_bin (1)             [100%] 5 of 5 ✔
[14/ede7d7] process > parser_bin (1)             [100%] 1 of 1 ✔
[c6/51be40] process > readme_output              [100%] 1 of 1, cached: 1 ✔

Done! Results are stored here --> results_dir 
 The Readme file in results_dir describe the structure of the results directories. 

Completed at: 16-Feb-2022 08:40:06
Duration    : 17h 26m 18s
CPU hours   : 121.8 (3% cached)
Succeeded   : 17
Cached      : 9
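
A note on the traceback above: `TypeError: 'newline' is an invalid keyword argument for this function` is what the built-in `open()` raises under Python 2, where the `newline` keyword does not exist (it was added in Python 3). Since the failing run used `-profile local,conda,test` and the successful one used `-profile local,docker,test`, one plausible explanation is that `checkm_sourmash_parser.py` was picked up by a Python 2 interpreter in the conda environment. That is an inference from the logs, not something confirmed in this thread. The sketch below shows one way to open a CSV file for writing that works on both interpreters; `open_csv_for_writing` and `write_summary` are hypothetical names used for illustration and are not taken from the MUFFIN source.

```python
import csv
import sys


def open_csv_for_writing(path):
    """Open `path` for use with csv.writer on Python 2 or Python 3.

    Hypothetical helper for illustration; not part of MUFFIN.
    """
    if sys.version_info[0] < 3:
        # Python 2: built-in open() has no `newline` keyword, and the
        # csv module expects a file opened in binary mode.
        return open(path, mode="wb")
    # Python 3: text mode with newline="" lets the csv module control
    # line endings itself.
    return open(path, mode="w", newline="")


def write_summary(path, rows):
    """Write an iterable of rows (lists of values) to `path` as CSV."""
    with open_csv_for_writing(path) as out_file:
        writer = csv.writer(out_file)
        writer.writerows(rows)
```

To check which interpreter a given environment resolves, running `python -c "import sys; print(sys.version)"` inside that environment prints the version in use.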