EBI-Metagenomics / EukCC

Tool to estimate genome quality of microbial eukaryotes
GNU General Public License v3.0

Support of compressed bin.fa #41

Open KateSakharova opened 1 year ago

KateSakharova commented 1 year ago

It would be very useful to support compressed bins in the bindir folder. If I specify the suffix .fa.gz, the run fails (I guess because it tries to open the file as if it were uncompressed):

  eukcc folder --improve_percent 10 --n_combine 1 --threads 16 --improve_ratio 5 --links metabat2.links.csv --min_links 100 --suffix .fa.gz --db eukcc2_db_ver_1.1 --out metabat2_merged_bins --prefix "metabat2_merged." maxbin

Error:

Command error:
  22-08-2023 13:35:49:  EukCC version 2.1.2
  22-08-2023 13:35:49:  Found 4 bins
  Traceback (most recent call last):
    File "/opt/conda/bin/eukcc", line 8, in <module>
      sys.exit(main())
    File "/opt/conda/lib/python3.8/site-packages/eukcc/__main__.py", line 497, in main
      eukcc_folder(args)
    File "/opt/conda/lib/python3.8/site-packages/eukcc/refine.py", line 65, in eukcc_folder
      refine(state)
    File "/opt/conda/lib/python3.8/site-packages/eukcc/refine.py", line 136, in refine
      state["contigs"] = merge_fasta(
    File "/opt/conda/lib/python3.8/site-packages/eukcc/fasta.py", line 48, in merge_fasta
      for seq in Fasta(fasta):
    File "/opt/conda/lib/python3.8/site-packages/eukcc/fasta.py", line 94, in Fasta
      raise ValueError("The provided fasta file is malformed: {}".format(path))
  ValueError: The provided fasta file is malformed: maxbin/MaxBin2.001.fa.gz
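
The traceback is consistent with the gzip bytes being parsed as plain text, although that is only a guess from the error message. As a rough illustration of what the request amounts to, here is a minimal sketch of a FASTA reader that accepts both plain and gzip-compressed files. The names `open_maybe_gzip` and `read_fasta` are hypothetical and are not part of EukCC; this is not how `eukcc/fasta.py` is implemented.

```python
import gzip


def open_maybe_gzip(path):
    """Open a text file, decompressing on the fly if it is gzip-compressed.

    Hypothetical helper, not EukCC code. Detection uses the two-byte gzip
    magic number rather than the file suffix, so a misnamed file still works.
    """
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fasta(path):
    """Yield (header, sequence) tuples from a plain or gzipped FASTA file."""
    header, chunks = None, []
    with open_maybe_gzip(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                if header is None:
                    raise ValueError(
                        "The provided fasta file is malformed: {}".format(path)
                    )
                chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)
```

With a reader along these lines, a call such as `list(read_fasta("maxbin/MaxBin2.001.fa.gz"))` would return the same records as for the uncompressed file. Until something like this is supported, decompressing the bins before running `eukcc folder` with `--suffix .fa` should avoid the error.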