metagentools / GraphBin2

☯️🧬 Refined and Overlapped Binning of Metagenomic Contigs Using Assembly Graphs
https://graphbin2.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

ValueError: invalid literal for int() with base 10 #12

Closed tpriest0 closed 2 months ago

tpriest0 commented 1 year ago

Hey,

I have been trying to run GraphBin2 on a metaFlye assembly but can't get a successful run.

Steps taken:
1) Assembly with metaflye
2) Use the 'gfa2fasta.py' script to create edges fasta file
3) Map reads using minimap2
4) Determine coverage using CoverM
5) Bin contigs using metabat

From the above steps, I am trying to run GraphBin2 with the following command:

graphbin2 --assembler Flye --graph assembly_graph.gfa --contigs assembly_edges.fasta --binned metabat_contig_assignments.txt --abundance contig_depth.txt --output graphbin2_output/

However, it appears that it has a problem reading the coverage information, with the following error appearing:

2023-01-05 15:38:22,105 - INFO - GraphBin2 started
Traceback (most recent call last):
  File "/XXX/XXX/XXX/software/GraphBin2/src/graphbin2_Flye.py", line 115, in <module>
    coverages[contig_num] = int(strings[1])
ValueError: invalid literal for int() with base 10: '8.622175\n'

Here is a snippet of the input files:

metabat_contig_assignments.txt -

edge_491,bin_10
edge_1666,bin_10
edge_1973,bin_10
edge_2082,bin_10
edge_2827,bin_10
edge_2862,bin_10
edge_2895,bin_10
edge_3022,bin_10
edge_3110,bin_10
edge_1029,bin_11

contig_depth.txt -

edge_1 8.622175
edge_2 13.167155
edge_3 12.42515
edge_4 13.927776
edge_5 5.494654
edge_6 25.540865
edge_7 33.814053
edge_8 0
edge_9 16.802715
edge_10 4.9655805
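
For reference, the ValueError above can be reproduced outside GraphBin2. Python's int() rejects a string containing a decimal point, while float() accepts it (the trailing newline is not the problem, since both functions ignore surrounding whitespace). The following is a minimal sketch only: parse_abundance_line is a hypothetical helper, not part of GraphBin2, and it assumes each line holds a name and a value separated by whitespace.

```python
def parse_abundance_line(line):
    """Split one 'name value' line and return (name, coverage as float).

    Hypothetical helper for illustration; not GraphBin2's actual parser.
    Assumes the two fields are separated by whitespace (space or tab).
    """
    name, value = line.split()
    return name, float(value)


sample = "edge_1 8.622175\n"

# int() cannot parse a decimal string, which matches the reported error.
try:
    int(sample.split()[1])
except ValueError as err:
    print("int() fails:", err)

# float() handles both decimal values and plain integers such as "0".
print(parse_abundance_line(sample))
print(parse_abundance_line("edge_8 0\n"))
```

Parsing with float() keeps fractional mean depths such as those CoverM reports, whereas rounding them to integers before running the tool would discard information.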

Any idea what the solution might be?

Thanks for your time

Vini2 commented 1 year ago

Hello @tpriest0,

Thank you for your interest in GraphBin2.

The line that raises this error has been commented out in the current version of the code. Can you get the latest version of GraphBin2 from GitHub and try running it again?

Let me know if the error still pops up.

Thank you!

tpriest0 commented 1 year ago

Hey @Vini2

Thank you for the quick response.

I have cloned the git repository and recreated the conda environment.

Now I am receiving the following error:

2023-01-06 10:03:56,636 - INFO - GraphBin2 started
Traceback (most recent call last):
  File "/XXX/XXX/XXX/software/GraphBin2/src/graphbin2_Flye.py", line 102, in <module>
    with open(contig_paths, "r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'None'

This suggests a problem with the contigs file or at least the opening of it. The contigs file was produced by the 'gfa2fasta.py' script.
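
A note on the message itself: open(None) would raise a TypeError, so a FileNotFoundError naming 'None' means the path variable held the string "None", which typically happens when an optional argument that was never supplied is converted to a string. The sketch below reproduces that and shows one way to fail earlier with a clearer message. It is an illustration under assumptions: check_input_path and the flag name "--example-flag" are hypothetical and do not come from GraphBin2.

```python
import os


def check_input_path(path, flag):
    """Validate a file argument before it is opened.

    Hypothetical helper for illustration; 'flag' is only used in messages.
    """
    # An unset argument may arrive as None or as the string "None".
    if path is None or str(path) == "None":
        raise ValueError(f"missing required argument {flag}")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{flag}: no such file: {path}")
    return path


# Opening the string form of None looks for a file literally named "None".
try:
    open(str(None), "r")
except FileNotFoundError as err:
    print(err)

# The explicit check reports the missing argument instead.
try:
    check_input_path(None, "--example-flag")
except ValueError as err:
    print(err)
```

Seen this way, the error points less at the contents of the contigs file and more at an input the script expected but did not receive on the command line.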

Many thanks

Vini2 commented 1 year ago

Hello @tpriest0,

Can you share with me the exact command you used to run GraphBin2?

Thank you!

tpriest0 commented 1 year ago

Hey @Vini2

I am using the same graphbin2 command that I originally posted:

graphbin2 --assembler Flye --graph assembly_graph.gfa --contigs assembly_edges.fasta --binned metabat_contig_assignments.txt --abundance contig_depth.txt --output graphbin2_output/

Many thanks!

Vini2 commented 2 months ago

Hello @tpriest0,

GraphBin2 has been updated to handle the original Flye contigs. Feel free to give it a try.

Closing this issue.