BinPro / CONCOCT

Clustering cONtigs with COverage and ComposiTion

issues about concoct #186

Closed Shirley-Q closed 4 years ago

Shirley-Q commented 6 years ago

I ran this command: concoct -c 40 --coverage_file concoct-input/concoct_inputtableR.tsv --composition_file contigs/velvet_71_c10K.fa -b concoct-output/ Then I got an error: [screenshot from 2018-03-16 19_28_08] I cannot resolve it and need your help. Thanks!

alneberg commented 6 years ago

Hi @qiaoxuejiao,

most likely, something is wrong with your coverage file. Check that it looks correct and ensure that it has tab-separated columns. If you find no error, I will have to look into this error more closely, preferably using input files with which you can replicate the problem.

Best Wishes, Johannes
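
For readers hitting the same error, the check suggested above can be sketched in Python. This is only a minimal sketch, not part of CONCOCT: the function name `check_coverage_table` is hypothetical, and it assumes the table has one header line, the contig id in the first column, and numeric values in every other column.

```python
import csv
import io

def check_coverage_table(handle):
    """Return a list of problems found in a tab-separated coverage table.

    Assumption (not taken from CONCOCT itself): one header row, contig id
    in the first column, numeric values in all remaining columns.
    """
    problems = []
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    if len(header) < 2:
        # A single column usually means the file is not tab-separated.
        problems.append("header has fewer than 2 tab-separated columns")
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} columns, found {len(row)}"
            )
            continue
        for value in row[1:]:
            try:
                float(value)
            except ValueError:
                problems.append(f"line {line_no}: non-numeric value {value!r}")
                break
    return problems

# Usage with an in-memory example; for a real file, pass open(path, newline="").
example = io.StringIO("contig\tsample1\ncontig_1\t3.5\n")
print(check_coverage_table(example))
```

An empty list means none of these basic problems were found; it does not guarantee that CONCOCT will accept the file.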

Shirley-Q commented 6 years ago

Hi @alneberg, I cannot find an error. Could you help me look into this error more closely? Thanks!

sample1_L1_concoct_inputtableR.pdf

Best wishes, Xuejiao

alneberg commented 6 years ago

Woah! A thousand-page PDF! I think the problem could be that the contig ids are digits only. We've had this issue before. Could you try adding, for example, "contig_" before all contig ids?

Hope this will work. Johannes
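
The renaming suggested above has to be applied to both the FASTA file and the coverage table so that the ids still match. A minimal sketch in Python, assuming the coverage table has a header line and the contig id in its first column; the function names are hypothetical and not part of CONCOCT.

```python
def prefix_fasta_ids(lines, prefix="contig_"):
    """Yield FASTA lines with `prefix` added to the id on every header line."""
    for line in lines:
        if line.startswith(">"):
            yield ">" + prefix + line[1:]
        else:
            yield line

def prefix_table_ids(lines, prefix="contig_"):
    """Yield coverage-table lines with `prefix` added to the first column.

    Assumption: the first line is a header and is left unchanged;
    blank lines are passed through untouched.
    """
    for i, line in enumerate(lines):
        if i == 0 or not line.strip():
            yield line
        else:
            yield prefix + line

# Usage on small in-memory examples:
print(list(prefix_fasta_ids([">123 len=5\n", "ACGT\n"])))
print(list(prefix_table_ids(["contig\tcov\n", "123\t4.0\n"])))
```

Both generators work on any iterable of lines, so an open file handle can be passed in and the result written to a new file with `writelines`.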

Shirley-Q commented 6 years ago

I added "contig_" before all contig ids, but it did not work. I think I had prepared a wrong input file, so I ran the whole process again, but now I get another error. After running this command: [image] I got this error: [error] Thanks very much! Xuejiao

Shirley-Q commented 6 years ago

Hi, I want to know where the bins are. These are the contents of my concoct-output directory: [error] Thanks very much!

alneberg commented 6 years ago

Hi again @qiaoxuejiao,

the ClusterPlot.R issue is a known issue which is fixed in the latest online version of the script: https://github.com/BinPro/CONCOCT/blob/master/scripts/ClusterPlot.R

Your clustering results are available in the clustering_gt1000.csv file.

In order to get the clusters in fasta format you can use the script:

https://github.com/BinPro/CONCOCT/blob/master/scripts/extract_fasta_bins.py

But if you used the cut_up_fasta.py script, you probably want to merge the clustering first. You can do that using: https://github.com/EnvGen/toolbox/blob/master/scripts/concoct/merge_cutup_clustering.py

Best Wishes, Johannes
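
To illustrate what merging a cut-up clustering involves, here is a minimal Python sketch. It is not the code of merge_cutup_clustering.py. It assumes that each cut-up part is named `<original id>.<part number>` and that the original contig should get the cluster most of its parts were assigned to; both the naming scheme and the majority vote are assumptions made for this illustration.

```python
from collections import Counter, defaultdict

def merge_cutup_clustering(rows):
    """Map each original contig id to the most common cluster of its parts.

    `rows` is an iterable of (part_id, cluster_id) pairs, e.g. as read from
    a two-column clustering CSV. Assumption: parts are named
    "<original id>.<part number>"; ids without such a suffix are kept as-is.
    """
    votes = defaultdict(Counter)
    for part_id, cluster_id in rows:
        original_id, sep, suffix = part_id.rpartition(".")
        if not sep or not suffix.isdigit():
            original_id = part_id
        votes[original_id][cluster_id] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}

# Usage on an in-memory example:
rows = [("c1.0", "3"), ("c1.1", "3"), ("c1.2", "7"), ("c2.0", "7")]
print(merge_cutup_clustering(rows))
```

Note that the input here is the clustering table, not a FASTA file.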

Shirley-Q commented 6 years ago

Hi, when I run python merge_cutup_clustering.py, something goes wrong. Is there something wrong with my usage? [screenshot from 2018-03-23 12_56_28] Thank you very much!

alneberg commented 6 years ago

Yes, you should run that script on the clustering file and not the original fasta file.

Johannes

Shirley-Q commented 6 years ago

Hi, I ran the commands for those two scripts.

1. [screenshot from 2018-03-24 22_01_37] The following is the result: [screenshot from 2018-03-24 22_01_02] The results are displayed directly on the terminal. Should I generate a new file? What should I do?

2. For the extract_fasta_bins.py script, I use the velvet_71_c10K.fa file and clustering_gt1000.csv, right?

Thanks very much!
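
As a rough illustration of what splitting a FASTA file into bins involves, the following Python sketch groups FASTA records by cluster. It is not the code of extract_fasta_bins.py; the function name is hypothetical. It assumes the contig ids in the clustering match the first word of each FASTA header exactly, so a cut-up FASTA pairs with the cut-up clustering and the original FASTA pairs with a merged clustering.

```python
from collections import defaultdict

def group_fasta_by_cluster(fasta_lines, clustering):
    """Return {cluster_id: [fasta lines]} for contigs present in `clustering`.

    `clustering` maps contig id -> cluster id. Assumption: the contig id is
    the first whitespace-separated word after ">" in the FASTA header.
    Contigs missing from `clustering` are skipped.
    """
    bins = defaultdict(list)
    current = None  # cluster of the record currently being read
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            contig_id = fields[0] if fields else ""
            current = clustering.get(contig_id)
        if current is not None:
            bins[current].append(line)
    return dict(bins)

# Usage on an in-memory example; each value could be written to its own file.
fasta = [">a.0\n", "ACGT\n", ">b.0\n", "GG\n", ">c.0\n", "TT\n"]
print(group_fasta_by_cluster(fasta, {"a.0": "1", "b.0": "2"}))
```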

Shirley-Q commented 6 years ago

Hi @alneberg, I want to determine the most suitable number of genomes. What should I do?

alneberg commented 6 years ago

In most cases you're fine running with the default parameter '-c 400'. If you have both highly diverse samples AND very deeply sequenced samples, you could increase this number.