EMBL-PKU / BASALT

Output Basalt #38

Closed vicru93 closed 1 month ago

vicru93 commented 1 month ago

Hello, thank you for this great tool, it is very efficient. I have a question:

I used this command line to run my job: "BASALT -a A26_MetaSensitiva_MEGAHIT.assembly.fa -s A26_FSFP210085323-1b_H337KDSX2_L4_1.fq.gz,A26_FSFP210085323-1b_H337KDSX2_L4_2.fq.gz -t 1 2 -m 130 -qc checkm"

for which I finally got a directory that looks like this: " ls A26_FSFP210085323-1b_H337KDSX2_L4_1.fq A26_FSFP210085323-1b_H337KDSX2_L4_1.fq.gz A26_FSFP210085323-1b_H337KDSX2_L4_2.fq A26_FSFP210085323-1b_H337KDSX2_L4_2.fq.gz A26_MetaSensitiva_MEGAHIT.assembly.fa Assembly_status.txt BASALT_command.txt BASALT_log.txt binning_A26_OUT-err.o222987 binning_A26_OUT-log.o222987 Binning_A26.sh Binsets_backup.tar.gz Coverage_depth_connection_SimilarBin_files_backup.tar.gz Final_bestbinset Group_Bestbinset.tar.gz Group_comparison_files.tar.gz Group_genomes.tar.gz idba.fa Not_mapped_reads.txt Potential_bins.txt Record_error_bin.txt Total_contig_eliminated_from_deep-refinment.txt"

I understand that I must work in the directory called "Final_bestbinset", this directory looks like this: "ls bin10_SPAdes_re-assembly_contigs.1_re.2_re.fa bin11.1_re.2_re.fa bin1.1_re.fa bin12.1_re.fa bin13_IDBA_re-assembly_contigs.1_re.2_re.fa bin14.1_re.2_re.fa bin15_SPAdes_re-assembly_contigs.1_re.2_re.fa bin16.1_re.2_re.fa bin17_SPAdes_re-assembly_contigs.1_re.fa bin18_SPAdes_re-assembly_contigs.1_re.2_re.fa bin19_SPAdes_re-assembly_contigs.1_re.2_re.fa bin20.1_re.2_re.fa bin21_IDBA_re-assembly_contigs.1_re.2_re.fa bin22_SPAdes_re-assembly_contigs.1_re.2_re.fa bin23.1_re.2_re.fa bin24_SPAdes_re-assembly_contigs.1_re.2_re.fa bin25.1_re.2_re.fa bin26_SPAdes_re-assembly_contigs.1_re.2_re.fa bin27_SPAdes_re-assembly_contigs.1_re.2_re.fa bin28.1_re.2_re.fa bin2_SPAdes_re-assembly_contigs.1_re.2_re.fa bin3_SPAdes_re-assembly_contigs.1_re.2_re.fa bin4_SPAdes_re-assembly_contigs.1_re.2_re.fa bin5.1_re.2_re.fa bin6_SPAdes_re-assembly_contigs.fa bin7_SPAdes_re-assembly_contigs.1_re.2_re.fa bin8_SPAdes_re-assembly_contigs.fa bin9.1_re.2_re.fa OLC_bin_stats_ext_o.tsv OLC_bin_stats_ext.tsv"

Now my question is: what should I do with this directory? Should I run CheckM to verify bin quality? Should I run GTDB-Tk on this directory?

Best W.

EMBL-PKU commented 1 month ago

Hi. Yes, the Final_bestbinset folder contains the bins generated by BASALT. The file 'OLC_bin_stats_ext.tsv' was generated by CheckM, so you do not need to run CheckM again. You can run GTDB-Tk on this directory, but be aware that you may need to change the file suffix (e.g. to .fna) to match what gtdbtk expects. Also, we found several bugs in BASALT v1.0.1 and have just uploaded a new version (1.0.2) to GitHub. Please update BASALT. Thank you.
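
For the suffix change mentioned above, here is a minimal sketch, not part of BASALT itself. It assumes the bins end in .fa as in the listing, and that you want to leave Final_bestbinset untouched by copying into a new directory. The function name copy_bins_with_suffix and the output directory name are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_bins_with_suffix(src_dir, dst_dir, old_suffix=".fa", new_suffix=".fna"):
    """Copy every bin FASTA in src_dir to dst_dir, swapping the file suffix.

    Hypothetical helper: the name and defaults are assumptions, not BASALT API.
    Returns the list of file names written to dst_dir.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        # Skip the CheckM tables (*.tsv) and anything else that is not a bin FASTA.
        if not path.is_file() or not path.name.endswith(old_suffix):
            continue
        # Bin names contain several dots (e.g. bin11.1_re.2_re.fa),
        # so only the trailing suffix is replaced.
        target = dst / (path.name[: -len(old_suffix)] + new_suffix)
        shutil.copy2(path, target)
        copied.append(target.name)
    return copied
```

Calling copy_bins_with_suffix("Final_bestbinset", "Final_bestbinset_fna") from the BASALT output directory would give a folder of .fna files to pass to gtdbtk. As far as I know, gtdbtk classify_wf also has an --extension option that may make the renaming unnecessary; please check gtdbtk classify_wf --help for your installed version.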

vicru93 commented 1 month ago

I am working on a high-performance cluster, so I have to ask an administrator to update the tool. Is there a script to update it, or should I ask them to simply reinstall it?

Best W.