MGXlab / CAT_pack

CAT/BAT/RAT: tools for taxonomic classification of contigs and metagenome-assembled genomes (MAGs) and for taxonomic profiling of metagenomes
MIT License

CAT: Database was built with a different version of Diamond and is incompatible. #119

Open ShangLiii opened 3 months ago

ShangLiii commented 3 months ago

Hi, I am using the prebuilt CAT GTDB database (version 20231120), and DIAMOND reports that the database was built with an incompatible version. However, when I checked the log of "20231120_CAT_gtd", it says: [2023-11-21 00:14:46] ERROR: DIAMOND database could not be created. Is the file corrupted? If not, does anyone know which DIAMOND version I should use?
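One way to tell a truncated database file from a genuine version mismatch is to ask DIAMOND itself what it sees in the file. The sketch below is a minimal, hedged example: it assumes `diamond` is on the PATH and relies on the `diamond dbinfo` subcommand. The function names and the path in the usage note are hypothetical and not part of CAT_pack.

```python
import shutil
import subprocess
from pathlib import Path


def dbinfo_command(db_path):
    """Build the `diamond dbinfo` command line for a .dmnd file."""
    return ["diamond", "dbinfo", "--db", str(db_path)]


def inspect_database(db_path):
    """Return a short text report about a DIAMOND database file."""
    db_path = Path(db_path)
    if not db_path.is_file():
        return f"{db_path}: file not found"
    if db_path.stat().st_size == 0:
        # An empty file suggests the database build step did not finish.
        return f"{db_path}: file is empty"
    if shutil.which("diamond") is None:
        return "diamond executable not found on PATH"
    # Return whatever DIAMOND reports, including any error it raises.
    result = subprocess.run(
        dbinfo_command(db_path), capture_output=True, text=True
    )
    return (result.stdout + result.stderr).strip()
```

Calling `print(inspect_database("/path/to/CAT_db/gtdb/db/database.dmnd"))` (a placeholder path) would then show DIAMOND's own description of the file, or its error message if the file cannot be read.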

thauptfeld commented 3 months ago

Hi,

Can you post the diamond version that you are using and the code that you are trying to run? That will help me understand better where things are going wrong.

Cheers, Tina

ShangLiii commented 2 months ago

Hi, I am using DIAMOND 2.1.8 (installed through Bioconda). My command is:

python CAT_pack contigs -c $workdir/contigs_1000bp.fasta -d $Prjdir/CAT_db/gtdb/db -t $Prjdir/CAT_db/gtdb/tax --sensitive -o $catdir/${i} --force

Nevertheless, I have managed to run CAT by preparing a custom database through 'CAT_pack download'. Thank you anyway!

By the way, I am now trying to run RAT with a self-prepared nr database. However, CAT failed without any informative error message. Here is the log:

RAT is running. Mapping reads against assembly with bwa mem.

[2024-04-08 16:53:45] Running bwa mem for read mapping. File /data2/lishang/CAT/m2.Lma.12/m2.Lma.12.rat.contigs_1000bp.fasta.m2.Lma.12_R1.fixed.fq.00.0_0.cor.fastq.gz.bwamem.sorted will be generated.Do not forget to cite bwa mem and samtools when using RAT in your publication!
[2024-04-08 16:53:45] Contigs fasta is already indexed.
[2024-04-08 16:53:45] Running bwa mem...
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 1288846 sequences (193326900 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 630995, 3970, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (241, 357, 502)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 1024)
[M::mem_pestat] mean and std.dev: (384.48, 169.42)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1285)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (663, 768, 892)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (205, 1350)
[M::mem_pestat] mean and std.dev: (751.45, 138.52)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 1579)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 1288846 reads in 94.948 CPU sec, 1.620 real sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -t 112 /data2/lishang/Contig/m2.Lma.12/contigs_1000bp.fasta /data2/lishang/Contig/m2.Lma.12/corrected/m2.Lma.12_R1.fixed.fq.00.0_0.cor.fastq.gz /data2/lishang/Contig/m2.Lma.12/corrected/m2.Lma.12_R2.fixed.fq.00.0_0.cor.fastq.gz
[main] Real time: 5.615 sec; CPU: 98.723 sec
[2024-04-08 16:53:51] Sorting bam file...
[bam_sort_core] merging from 0 files and 112 in-memory blocks...
[2024-04-08 16:53:52] Read mapping done!

[2024-04-08 16:53:52] No contig2classification file supplied. Running CAT on contigs.
prodigal
[2024-04-08 16:53:52] Running CAT.
[2024-04-08 16:53:52] ERROR: CAT finished abnormally.

My command is:

python CAT_pack reads --mode cr -c $workdir/contigs_1000bp.fasta -1 $workdir/corrected/${i}_R1.fixed..fastq.gz -2 $workdir/corrected/${i}_R2.fixed..fastq.gz -d $Prjdir/CAT_db/nr/db -t $Prjdir/CAT_db/nr/tax -o $catdir/${i}.rat --force
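When a long log ends in a terse failure like the one above, it can help to pull out only the timestamped ERROR entries. The following is a minimal sketch, assuming log entries start with a `[YYYY-MM-DD HH:MM:SS]` stamp as in the pasted output. The function names are made up for illustration and are not part of CAT_pack.

```python
import re

# A timestamped entry runs until the next timestamp or the end of the text,
# so it also works when the log was pasted as a single line.
LOG_ENTRY = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(.*?)(?=\s*\[\d{4}-\d{2}-\d{2} |\Z)",
    re.S,
)


def split_entries(log_text):
    """Return (timestamp, message) pairs for every timestamped entry."""
    return [(ts, msg.strip()) for ts, msg in LOG_ENTRY.findall(log_text)]


def find_errors(log_text):
    """Return only the entries whose message starts with 'ERROR:'."""
    return [
        (ts, msg)
        for ts, msg in split_entries(log_text)
        if msg.startswith("ERROR:")
    ]


if __name__ == "__main__":
    sample = (
        "[2024-04-08 16:53:52] Running CAT. "
        "[2024-04-08 16:53:52] ERROR: CAT finished abnormally."
    )
    for timestamp, message in find_errors(sample):
        print(timestamp, message)
```

Untimestamped lines such as the `[M::mem_pestat]` output from bwa stay attached to the preceding timestamped entry, so they do not show up as separate records.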

lxsteiner commented 1 month ago

Could you please post the versions of the dependencies you used to generate the databases, and which versions of the software listed below have been tested or are known to work with CAT?

Python 3, https://www.python.org/.
DIAMOND, https://github.com/bbuchfink/diamond.
Prodigal, https://github.com/hyattpd/Prodigal.
BWA, https://github.com/lh3/bwa.
SAMtools, http://www.htslib.org/download/.

Thank you.
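For anyone reporting a similar problem, the installed versions of these dependencies can be collected in one go. This is a rough sketch, not something CAT_pack provides: it assumes the tools are on the PATH and that `diamond version`, `prodigal -v`, `samtools --version`, and a bare `bwa` call (which prints a usage text containing a "Version:" line) report the version as they usually do. The helper names are hypothetical.

```python
import re
import shutil
import subprocess
import sys

# Commands that make each tool print its version. bwa has no version flag,
# so it is run without arguments to get its usage text.
VERSION_COMMANDS = {
    "diamond": ["diamond", "version"],
    "prodigal": ["prodigal", "-v"],
    "bwa": ["bwa"],
    "samtools": ["samtools", "--version"],
}

# First dotted number in the output, e.g. 2.1.8 or 0.7.17
# (suffixes such as "-r1188" are dropped).
VERSION_PATTERN = re.compile(r"\d+(?:\.\d+)+")


def extract_version(text):
    """Return the first dotted version number in text, or None."""
    match = VERSION_PATTERN.search(text)
    return match.group(0) if match else None


def tool_version(name):
    """Run the tool's version command and parse its output."""
    command = VERSION_COMMANDS[name]
    if shutil.which(command[0]) is None:
        return "not found"
    result = subprocess.run(
        command, capture_output=True, text=True, stdin=subprocess.DEVNULL
    )
    # Some tools print the version to stderr, so search both streams.
    return extract_version(result.stdout + result.stderr) or "unknown"


def report():
    """Collect the Python version plus the version of each external tool."""
    versions = {"python": sys.version.split()[0]}
    for name in VERSION_COMMANDS:
        versions[name] = tool_version(name)
    return versions


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}\t{version}")
```

Pasting the resulting table into the issue gives maintainers the information requested earlier in the thread.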