nf-core / sarek

Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
https://nf-co.re/sarek
MIT License

too many genotypes in the combined VCF record #1560

Open fan040 opened 3 months ago

fan040 commented 3 months ago

Description of the bug

Hi, I found that the following process is going wrong:

process > NFCORE_SAREK:SAREK:BAM_VARIANT_CALLING_GERMLINE_ALL:BAM_JOINT_CALLING_GERMLINE_GATK:GATK4_GENOTYPEGVCFS (joint_variant_calling)

I checked the log file and found these messages:
Sample/Callset D_D900002( TileDB row idx 0) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900005( TileDB row idx 1) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900006( TileDB row idx 2) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900035( TileDB row idx 3) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900036( TileDB row idx 4) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900038( TileDB row idx 5) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900039( TileDB row idx 6) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900049( TileDB row idx 7) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900053( TileDB row idx 8) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900054( TileDB row idx 9) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles, 
ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added         for this sample for this location.

Sample/Callset D_D900055( TileDB row idx 10) at Chromosome chr4D_part1 position 10150573 (TileDB column 7817502166) has too many genotypes in the combined VCF record : 1081 : current limit : 1024 (num_alleles,
 ploidy) = (46, 2). Fields, such as  PL, with length equal to the number of genotypes will NOT be added        for this sample for this location.

How can I solve this problem? Looking forward to a reply. Thank you!

Command used and terminal output

nextflow run /cluster/home/fanrong/biosofts/nextflow/nf-core/nf-core-sarek/3_4_0 -profile zwnj_2022 -offline --input /public/home/fanrong/projects/AK58/03_nfcore/samplesheet.csv --outdir ./result --step mapping --fasta /public/home/fanrong/projects/AK58/fasta/wheat_AK58v4MP.genome_part.fa --fasta_fai /public/home/fanrong/projects/AK58/fasta/wheat_AK58v4MP.genome_part.fa.fai --dict /public/home/fanrong/projects/AK58/fasta/wheat_AK58v4MP.genome_part.dict --trim_fastq --aligner bwa-mem2 --max_cpus 16 --max_memory 100.GB --max_time 720.h --tools haplotypecaller --bwamem2 /cluster/home/fanrong/work_lei/BSA/01_ZG/02_mapping_3/work/33/706371da5349b7b033e07f32ae9d58/bwamem2 --igenomes_ignore --genome null --skip_tools baserecalibrator,haplotypecaller_filter --joint_germline --split_fastq 0

Relevant files

No response

System information

No response

FriederikeHanssen commented 3 weeks ago

Hi! Apologies for the delayed reply. I am not sure what the Sarek-related issue is. Is this a TileDB limitation, or is there something wrong with the VCF file?
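
For reference, the numbers in the log are consistent with the standard VCF genotype count: for N alleles (including the reference) and ploidy P, the number of possible genotypes is C(N + P - 1, P). With (num_alleles, ploidy) = (46, 2) this gives 1081, which exceeds the reported limit of 1024. A minimal Python sketch of that arithmetic follows; the function names are illustrative only and are not part of GATK, GenomicsDB, or Sarek.

```python
from math import comb


def genotype_count(num_alleles: int, ploidy: int) -> int:
    """Number of unordered genotypes for the given allele count and ploidy.

    Uses the combinations-with-repetition formula C(N + P - 1, P).
    num_alleles is assumed to include the reference allele.
    """
    return comb(num_alleles + ploidy - 1, ploidy)


def max_alleles_within_limit(limit: int, ploidy: int) -> int:
    """Largest allele count whose genotype count stays at or below `limit`."""
    n = 1
    while genotype_count(n + 1, ploidy) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    # Values taken from the log message: (num_alleles, ploidy) = (46, 2)
    print(genotype_count(46, 2))
    # Largest diploid allele count that fits under the reported limit of 1024
    print(max_alleles_within_limit(1024, 2))
```

Under these assumptions, a diploid site can carry at most 44 alleles before the genotype count passes 1024, so the site at chr4D_part1:10150573 is just over the threshold. That suggests the message reflects a highly multi-allelic site hitting a configured genotype limit, not necessarily a malformed input file.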