reneshbedre / bioinfokit

Bioinformatics data analysis and visualization toolkit
MIT License

Error with vcf file split #36

Open stanislavzlp opened 3 years ago

stanislavzlp commented 3 years ago

Hi, VCF split can fail with an error: UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('int64'), dtype('<U4')) -> None

In file analys.py

455 sub_df = read_vcf_file_df[read_vcf_file_df[id]==chrom_ids[r]]
456 # out_vcf_file = open(chrom_ids[r]+'.vcf'
457 with open(chrom_ids[r]+'.vcf', 'w') as out_vcf_file:
458     for l in info_lines:
459         out_vcf_file.write(l+'\n')

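I split a VCF file whose chromosomes are named 1, 2, 3, etc., and hit this error. The numeric chromosome IDs are apparently read as integers (int64), so chrom_ids[r]+'.vcf' tries to add an integer to a string. Please check it.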

I think converting the chromosome ID with str() by default will help avoid this kind of problem in the future. I suggest changing lines 457 and 460 to this:

457 with open(str(chrom_ids[r])+'.vcf', 'w') as out_vcf_file:
...
460 sub_df.to_csv(str(chrom_ids[r])+'.vcf', mode='a', sep='\t', index=False)
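To illustrate the idea, here is a minimal sketch of the str() conversion in isolation. The helper name chrom_vcf_name is hypothetical and not part of bioinfokit, and a plain Python int stands in for the numpy int64 values from the traceback. Both fail in the same way, because UFuncTypeError is a subclass of TypeError.

```python
def chrom_vcf_name(chrom_id):
    """Build the per-chromosome output file name.

    Hypothetical helper for illustration only; not part of bioinfokit.
    """
    # str() makes the concatenation safe whether the ID is 'chr1', 1, or numpy.int64(1)
    return str(chrom_id) + '.vcf'


# IDs as they might look after parsing: string-named and purely numeric chromosomes
chrom_ids = ['chr1', 'X', 1, 2]
names = [chrom_vcf_name(c) for c in chrom_ids]
print(names)

# Without str(), a numeric ID cannot be concatenated with a string.
# A plain int stands in for numpy.int64 here; both raise a TypeError subclass.
try:
    1 + '.vcf'
except TypeError as err:
    print('unconverted numeric ID fails:', type(err).__name__)
```

An alternative might be to force the chromosome column to be read as strings when the VCF is loaded, but converting at the point of use keeps the change limited to the two lines above.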