MesserLab / SLiM

SLiM is a genetically explicit forward simulation software package for population genetics and evolutionary biology. It is highly flexible, with a built-in scripting language, and has a cross-platform graphical modeling environment called SLiMgui.
https://messerlab.org/slim/
GNU General Public License v3.0

Add contig header to VCF #426

Closed currocam closed 7 months ago

currocam commented 7 months ago

Hi!

Thanks for this amazing software! I have a question related to VCF files. Is there any reason not to include the contig header in the VCF file? According to the VCF spec, "it is highly recommended (but not required) that the header include tags describing the contigs referred to in the VCF file".

For example, I found it necessary when filtering the VCF file using bcftools. I made a simple example: model_vcf.txt

slim model_vcf.txt # creates results.vcf
bcftools view --samples i1,i2 < results.vcf

The previous chunk of code gives me the error:

#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  i1      i2
[W::vcf_parse] Contig '1' is not defined in the header. (Quick workaround: index the file with tabix.)
Undefined tags in the header, cannot proceed in the sample subset mode.

As far as I understand from the source code, the CHROM column is fixed to be 1.
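Until the header is written by SLiM itself, one possible workaround is to patch the VCF after the fact. The following is only a minimal sketch: `add_contig_header` is a hypothetical helper name, the contig ID is assumed to be `1` (the value seen in the CHROM column), and no `length` field is written because the contig length is not stated here.

```shell
# Hypothetical helper: print a VCF with a ##contig line inserted
# immediately before the #CHROM column-header line.
add_contig_header() {
    # $1: input VCF path
    # $2: contig ID (assumed to be "1", matching the CHROM column)
    awk -v id="${2:-1}" '
        /^#CHROM/ { print "##contig=<ID=" id ">" }  # emit the contig line first
        { print }                                   # then pass every line through
    ' "$1"
}
```

The patched output could then be piped onward, for example `add_contig_header results.vcf 1 | bcftools view --samples i1,i2`. The sketch does not check whether a contig line is already present, so it should not be applied twice to the same file.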

Wouldn't the PR I made do the trick (although maybe it is a bit pretentious to say that a single line is a PR)? I built it locally (using the code snippet from the TO_DO file) and did some minimal testing:

mkdir build
cd build
cmake ..
make
./slim model_vcf.txt # creates results.vcf 
bcftools view --samples i1,i2 < results.vcf
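To confirm that a given output file actually declares the contig, the header can be inspected directly. This is a rough sketch under the assumption that the contig ID is `1`; `has_contig_header` is a hypothetical helper name, not part of SLiM or bcftools.

```shell
# Hypothetical helper: succeed (exit status 0) if the VCF declares
# the given contig ID in a ##contig header line.
has_contig_header() {
    # $1: VCF path
    # $2: contig ID (assumed to be "1")
    # The ID must be followed by "," or ">" so that ID=1 does not match ID=10.
    grep -Eq "^##contig=<ID=${2:-1}[,>]" "$1"
}
```

For example, `has_contig_header results.vcf 1 && echo "contig header present"` prints the message only when the line is found.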

I'm not sure if it is a breaking change, but the tests from the command line still work. I tried:

./slim -testEidos
SUCCESS count: 6885

and

./slim -testSLiM
SUCCESS count: 36435

Please, let me know if I misunderstood something.

Best, Curro

bhaller commented 7 months ago

I don't know that much about VCF, but it seems reasonable to me. Thanks!