Open lbeltrame opened 5 years ago
Check constraint violations typically arise when the width of the data being inserted exceeds the maximum width defined for the column in the database. My guess would be that you have a chromosome label longer than 10 characters, but could you check against the max widths of the columns, as defined here: https://github.com/quinlan-lab/vcf2db/blob/master/vcf2db.py#L628-L665?
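One way to run that check is to scan the VCF and report the longest value seen in each fixed column, then compare those numbers by hand against the widths in the linked source. This is a minimal sketch: `max_field_widths` is a hypothetical helper, not part of vcf2db, and it only looks at the first seven VCF columns.

```python
def max_field_widths(lines):
    """Return the longest string seen in each fixed VCF column.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Header lines (starting with '#') are skipped.
    """
    names = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER"]
    widths = dict.fromkeys(names, 0)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # zip stops at the shorter sequence, so INFO and sample columns are ignored
        for name, value in zip(names, fields):
            widths[name] = max(widths[name], len(value))
    return widths
```

For a plain-text file this could be called as `with open(path) as fh: print(max_field_widths(fh))`; for a bgzipped VCF, `gzip.open(path, "rt")` from the standard library should work in place of `open`.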
vcf2db inserts many records at a time. When there's an error, it tries to insert them one at a time so the user gets some context on the exact record that caused the problem. It seems that something about the transaction is not working, so when it tries to insert one at a time (after supposedly failing the batch insert) it violates the unique constraint. Can you share a small VCF that recreates this problem?
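To make the batch-then-single fallback concrete, here is a rough sketch of the pattern using `sqlite3` from the Python standard library. It is not vcf2db's actual code: the `variants` table, the 10-character check and the function names are all made up for illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a toy table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variants ("
        " variant_id INTEGER PRIMARY KEY,"
        " chrom TEXT CHECK (length(chrom) <= 10))"
    )
    return conn


def insert_with_fallback(conn, rows):
    """Insert rows as one batch; on failure, roll back and retry one by one.

    Returns a list of (row, error message) pairs for rows that were rejected.
    """
    sql = "INSERT INTO variants (variant_id, chrom) VALUES (?, ?)"
    try:
        # 'with conn' commits on success and rolls back if an exception escapes
        with conn:
            conn.executemany(sql, rows)
        return []
    except sqlite3.IntegrityError:
        pass  # the whole batch was rolled back; fall through to single inserts

    failed = []
    for row in rows:
        try:
            with conn:
                conn.execute(sql, row)
        except sqlite3.IntegrityError as exc:
            # record which row was rejected and why
            failed.append((row, str(exc)))
    return failed
```

The rollback is the important part. If the rows that went in before the batch failed were not rolled back, the one-at-a-time retry would try to insert them a second time and hit the primary-key (unique) constraint instead of reporting the record that actually caused the failure.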
I don't know the exact cause, but I might be able to generate (and share privately) a VCF that exhibits the issue, at least here. Would that be possible? Where should I send it?
@arq5x I don't think so: it's a regular hg19-based VCF, so I doubt I have chromosome labels that long.
send to bpederse@gmail.com
Sent. Make sure you double-check the Spam folder; unfortunately, my institution seems to have landed on a few blacklists.
I'm hitting this repeatedly when trying to load VCFs that have been merged via `bcftools merge`, then normalized and decomposed via `vt`, and then annotated via `vcfanno`. Note that each of these had been annotated with VEP prior to merging. It's absolutely unclear what causes it.
The only other thing of note was that the PED was "generated" (I don't need this information) like this: