vcf_first_vt = allel.VariantTable({'CHROM' : vcf_first['variants/CHROM'], 'POS' : vcf_first['variants/POS'], 'REF' : vcf_first['variants/REF'], 'ALT' : vcf_first['variants/ALT'][:,0], 'GT' : vcf_first['calldata/GT'][:,0,0], 'PS' : vcf_first['calldata/PS'][:,0]}, index=('CHROM','POS'))
File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4517, in __init__
self.set_index(index)
File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4542, in set_index
index = SortedMultiIndex(self[index[0]], self[index[1]],
File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 4036, in __init__
l1 = SortedIndex(l1, copy=copy)
File "/home/nbailey/anaconda3/lib/python3.9/site-packages/allel/model/ndarray.py", line 3384, in __init__
raise ValueError('values must be monotonically increasing')
ValueError: values must be monotonically increasing
Hello,
I have been receiving the error shown above.
To clarify, my Python script contains the vcf_first_vt definition shown above, and that line raises the error. I can avoid the error as long as the chromosomes are sorted lexicographically (e.g. chr1, chr10, chr2 instead of chr1, chr2, chr10) and I remove chromosome names without numbers (e.g. chrX and chrY). This seems odd to me, as I assumed something like "chr1" would be treated as a string (as in the example here: https://scikit-allel.readthedocs.io/en/stable/model/ndarray.html?highlight=sortedmultiindex#sortedmultiindex). I suppose lexicographic sorting makes sense if the names are compared as strings, but I don't understand why they need to be sorted in any particular order at all. Is there a way of defining a VariantTable that I'm missing that would allow chromosomes to appear in an arbitrary order? If not, would it be possible to make this requirement more explicit?
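To illustrate the ordering behaviour I am describing, here is a minimal sketch in plain Python. It does not call scikit-allel at all: is_monotonic_increasing is a hypothetical helper that only mimics what I assume the index check does (a plain string comparison of adjacent values), and the sample labels and positions are made up for the example.

```python
# Minimal sketch in plain Python; nothing here is scikit-allel API.

def is_monotonic_increasing(values):
    """Return True if every element is >= the element before it."""
    return all(a <= b for a, b in zip(values, values[1:]))

# "Natural" chromosome ordering, as commonly found in VCF files.
natural = ['chr1', 'chr2', 'chr10', 'chrX', 'chrY']
# The same labels ordered by ordinary string comparison.
lexicographic = sorted(natural)

# As strings, 'chr2' > 'chr10', so the natural order is not monotonic.
print(is_monotonic_increasing(natural))
print(lexicographic)
print(is_monotonic_increasing(lexicographic))

# Reordering made-up records by (CHROM, POS), comparing CHROM as a string.
chrom = ['chr2', 'chr10', 'chr1', 'chr10']
pos = [500, 300, 100, 200]
order = sorted(range(len(chrom)), key=lambda i: (chrom[i], pos[i]))
sorted_chrom = [chrom[i] for i in order]
sorted_pos = [pos[i] for i in order]
print(sorted_chrom, sorted_pos)
```

If the index check really is a string comparison, applying the same order to every column (REF, ALT, GT, PS and so on) before building the table should presumably satisfy it, though that only works around the requirement rather than explaining it.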