Open snowformatics opened 1 year ago
Hi Stefanie,
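variants cannot be accessed per chromosome like this: callset['chr7B']['variants']. Your store has no top-level 'chr7B' group; the chromosome names are stored as values in variants/CHROM, not as group keys, which is why you get the KeyError.
To get the data or metadata of the variants on chromosome chr7B as NumPy arrays, do it like this: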
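# Build an index over the chromosome and position of every variant
pos_index = allel.ChromPosIndex(callset['variants/CHROM'][:], callset['variants/POS'][:])
# Slice covering all variants on chr7B
chrom_range = pos_index.locate_key('chr7B')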
variants_reference_alleles = callset['variants/REF'][chrom_range]
variants_alternate_alleles = callset['variants/ALT'][chrom_range]
Out-of-core access to portions of the calls can be done efficiently with the get_orthogonal_selection() method of the underlying Zarr library, which loads only the requested slice of the Zarr array into a NumPy array:
calls_chr7B = callset['calldata/GT'].get_orthogonal_selection((chrom_range, slice(None), slice(None)))
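To make clear what chrom_range is, here is a rough stdlib-only sketch of the idea. It is not how scikit-allel implements it: locate_chrom is a hypothetical helper written for this illustration, the lists are toy stand-ins for the Zarr arrays, and it assumes all variants of one chromosome are stored next to each other.

```python
from itertools import groupby


def locate_chrom(chroms, key):
    """Return the slice covering the contiguous run of `key` in `chroms`.

    Hypothetical helper, not part of scikit-allel. Assumes that all
    variants of one chromosome are stored next to each other.
    """
    start = 0
    for chrom, run in groupby(chroms):
        length = sum(1 for _ in run)
        if chrom == key:
            return slice(start, start + length)
        start += length
    raise KeyError(key)


# Toy stand-ins for callset['variants/CHROM'], callset['variants/REF']
# and callset['calldata/GT'] (variants x samples x ploidy).
chrom = ['chr7A', 'chr7A', 'chr7B', 'chr7B', 'chr7B', 'chr7D']
ref = ['A', 'C', 'G', 'T', 'A', 'C']
gt = [[[0, 0], [0, 1]] for _ in chrom]

# One slice selects the same variants in every per-variant array.
chrom_range = locate_chrom(chrom, 'chr7B')
ref_chr7B = ref[chrom_range]
gt_chr7B = gt[chrom_range]  # slices the first (variants) axis only
```

The same slice works for every array whose first axis is the variants axis, which is why a single chrom_range can be reused for REF, ALT and GT.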
Hope that solves your problem?
Patrick
Thanks a lot Patrick! Looks like the tutorial I found was outdated, I will give it a try 😊
Hi,
I am trying to read my genomic data and to follow the tutorial:
callset = zarr.open('test.zarr', mode='r')
variants = callset['chr7B']['variants']
but I am getting this error:
raise KeyError(item)
KeyError: 'chr7B'
When I print a list of all chromosomes, chr7B is included:
list(callset['variants/CHROM/'])
Any ideas what I am doing wrong?
Thanks