Closed YangQi007 closed 3 months ago
@YangQi007 I see that X and Y are still hard coded to 23 and 24 in two places. We should make that dynamic based on the format and ordering in the header.
The X and Y are converted to the numbers 23 and 24 in these two places because they're used for getting the readDepth data. The order for "X" and "Y" in the readDepth is #23 and #24.
Does it always use 23 and 24, even if the file is missing a chromosome in the middle? For example, what if chr5 was completely gone, would 23 and 24 be there, or would it be 22 and 23?
Looking at the source code for baiReadDepth, it doesn't appear to be doing any special logic to handle X/Y:
https://github.com/chmille4/bamReadDepther/blob/master/bamReadDepther.cpp
I think it just goes in order. So we're going to need to use the header as our single source of truth. We need to follow its numbering and ordering of chromosomes. Also, we should start calling them "references" or "reference sequences", not "chromosomes", since they won't necessarily be chromosomes.
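To illustrate the concern, here is a minimal sketch of how the position of "X" shifts when a reference is missing. It assumes the header is an array of `{ sn, length }` objects in file order; the `buildHeader` and `refIndex` helpers and the placeholder lengths are hypothetical, not taken from this PR.

```javascript
// Hypothetical header builder: one { sn, length } entry per reference, in file order.
function buildHeader(names) {
  return names.map((sn) => ({ sn, length: 1000 })); // lengths are placeholders
}

const autosomes = Array.from({ length: 22 }, (_, i) => String(i + 1));
const fullHeader = buildHeader([...autosomes, "X", "Y"]);
const headerWithoutChr5 = buildHeader([
  ...autosomes.filter((sn) => sn !== "5"),
  "X",
  "Y",
]);

// Zero-based position of a reference name in the header, or -1 if absent.
function refIndex(header, name) {
  return header.findIndex((ref) => ref.sn === name);
}

const xInFullHeader = refIndex(fullHeader, "X");
const xWithoutChr5 = refIndex(headerWithoutChr5, "X");
console.log(xInFullHeader, xWithoutChr5);
```

If the read depth data really does follow header order, any hard-coded position for X or Y would be off by one as soon as an earlier reference is absent.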
Looking at the way you're using bamHeaderMap, I think it's more complicated than necessary. I think all you need is an indexMap that converts ref names (i.e. "sn") to the index, then you can use the index directly on bamHeader. So it would look something like this:
indexMap = bamHeader.reduce((acc, ref, index) => { acc[ref.sn] = index; return acc; }, {});
const selectedChromosomeData = data[indexMap[chromosome]];
I think this will simplify your code. It helps keep bamHeader as the single source of truth.
Here I keep the object and add the index into it because I need to get "sn" and "length" as well. E.g.
const selectedChromosomeData = data[bamHeaderMap[chromosome].index];
const chromosomeLength = bamHeaderMap[chromosome].length;
Why can't you do:
const selectedChromosomeData = data[indexMap[chromosome]];
const chromosomeLength = bamHeader[indexMap[chromosome]].length;
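Putting the suggestion together, a self-contained sketch might look like the following. The sample `bamHeader` and `data` values, and the `lookupReference` helper, are made up for illustration; the only assumption carried over from the thread is that `data` is ordered the same way as `bamHeader`.

```javascript
// Hypothetical inputs: header entries and read depth data in the same order.
const bamHeader = [
  { sn: "chr1", length: 1000 },
  { sn: "chrX", length: 800 },
  { sn: "chrY", length: 600 },
];
const data = [
  [10, 12, 11], // depth values for chr1 (placeholder numbers)
  [5, 6, 4], // depth values for chrX
  [1, 0, 2], // depth values for chrY
];

// Map each reference name ("sn") to its position in the header.
const indexMap = bamHeader.reduce((acc, ref, index) => {
  acc[ref.sn] = index;
  return acc;
}, {});

// Look up depth data and length for a reference name; null if the name is unknown.
function lookupReference(name) {
  const index = indexMap[name];
  if (index === undefined) {
    return null;
  }
  return { depth: data[index], length: bamHeader[index].length };
}

const selected = lookupReference("chrX");
console.log(selected);
```

With this shape, bamHeader stays the only place that stores names and lengths, and indexMap holds nothing but positions.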
1. Switched to the conventional region format
2. Changed the chromosome region from 0-indexed to 1-indexed
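As a rough sketch of what such a conversion could look like: the function below assumes the internal coordinates are 0-based and half-open, and that the conventional format is the 1-based inclusive `name:start-end` string. Neither assumption is confirmed by the commit note, and `formatRegion` is a hypothetical name.

```javascript
// Convert a 0-based, half-open interval [start, end) into a
// 1-based, inclusive region string of the form "name:start-end".
// Assumption: this is the input convention; the commit note does not say.
function formatRegion(refName, zeroBasedStart, zeroBasedEnd) {
  if (zeroBasedStart < 0 || zeroBasedEnd <= zeroBasedStart) {
    throw new RangeError("invalid interval");
  }
  const oneBasedStart = zeroBasedStart + 1; // shift start by one
  const oneBasedEnd = zeroBasedEnd; // exclusive 0-based end equals inclusive 1-based end
  return `${refName}:${oneBasedStart}-${oneBasedEnd}`;
}

console.log(formatRegion("chr1", 0, 100));
```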