iobio / iobio-charts


Updated the chromosome region format #98

Closed YangQi007 closed 3 months ago

YangQi007 commented 3 months ago

1. Switched to the conventional region format
2. Changed chromosome regions from 0-indexed to 1-indexed

YangQi007 commented 3 months ago

@YangQi007 I see that X and Y are still hard-coded to 23 and 24 in two places. We should make that dynamic, based on the format and ordering in the header.

X and Y are converted to the numbers 23 and 24 in these two places because they are used to look up the readDepth data, where "X" and "Y" are entries #23 and #24.

anderspitman commented 3 months ago

Does it always use 23 and 24, even if the file is missing a chromosome in the middle? For example, what if chr5 was completely gone, would 23 and 24 be there, or would it be 22 and 23?

anderspitman commented 3 months ago

Looking at the source code for baiReadDepth, it doesn't appear to be doing any special logic to handle X/Y:

https://github.com/chmille4/bamReadDepther/blob/master/bamReadDepther.cpp

I think it just goes in order. So we're going to need to use the header as our single source of truth and follow its numbering and ordering of chromosomes. Also, we should start calling them "references" or "reference sequences", not "chromosomes", since they won't necessarily be chromosomes.
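To illustrate the concern, here is a minimal sketch, assuming the header is an array of objects with an `sn` field in file order. The header contents and the `refIndex` helper are hypothetical and not taken from the PR. It shows that when a reference such as chr5 is absent, the positions of X and Y shift, so a fixed 23/24 would point at the wrong entries.

```javascript
// Hypothetical header: references "1".."22" in order, with "5" absent,
// followed by "X" and "Y". Lengths are placeholders.
const names = [];
for (let i = 1; i <= 22; i++) {
  if (i !== 5) names.push(String(i));
}
names.push("X", "Y");
const exampleHeader = names.map((sn) => ({ sn, length: 1000 }));

// Hypothetical helper: 0-based position of a reference as listed in the header.
// Returns -1 when the reference is not present.
function refIndex(header, name) {
  return header.findIndex((ref) => ref.sn === name);
}

const xIndex = refIndex(exampleHeader, "X"); // 21: only 21 autosomes precede it
const yIndex = refIndex(exampleHeader, "Y"); // 22
```

In 1-based terms, X and Y are the 22nd and 23rd references here rather than the 23rd and 24th, which is the case the question above describes.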

YangQi007 commented 3 months ago

Looking at the way you're using bamHeaderMap, I think it's more complicated than necessary. I think all you need is an indexMap that converts ref names (i.e. "sn") to the index; then you can use the index directly on bamHeader. So it would look something like this:

const indexMap = bamHeader.reduce((acc, ref, index) => {
  acc[ref.sn] = index;
  return acc;
}, {});

const selectedChromosomeData = data[indexMap[chromosome]];

I think this will simplify your code. It also helps keep bamHeader as the single source of truth.

Here I keep the object and add the index into it because I need to get "sn" and "length" as well. E.g.:

const selectedChromosomeData = data[bamHeaderMap[chromosome].index];
const chromosomeLength = bamHeaderMap[chromosome].length;

anderspitman commented 3 months ago

Here I keep the object and add the index into it because I need to get "sn" and "length" as well. E.g.:

const selectedChromosomeData = data[bamHeaderMap[chromosome].index];
const chromosomeLength = bamHeaderMap[chromosome].length;

Why can't you do:

const selectedChromosomeData = data[indexMap[chromosome]];
const chromosomeLength = bamHeader[indexMap[chromosome]].length;
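Putting the pieces from this thread together, a self-contained sketch might look like the following. The sample `bamHeader` and `data` values and the `lookupReference` helper are hypothetical; only the `sn` and `length` fields and the `indexMap` idea come from the discussion. It assumes `data` holds one entry per reference, in the same order as the header.

```javascript
// Hypothetical header entries in file order (placeholder lengths).
const bamHeader = [
  { sn: "1", length: 1000 },
  { sn: "2", length: 2000 },
  { sn: "X", length: 500 },
];

// Hypothetical read-depth data: one entry per reference, in header order.
const data = [[10, 12], [8, 9], [4, 5]];

// Map each reference name to its position in the header.
const indexMap = bamHeader.reduce((acc, ref, index) => {
  acc[ref.sn] = index;
  return acc;
}, {});

// Hypothetical helper: resolve depth data and length through the index alone,
// so bamHeader stays the single source of truth.
function lookupReference(name) {
  // Own-property check avoids matching inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(indexMap, name)) return null;
  const index = indexMap[name];
  return { depth: data[index], length: bamHeader[index].length };
}

const selected = lookupReference("X");
```

Because both lookups go through the same index, nothing from the header needs to be copied into a second map; a name that is not in the header yields `null` instead of a silently wrong entry.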