zanglab / SICER2


grep chrom name in the first BED column only #17

Open darked89 opened 2 years ago

darked89 commented 2 years ago

Stricter grep that matches the chromosome name in the first BED column only, so chromosome names without the "chr" prefix can be used. It is also a bit faster. The code change is minimal and has been tested.
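To illustrate why the match has to be restricted to the first column, here is a minimal Python sketch. It is not the code from this PR: the function names and sample rows are hypothetical. It assumes tab-separated BED lines with the chromosome name in column 1.

```python
import re

def first_column_pattern(chrom):
    # Anchor at the start of the line and require a tab right after the name,
    # so "1" matches chromosome "1" but not "11" or a coordinate containing 1.
    return re.compile(r"^" + re.escape(chrom) + r"\t")

def filter_bed_lines(lines, chrom):
    # Keep only the lines whose first column is exactly `chrom`.
    pattern = first_column_pattern(chrom)
    return [line for line in lines if pattern.match(line)]

# Hypothetical BED rows using chromosome names without the "chr" prefix.
bed_lines = [
    "1\t100\t200\tread1\t0\t+",
    "11\t100\t200\tread2\t0\t-",
    "2\t1100\t1200\tread3\t0\t+",
]

# An unanchored substring search for "1" hits every row above.
loose_matches = [line for line in bed_lines if "1" in line]

# The anchored first-column match keeps only the chromosome "1" row.
strict_matches = filter_bed_lines(bed_lines, "1")
```

With prefix-less names such as "1", an unanchored search also picks up "11" and any row whose coordinates or read name contain that digit. The anchored pattern avoids this.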

This is a stopgap solution, since grepping through tens of millions of rows for every chromosome could be replaced by, e.g.:

  1. requiring that the input BAM/BED files are sorted by position
  2. switching from grep to xsv
  3. indexing BED files with xsv
  4. retrieving the rows for a given chromosome, required columns only

The above makes sense if you want to introduce only minimal code changes:

  1. only xsv will need to be installed
  2. BED file remains as the principal input
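
As a rough illustration of the idea behind steps 1 and 4 of the first list (sorted input, required columns only), the sketch below reads a position-sorted BED file once and groups rows by chromosome, instead of scanning the whole file once per chromosome. It uses plain Python rather than xsv. The function name and the choice of "required" columns (chrom, start, end, strand) are assumptions made for the example.

```python
import csv
import io
from itertools import groupby

def rows_by_chromosome(handle, columns=(0, 1, 2, 5)):
    # Assumes the input is sorted so that all rows of one chromosome are
    # contiguous; groupby only merges adjacent rows with the same key.
    reader = csv.reader(handle, delimiter="\t")
    non_empty = (row for row in reader if row)  # skip blank lines
    for chrom, rows in groupby(non_empty, key=lambda row: row[0]):
        # Keep only the selected columns (0-based indices) for each row.
        yield chrom, [tuple(row[i] for i in columns) for row in rows]

# Hypothetical sorted BED content; a real run would pass an open file handle.
sample = io.StringIO(
    "1\t100\t200\tr1\t0\t+\n"
    "1\t300\t400\tr2\t0\t-\n"
    "2\t50\t150\tr3\t0\t+\n"
)

grouped = dict(rows_by_chromosome(sample))
```

The file is read a single time regardless of how many chromosomes it contains, which is the main gain over one grep pass per chromosome. If the input is not sorted, a chromosome would show up in more than one group, hence the sorting requirement.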