mdozmorov / genome_runner

Academic Free License v3.0

Handle .BED files that have a header #1

Closed mdozmorov closed 10 years ago

mdozmorov commented 10 years ago

https://www.ncbi.nlm.nih.gov/projects/SNP/dbSNP.cgi?list=rslist adds a header to the BED file, such as “track name=…”. GR should be more foolproof against such situations. Potential solution: check whether each line starts with "chr" and ignore it if it does not.

Watch out for: chromosome names are sometimes output without the "chr" prefix, as numbers only. In that case the solution above would ignore every line in the .BED file. This should be caught with a message: "BED file problem. Check if you have tab-separated chrom, chromStart, chromEnd genomic coordinates, e.g. chrX tab 1234 tab 5678"
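
A rough sketch of the idea, in Python. This is only an illustration, not the GRTK code: the helper name `filter_bed_lines`, the `BED_LINE` pattern, and the choice to raise `ValueError` are all assumptions. It keeps lines that start with "chr" followed by two tab-separated integer coordinates, skips everything else (headers, comments), and reports the BED file problem if nothing is left.

```python
import re

# Assumed pattern (hypothetical, not the GRTK regex): "chr" plus a name,
# then tab-separated integer chromStart and chromEnd, optionally more columns.
BED_LINE = re.compile(r"^chr\S+\t\d+\t\d+(\t|$)")

BED_ERROR = (
    "BED file problem. Check if you have tab-separated chrom, chromStart, "
    "chromEnd genomic coordinates, e.g. chrX tab 1234 tab 5678"
)


def filter_bed_lines(lines):
    """Return only the lines that look like BED records.

    Header lines (for example a "track name=..." line) and anything else
    that does not start with "chr" are skipped. If no line survives,
    raise ValueError with the BED file problem message.
    """
    kept = []
    for line in lines:
        record = line.rstrip("\r\n")  # drop Unix or Windows line endings
        if BED_LINE.match(record):
            kept.append(record)
    if not kept:
        # Covers the case where chromosomes are written as bare numbers.
        raise ValueError(BED_ERROR)
    return kept
```

With a made-up input such as a track header followed by `chr1<tab>100<tab>200`, only the coordinate line is returned. A file whose chromosomes are written as `1`, `2`, ... has every line skipped and therefore triggers the error message instead of silently producing an empty result.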

lkscara commented 10 years ago

Fixed in the GRTK 'toBed' script.

mdozmorov commented 10 years ago

The egrep regular expression still needs to be tested with other file-conversion cases.