lawremi / rtracklayer

R interface to genome annotation files and the UCSC genome browser

import.chain refuses chain file with spaces instead of tabs #23

Closed RoelKluin closed 1 year ago

RoelKluin commented 4 years ago

Chain files from the Ensembl FTP site (ftp://ftp.ensembl.org/pub/assembly_mapping/homo_sapiens/) use spaces in the rows after the header, whereas UCSC chain files use tabs.

Consequently, import.chain fails on them:

 ch = import.chain("GRCh37_to_GRCh38.chain")
Error in .local(con, format, text, ...) :
  expected 11 elements in header, got 1, on line 4

It would be nice if import.chain accepted spaces as well as tabs.

Until then, converting the spaces to tabs before the import works (in bash, with GNU sed):

sed -r 's/^([0-9]+) ([0-9]+) ([0-9]+)$/\1\t\2\t\3/' GRCh37_to_GRCh38.chain > GRCh37_to_GRCh38_tabs.chain
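
The `-r` flag and the `\t` escape in that command are GNU sed extensions, so here is a minimal sketch of the same conversion using awk, which may be more portable. The function name `chain_spaces_to_tabs` is hypothetical, and the sketch assumes that header lines start with the word `chain` and should be left as they are, while every other non-empty line is an alignment row whose fields should be tab-separated.

```shell
# Hypothetical helper: reads a chain file on stdin, writes the converted file to stdout.
chain_spaces_to_tabs() {
  awk 'BEGIN { OFS = "\t" }        # join rebuilt records with tabs
       /^chain/ { print; next }    # header lines pass through unchanged
       NF > 0   { $1 = $1 }        # reassigning a field rebuilds the row with OFS
       { print }'                  # print rows and blank separator lines
}
```

It could then be used in place of the sed command, for example `chain_spaces_to_tabs < GRCh37_to_GRCh38.chain > GRCh37_to_GRCh38_tabs.chain`. Rows that hold a single number (the last row of each chain block) come out unchanged, as they do with the sed version.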

pre-mRNA commented 2 years ago

Very helpful, thanks. It would be great if this functionality could be added directly to rtracklayer.