stschiff / msmc

Implementation of the multiple sequential markovian coalescent
GNU General Public License v3.0

msmc is too restrictive WRT chromosome IDs #7

Closed mhinsch closed 9 years ago

mhinsch commented 9 years ago

The regexp on line 118 of model/data.d restricts chromosome IDs to word characters (\w+, i.e. letters, digits, and underscores). Many file formats, however, allow essentially any non-whitespace character in chromosome or scaffold IDs, so an ID containing a character such as "." or "|" is rejected. Changing the pattern to \S+ should do the trick.
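
To illustrate the difference, here is a minimal Python sketch. msmc itself is written in D and the exact pattern on line 118 is not quoted in this thread, so the two patterns, the accepts helper, and the sample IDs below are assumptions made up for illustration, not the actual msmc code.

```python
import re

# Hypothetical stand-ins for the ID check; the real pattern in
# model/data.d is not shown in this thread.
WORD_ID = re.compile(r"\w+")      # letters, digits, underscore only
NONSPACE_ID = re.compile(r"\S+")  # any run of non-whitespace characters


def accepts(pattern, chrom_id):
    """Return True if the whole ID matches the given pattern."""
    return pattern.fullmatch(chrom_id) is not None


# Made-up sample IDs in styles commonly seen for chromosomes and scaffolds.
sample_ids = ["chr1", "scaffold_12", "NC_000001.11", "gi|12345|ref"]

for chrom_id in sample_ids:
    print(chrom_id, accepts(WORD_ID, chrom_id), accepts(NONSPACE_ID, chrom_id))
```

In this sketch, IDs containing "." or "|" fail the \w+ check but pass the \S+ check, while an ID containing a space would still be rejected by both.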

stschiff commented 9 years ago

This is "solved": I commented out the check altogether, since the data read function already does enough other checks.