samtools / hts-specs

Specifications of SAM/BAM and related high-throughput sequencing file formats
http://samtools.github.io/hts-specs/

is `*` better than `\*`? #748

Closed zhitian-wu closed 10 months ago

zhitian-wu commented 10 months ago

I think the manual for the SAM format is very clear. I have only one trivial question.

On page 6 of SAMv1.pdf, you used `\*` to denote a single `*` character. But it is also possible to omit the backslash, since `*` is the first character in the regular expression (as mentioned on page 236 of the latest POSIX standard).

jmarshall commented 10 months ago

You are correct that if these REs were basic REs, then either `*` or `\*` could be used equivalently. Nonetheless I think `\*` is clearer: if we changed it to `*|[A-Za-z=.]+`, I expect we would get bug reports asking whether we had made a typo by omitting a character before the apparent zero-or-more `*` operator.

However, §1 specifies that the SAM specification uses extended REs. For EREs, POSIX (in the current Issue 7) says an `<asterisk>` appearing first in the ERE produces undefined results. So in fact plain `*` would be incorrect, as we use EREs.
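
As a rough illustration of why the escaped form is the safer choice, here is a minimal sketch using Python's `re` module. Note the assumptions: Python's regex dialect is not POSIX ERE, so this only shows how one widely used engine treats the two spellings, not what POSIX mandates. The helper names `compiles` and `is_valid_value` are hypothetical and exist only for this example.

```python
import re

# Pattern as discussed in this thread, with the asterisk escaped.
ESCAPED = r"\*|[A-Za-z=.]+"
# Hypothetical variant with the backslash omitted.
UNESCAPED = r"*|[A-Za-z=.]+"


def compiles(pattern):
    """Return True if Python's re engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


def is_valid_value(value):
    """Check a whole field value against the escaped pattern."""
    return re.fullmatch(ESCAPED, value) is not None


if __name__ == "__main__":
    print("escaped compiles:  ", compiles(ESCAPED))
    print("unescaped compiles:", compiles(UNESCAPED))
    for value in ["*", "ACGT", "", "AC*T"]:
        print(repr(value), is_valid_value(value))
```

Python rejects the unescaped pattern outright because the leading `*` has nothing to repeat, while other engines may accept it or interpret it differently. The escaped form avoids that ambiguity.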