I've been writing up my own parser for a little edification, and have a question about the line termination scheme.
Table 1 of [1] says that the line terminator is ?U+000D?, [?U+000A?] | ?U+000A?. By my understanding, [ ] represent something optional, | is an 'or', and , is concatenation.
AFAIK, this can be represented in a regex as (\r\n?)|\n.
I think that means the recognised line terminators are: \n, \r\n, and \r.
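As a quick sanity check of the regex itself (not of my reading of the table), here is a minimal Python sketch. The names LINE_TERM, terminators, and split_lines are just mine for illustration, and I have dropped the capturing group so that re.split does not hand back the terminators as well.

```python
import re

# Same alternation as (\r\n?)|\n, without the capturing group:
# a CR optionally followed by LF, or a lone LF.
LINE_TERM = re.compile(r"\r\n?|\n")


def terminators(text):
    """Return each line terminator found in text, in order of appearance."""
    return [m.group(0) for m in LINE_TERM.finditer(text)]


def split_lines(text):
    """Split text on any of the three terminators."""
    return LINE_TERM.split(text)


sample = "a\nb\r\nc\rd"
print(terminators(sample))  # ['\n', '\r\n', '\r']
print(split_lines(sample))  # ['a', 'b', 'c', 'd']
```

Because the \r branch is tried first and takes a following \n when one is present, \r\n comes out as a single terminator rather than two.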
Is that correct?
[1] https://scripts.iucr.org/cgi-bin/paper?S1600576715021871