JuliaData / YAML.jl

Parse yer YAMLs

Scanner for YAML 1.2 #187

Open Paalon opened 2 weeks ago

Paalon commented 2 weeks ago

The scanner in YAML.jl v0.4.10 is specific to YAML 1.1. To support YAML 1.2 (#186), we will need version-specific modules along the following lines:

# YAML_1_1.jl

# A module for YAML 1.1.
# https://yaml.org/spec/1.1/
module YAML_1_1

# [22] b-line-feed ::= #xA /*LF*/
const b_line_feed = '\xa'

# [23] b-carriage-return ::= #xD /*CR*/
const b_carriage_return = '\xd'

# [24] b-next-line ::= #x85 /*NEL*/
const b_next_line = '\u85'

# [25] b-line-separator ::= #x2028 /*LS*/
const b_line_separator = '\u2028'

# [26] b-paragraph-separator ::= #x2029 /*PS*/
const b_paragraph_separator = '\u2029'

# [27] b-char ::=   b-line-feed | b-carriage-return | b-next-line
#                 | b-line-separator | b-paragraph-separator
is_b_char(c::Char) =
    c == b_line_feed || c == b_carriage_return || c == b_next_line ||
    c == b_line_separator || c == b_paragraph_separator

end # module YAML_1_1
# YAML_1_2.jl

# A module for YAML 1.2.
# https://yaml.org/spec/1.2.2/
module YAML_1_2

# [24] b-line-feed ::= x0A
const b_line_feed = '\x0a'

# [25] b-carriage-return ::= x0D
const b_carriage_return = '\x0d'

# [26] b-char ::= b-line-feed | b-carriage-return
is_b_char(c::Char) = c == b_line_feed || c == b_carriage_return

end # module YAML_1_2
# scanner.jl

include("YAML_1_1.jl")
using .YAML_1_1
include("YAML_1_2.jl")
using .YAML_1_2

# when processing YAML 1.1
if YAML_1_1.is_b_char(c)
    # placeholder: handle a line break under YAML 1.1 rules
end

# when processing YAML 1.2
if YAML_1_2.is_b_char(c)
    # placeholder: handle a line break under YAML 1.2 rules
end
Paalon commented 1 week ago

@GunnarFarneback @kescobo If we use version traits throughout the scanner functions, the final implementation will look like this:

function scan_something(version::YAMLVersion, stream::TokenStream)
    subprocess1(version, stream)
    subprocess2(version, stream)
end

Does this perform well compared to the naive approach (the module-based implementation above)?

GunnarFarneback commented 1 week ago

Yes, there's basically no overhead related to the traits. Not that I think the overhead would have been significant with any other approach either.
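
For illustration, a minimal sketch of what version traits could look like for the `b-char` check. Only the name `YAMLVersion` appears in this thread; the concrete types `YAMLV1_1` and `YAMLV1_2` and the helper `count_breaks` are hypothetical names chosen for this example, not part of YAML.jl. Because the trait types are singletons, Julia can specialize each method on the concrete version type when it is known at the call site, which is consistent with the low overhead described above.

```julia
# Hypothetical trait types; only `YAMLVersion` is named in the thread.
abstract type YAMLVersion end
struct YAMLV1_1 <: YAMLVersion end
struct YAMLV1_2 <: YAMLVersion end

# YAML 1.1 [27] b-char: LF, CR, NEL, LS, PS.
is_b_char(::YAMLV1_1, c::Char) =
    c == '\n' || c == '\r' || c == '\u85' || c == '\u2028' || c == '\u2029'

# YAML 1.2 [26] b-char: LF and CR only.
is_b_char(::YAMLV1_2, c::Char) = c == '\n' || c == '\r'

# Hypothetical helper: the version is passed through unchanged,
# in the same way `scan_something` forwards it to its subprocesses.
count_breaks(version::YAMLVersion, s::AbstractString) =
    count(c -> is_b_char(version, c), s)

# LF, NEL and LS separated by ordinary letters.
text = "a\nb\u0085c\u2028d"

breaks_1_1 = count_breaks(YAMLV1_1(), text)  # LF, NEL and LS all count
breaks_1_2 = count_breaks(YAMLV1_2(), text)  # only LF counts
```

With this layout, the same scanner function body serves both versions, and the version-specific behavior lives entirely in small methods such as `is_b_char`.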

Paalon commented 1 week ago

My list of probably uncontroversial PRs waiting to be merged (it does not include new features related to schemata):