Closed pmarflee closed 3 years ago
Hello and happy new year!
I looked at your code and at the problem statement. Passport declarations have to be separated by a blank line (or, as you wrote, two consecutive new lines). Instead of adding a "\r\n\r\n" at the end of the KeyValuePairs' production, you can use the sepBy operator to separate the passport declarations with a literal of two new lines.
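To illustrate why a separator removes the need for a trailing blank line, here is a minimal sketch in plain F# that does not use Farkle at all. The helper name splitOnBlankLines and the sample text are hypothetical and only mimic the "items separated by two new lines" idea behind sepBy:

```fsharp
open System

// Hypothetical helper (not part of Farkle): the separator sits only BETWEEN
// items, so the last passport needs no trailing blank line.
let splitOnBlankLines (input: string) : string list =
    input.Split([| "\r\n\r\n" |], StringSplitOptions.None)
    |> List.ofArray

// Two passports separated by exactly one blank line, with no trailing new line.
let sample = "ecl:gry pid:860033327\r\nbyr:1937\r\n\r\niyr:2013 ecl:amb"
let blocks = splitOnBlankLines sample
```

Here blocks contains two entries, one per passport, even though the input ends right after the last field.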
With some additional changes, I took the liberty of refactoring the attached snippet like this:
namespace AdventOfCode.Core

module Day4 =

    open Farkle
    open Farkle.Builder
    open Farkle.Builder.Regex

    type Field =
        | BirthYear
        | IssueYear
        | ExpirationYear
        | Height
        | HairColor
        | EyeColor
        | PassportID
        | CountryID

    type KeyValuePair = { Key : Field; Value : string; }

    type Passport = Passport of KeyValuePair list

    type Parser () =

        static let field =
            // Writing the field type as a nonterminal makes the parser more readable and case-insensitive.
            [
                "byr", BirthYear
                "iyr", IssueYear
                "eyr", ExpirationYear
                "hgt", Height
                "hcl", HairColor
                "ecl", EyeColor
                "pid", PassportID
                "cid", CountryID
            ]
            |> List.map (fun (name, x) -> !& name =% x)
            |> (||=) "Field"

        static let value =
            // The regex was shortened.
            regexString "([a-z]|\d|#)+"
            |> terminal "Value" (T(fun _ data -> data.ToString()))

        static let keyValuePair = "KeyValuePair" ||= [
            !@ field .>> ":" .>>. value => (fun k v -> { Key = k; Value = v })
        ]

        static let passport = "Passport" ||= [
            !@ (many1 keyValuePair) => Passport
        ]

        // This designtime Farkle matches many different passports in the same
        // input file (I assume you were calling the parser many times).
        static let passports = sepBy (literal "\r\n\r\n") passport

        // Declaring the runtime Farkle in a static member will cause it to be rebuilt every
        // time the property is accessed, resulting in a significant waste of computation.
        // I also took the liberty to remove the runtime Farkle for the KeyValuePair
        // (unless I am mistaken it doesn't seem necessary).
        static let runtime = RuntimeFarkle.build passports

        static member internal parse input =
            match RuntimeFarkle.parseString runtime input with
            | Ok result -> result
            | Error err -> failwith (err.ToString())
I tested this modified parser with the sample input and it worked. I believe it will work for the purposes of the problem.
However, in the general case, it is brittle and doesn't always work. Try adding three new lines between two passports and the parser will view them as one passport. Try adding two new lines at the end of the text and the parser will fail with a syntax error.
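One possible way around both failure cases, offered only as a sketch and not as a Farkle feature, is to normalize the text before it reaches the parser. The function name normalizePassportInput is hypothetical, and the code assumes CRLF line endings, matching the "\r\n\r\n" literal used above:

```fsharp
open System.Text.RegularExpressions

// Hypothetical pre-processing step (an assumption, not part of Farkle):
// reshape the input into what the separator-based grammar expects.
let normalizePassportInput (input: string) : string =
    // Collapse any run of two or more CRLF line breaks into exactly two.
    let collapsed = Regex.Replace(input, @"(\r\n){2,}", "\r\n\r\n")
    // Drop leading and trailing line breaks so the text never starts or ends with a separator.
    collapsed.Trim('\r', '\n')

// Three new lines between the passports and two at the end of the text.
let messy = "byr:1937\r\n\r\n\r\niyr:2013\r\n\r\n"
let clean = normalizePassportInput messy
```

The cleaned string has exactly two new lines between the passports and none at the end, so it can be handed to the parser as is. Input with bare "\n" line endings would need the pattern adjusted.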
The correct way to parse new lines in Farkle is through the newline operator. I tried to rewrite the parser using it, but it failed with a Shift-Reduce conflict because Farkle has trouble distinguishing between the optional new lines between passport fields and the at least two new lines that separate passport declarations. But as I said before, it should work in your case.
To actually answer how to match EOF (even though it doesn't seem to be necessary in this case), it can be done using virtual terminals and by writing a custom tokenizer. This repository has a well-documented sample that uses virtual terminals and a custom tokenizer. A tokenizer that just matches EOF would be much simpler.
This feature is already implemented and is scheduled to be shipped with Farkle 6.0.0, which I hope to be released this month.
By the way @pmarflee, you are the first user to open an issue in Farkle's three and a half years of existence, making your code the first publicly documented use of Farkle. I am very glad and honored by this.
As I am going to release another major version, any suggestions, bug reports, feature requests, and general feedback of yours would be invaluable for me to understand how Farkle is used. If you have any spare time, you can try out Farkle's latest version at its bleeding edge NuGet feed at https://ci.appveyor.com/nuget/farkle.
In any case, let me know if everything went well with your parser. Thanks for using Farkle and good luck with the Advent of Code challenge. ✌🏻
Thanks for your help @teo-tsirpanis. I was able to get my unit tests to pass using the code you provided. You can find my solutions thus far at https://github.com/pmarflee/AdventOfCode2020FSharp. I expect I'll be making further use of Farkle to help with parsing on the remaining problems.
I'm trying to write a parser for the data format described in Day 4 of the Advent of Code 2020 problems. The following code works OK on the contents of a file which ends in 2 newline characters.
However I also want to recognise EOF as a separator so that I don't have to put 2 newline characters at the end of the file in order for Farkle to parse it. FParsec seems to have an inbuilt parser for EOF, but I can't figure out how to do it with Farkle. How can I write a parser that captures EOF?