Unify parsing functions

icholy commented 1 year ago

@aldas what are your thoughts on this?

icholy commented 1 year ago

It's a bit less convenient to setup:

p := nmea.SentenceParser{
    Parsers: nmea.DefaultParsers(),
}

maps.Copy(p.Parsers, map[string]nmea.ParserFunc{
    "YYY": func(...) { ... },
    "ZZZ": func(...) { ... },
})

But it gives more control to the user and it cleans up the code.

aldas commented 1 year ago

SentenceParser.Parsers map[string]ParserFunc definitely looks eloquent from API perspective but as you already mentioned it is inconvenient for most of the cases. Now this raises question what are the root use-cases when we want to set all the parsers.

hypothetical situation:

REQ1) my device sends multiple sentence, one of them is this non-default type of sencence - I need to be able to parse it.

This case was easier to implement with SentenceParser.CustomParsers. With this change it takes more work.

REQ2) I need to override default type parser behaviour with my own implementations because my device does something "off the spec"

This case was easier to implement with SentenceParser.CustomParsers as custom parsers were prioritized before default types switch. With this change it takes more work.

REQ3) I want to limit sentences that my Parser is able to parse (I want to exclude some of the default parsers).

With previous version this was not possible as when CustomParsers check failed default parsers were always checked. Current version supports this. You can select small number of sentences what are parseable. Inconvenient but possible.

This is perfectly valid use-case. I was dealing just today with vessel where we installed IMU that has NMEA0183 output. That device is outputting 4 different messages which 2 of them are incorrectly implemented (missing CRC). One of those incorrect ones I am interested and second one I want to ignore/skip. For a year I have used workaround that sentences that I that I care I mark in configuration (JSON) to be fixed if they are missing CRC. All of this configurable stuff works fine as long as you are parsing everything. But today I had one sentence that I did not mark to be CRCfixed and have situation there this sentence with missing CRC ends up in nmea.Parse method and errors. I log all error - which means that now I have to deal with log spam or implement some kind of ignoring/skip logic.

So it we have definite need to able to skip or ignore some of the sentences, or at least suppress their errors. Creating parser that does not include all of the sentence that I do not care would be one way to fulfill that requirement and then error.Is check if type is unsupported and move on.

but - is it most optimal way? maybe just doing prefix check over slice of ignorable sentence is easier? It is not eloquent as defining map of supported sentences. Writing "dumb" for range if equals continue next sentence could be less complex. Nothing needs to be parsed. This is use-case where you need to act on known data (sentences) to prefix check if probably OK as prefixes are fixed/known and we do not need to worry about potentially different TalkerIDs in prefix.

I think there is underlying question - why/when we want to have limited set of parsers enabled? And how to filter sentence that are parsed.

some other remarks.

I do not think that init() is necessary here

Maybe this would be ok:

defaultSentenceParser = SentenceParser{
    CustomParsers: func() map[string]ParserFunc {
        r := DefaultParsers()
        r[TypeMTK] = newMTK // for backwards compatibility support MTK. PMTK001 is correct an supported when using SentenceParser instance
        return r
    }(),
}

func DefaultParsers() map[string]ParserFunc { is problematic as there are multiple types of parses. Sentences that start with $ are called as "parametric". Sentences that start with ! are called encapsulated.

Currently DefaultParsers lumps them all together. Old Parse(raw string) had distinction between these types if s.Raw[0] == SentenceStart[0] { and if s.Raw[0] == SentenceStartEncapsulated[0] {

icholy commented 1 year ago

func DefaultParsers() map[string]ParserFunc { is problematic as there are multiple types of parses. Sentences that start with $ are called as "parametric". Sentences that start with ! are called encapsulated.

Currently DefaultParsers lumps them all together. Old Parse(raw string) had distinction between these types if s.Raw[0] == SentenceStart[0] { and if s.Raw[0] == SentenceStartEncapsulated[0] {

@aldas This is the part I'm concerned about. Can you think of any scenarios where this would matter? I was kind of surprised when all the tests passed.

aldas commented 1 year ago

tldr: functionally does not matter.

Depending how purist library tries to be - it may matter as this is what spec says - sentences that encode should/must start with '!'. Now looking at the world - there are off-spec implementations (missing talker IDs, missing CRCs etc), probably mostly coming from non-marine industries that do not need to certify their devices. Anyway, this distinction was one of the reason I did not start to refactor with these long type switches.

p.s. maybe map[string]ParserFunc could be made a type that has (fluent) methods to add additional parsers (maybe even filter including/excluding). This way the code for initialising sentenceParse.Parser would be little bit easier to write (fluently chained on single line Parsers = DefaultParsers().Filter("gga", "rtc","rsa").Add("zzz", yyy)

icholy commented 1 year ago

Depending how purist library tries to be - it may matter as this is what spec says - sentences that encode should/must start with '!'. Now looking at the world - there are off-spec implementations (missing talker IDs, missing CRCs etc), probably mostly coming from non-marine industries that do not need to certify their devices. Anyway, this distinction was one of the reason I did not start to refactor with these long type switches.

One approach would be to include the ! or $ prefix in the map keys. For example:

parsers := map[string]ParserFunc{
    "$RMC": newRCM,
    "!VDO": newVDO,
    // etc ...
}

p.s. maybe map[string]ParserFunc could be made a type that has (fluent) methods to add additional parsers (maybe even filter including/excluding). This way the code for initialising sentenceParse.Parser would be little bit easier to write (fluently chained on single line Parsers = DefaultParsers().Filter("gga", "rtc","rsa").Add("zzz", yyy)

I originally had a func WithDefaultParsers(m map[string]ParseFunc) map[string]ParserFunc helper. It would be used like this:

p := nmea.SentenceParser{
    Parsers: nmea.WithDefaultParsers(map[string]ParserFunc{
        "YYY": func(...) {...},
    }),
}

I removed it because:

It can be easily written outside of this module.
The same thing will be easy to accomplish using the new maps package being added to the stdlib.

People have different preferences on these types of things. I for one, really dislike fluent apis in Go. Users of nmea.SentenceParser are already "power-users", so I don't think they need the extra hand-holding. Providing the building blocks is enough.

aldas commented 1 year ago

I think some easy way to compose the parsers list you are going to use (filter DefaultParsers) would be useful or sticking to CustomParsers would make users life easier.

Sentences what you are not parsing (are not receiving from nmea0183 device) - you do not care if parser supports it or not. This change makes no difference for this case.
Sentences that you want to parse and are not supported by library - you add you own. This change is more inconvenient because now you need to merge you custom parsers with DefaultParsers output.
Sentences that you want to parse and are all included by library (no custom ones) - you do not need to do anything. This change makes no difference for this case.
Sentences that you do not need to parse but are read from device - now you can exclude them from DefaultParsers result before you add it to SentenceParser.Parsers. This change allows you to manipulate list of parser that Parse will use. But if this is inconvenient you might as well write your exclusion check before parser.Parse is called at all and do HasPrefix check over your inclusion/exclusion list and decide do parse or continue straight to next line read from device. Actually SentenceParser.ParsePrefix callback is probably first place where you could add your exclusion logic as it has BaseSentence as argument, that you can check and return specific error if needed to bail out of further parsing.
If you do not care if you happen to parse sentence that you do not need (discard them after parse) - this change does not give you any benefits.

I tried to argument is respect of these "dimensions" vs SentenceParser.Parsers vs SentenceParser.CustomParsers :

user added parsers (custom parsers)
library provided parsers (default parsers)
specific sentences you want to parse
specific sentences that you do not want to parse
I want to parse all / I do not care what I parse
sentences that are not being sent by device
sentence that are sent by device

aldas commented 1 year ago

maybe just have both .Parsers and .CustomParsers first check would be from CustomParsers map and if it does not have that parsen then from Parsers. In that case you need to do extra work on initialization when you want to customize default parsers.

It is not that "eloquent" but is easier to use.

icholy commented 1 year ago

@aldas I want to take a break from talking about the ergonomics of adding custom functions and discuss the parser lookup logic I just added. I think it solves the "correctness" problem.

Another approach would be to have some sort of placeholder for the talker id in the map key. For example:

// Parsers returns the default sentence parsers.
func Parsers() map[string]ParserFunc {
    return map[string]ParserFunc{
        "$__RMC":   newRMC,
        "$__AAM":   newAAM,
        "$__ACK":   newACK,
        "$__ACN":   newACN,
        "$__ALA":   newALA,
        "$__ALC":   newALC,
        "$__ALF":   newALF,
        "$__ALR":   newALR,
        "$__APB":   newAPB,
        "$__ARC":   newARC,
        "$__BEC":   newBEC,
        "$__BOD":   newBOD,
        "$__BWC":   newBWC,
    }
}

Then the lookup would be something like this:

if parse, ok := parsers[start+strings.Repeat("_", len(s.Talker))+s.Type]; ok {
    return parse(s)
}

edit: I've went ahead and pushed the placeholder code

aldas commented 1 year ago

hey, sorry for the late reply.

I am fine with Parsers map[string]ParserFunc

I am not fond of init function. why not just initialize with anonymous function that returns map[string]ParserFunc , calls Parsers inside and adds MTK

defaultSentenceParser = SentenceParser{
    Parsers: func() map[string]ParserFunc {
        p := Parsers()
        // for backwards compatibility support MTK. PMTK001 is correct an supported when using SentenceParser instance
        p["$PMTK001"] = newMTK
        return p
    }(),
}

Lets not implement that placeholder logic for Parsers . We already have many ways we derive from NMEA0183 spec. There is no need to make it more complex with placeholders. NB: I do not think these placeholders work with Query type Parser (it has special logic for talker).

icholy commented 1 year ago

@aldas let's just leave SentenceParser as-is for now. I think the placeholder idea has some merit, but I need to iterate on it some more.

aldas commented 1 year ago

@icholy maybe what you are looking for is ability customize Parse(raw string) (Sentence, error). In that way user can implement their own logic that can use talkerID+type etc.

anyway comparing NMEA0183 (talkerID) to NMEA2000 (source address) I think using TalkerID + Type to choose parser is not that important or extremely unlikely usecase (where 2 different TalkerID would have different Type Parser).

aldas commented 1 year ago

having 2 sentences, using same type but having different TalkerID and different sentence structure - means that you have 2 different devices, using custom sentence types, multiplexed with NMEA0183 multiplexer/buffer device into single NMEA0183 output that user is reading. This is really really unlikely.

aldas commented 1 year ago

also. user can provide SentenceParser.ParsePrefix that handles talkerID and type extraction differently for their usecase. ala returning type as talkerID+type parts. In that case check for parser would work similarly as your placeholder logic.

adrianmo / go-nmea

Unify parsing functions #103