$AIVDM, etc - Githubissues

kimgr commented 2 months ago

Hey, thanks for a nice and clean NMEA parser library!

We run into occasional malformed messages with encapsulated sentences prefixed with non-encapsulated start-tokens. Here's an excerpt from real-world data:

$AIVDM,1,1,,A,13Ho=R7000Of7DbKI<etrRbT1000,0*6B
$AIVDM,1,1,,A,13Ho=R7000Of7DbKI<etrRbT1>@<,0*19
$AIVDM,1,1,,A,13Ho=R7000Of7DbKI<etrRbT1>`<,0*39
$AIVDM,1,1,,A,B3HvLWP00?sQHD6nC?wQ3wUUn006,0*2D
$AIVDM,1,1,,A,B3HwU?00;OsPQ4Vn;=hD;wQ5n006,0*52
$AIVDM,1,1,,A,B3I2kGP0;OsQhAVmnBC4Wwi5n006,0*5B
$AIVDM,1,1,,A,B3I7Dt004gsP9KVn6TucwweQl000,0*3F
$AIVDM,1,1,,A,B3I7Dt009gsP;1Vn5>;M3wc1l000,0*61
$AIVDM,1,1,,A,B3I7Dt00;?sP=K6n7Fl;WwWQl000,0*4C

These messages are formally wrong, they should have ! as the leading character.

I was considering adding a flag to SentenceParser to allow this anomaly. Thoughts? We might also look into preprocessing messages to replace $ by ! where applicable.

Thanks!

icholy commented 2 months ago

Sorry, I don't think we're open to change like that. This type of pre-processing is something that can be trivially done outside of the library.

kimgr commented 2 months ago

Fair enough. In the presence of v4 tags, however, the preprocessing is a little bit less trivial. It would be nice not to have to parse before the parser.

icholy commented 2 months ago

@kimgr would this make your life easier? https://github.com/adrianmo/go-nmea/pull/112

kimgr commented 2 months ago

@icholy Thanks, it should! Let me play around with that a little bit and see how it works out in context. I'll get back to you.

kimgr commented 1 month ago

I think ParseTagBlock is a great idea, I believe we have a few use cases where that's all we need from a line.

I do think the API for two-step parsing becomes a little lopsided with that in the mix:

func TestPatchSentenceStart(t *testing.T) {
    raw := "\\s:somewhere,c:1720289719*4D\\$AIVDM,1,1,,A,13Ho=R7000Of7DbKI<etrRbT1000,0*6B"
    tags, rest, ok, err := ParseTagBlock(raw)
    assert.NoError(t, err)
    assert.Equal(t, true, ok)
    assert.Equal(t, "somewhere", tags.Source)
    assert.Equal(t, int64(1720289719), tags.Time)
    assert.Equal(t, "$AIVDM,1,1,,A,13Ho=R7000Of7DbKI<etrRbT1000,0*6B", rest)

    // Pretty brutal, but ¯\_(ツ)_/¯
    rest = "!" + rest[1:]

    // Parse the tag-less NMEA sentence
    m, err := Parse(rest)
    assert.NoError(t, err)

    vdm, ok := m.(VDMVDO)
    assert.Equal(t, true, ok)
    assert.Equal(t, "A", vdm.Channel)
    // Hmm, TagBlock is not set (rest has no tags)
    assert.Equal(t, TagBlock{}, vdm.TagBlock)

    // ???
    vdm.BaseSentence.TagBlock = tags
}

It's a bit inconvenient to get the tags back into the parsed message TagBlock.

kimgr commented 1 month ago

What I had in mind was essentially:

diff --git a/sentence.go b/sentence.go
index 2619275..fc023d7 100644
--- a/sentence.go
+++ b/sentence.go
@@ -103,6 +103,8 @@ type SentenceParser struct {
        // \g:2-3-1234*hh\!ABVDM,1,1,1,B,.....,0*hh
        // \g:3-3-1234*hh\$ABVSI,r3669961,1,013536.96326433,1386,-98,,*hh
        OnTagBlock func(tagBlock TagBlock) error
+
+       AllowSloppySentenceStart bool
 }

 func (p *SentenceParser) parseBaseSentence(raw string) (BaseSentence, error) {
@@ -437,7 +439,7 @@ func (p *SentenceParser) Parse(raw string) (Sentence, error) {
                        return newXTE(s)
                }
        }
-       if s.Raw[0] == SentenceStartEncapsulated[0] {
+       if s.Raw[0] == SentenceStartEncapsulated[0] || (p.AllowSloppySentenceStart && s.Raw[0] == SentenceStart[0]) {
                switch s.Type {
                case TypeABM:
                        return newABM(s)

... but yeah, I can see how that opens a potential can of worms with workaround flags for all sorts of weird behavior.

edit: sorry, pasted the wrong diff

aldas commented 1 month ago

That AllowSloppySentenceStart looks quite messy. There will be probably in future another person that has stumbled upon some incorrect NMEA0183 spec implementation and we have another flag for another workaround. @icholy proposal guards the library from this problem and allows user to "compose" the parsing.

@icholy did you consider adding function like SentenceParser.ParsePrefix for ParseTagBlock? so that user can provide their own implementation for composed SentenceParser ... so it can implement their own parsing logic or in this case "fix" things before passing rest of the sentence.

If "new" function for sentences would be public you could add your own parser that do not care if $ or ! is the prefix as custom parser has precedence over default ones and custom parsers do not care about sentence types (regular/encapsulated).

This is another hack that relies on parser internals. I think I proposed this in some older PR variation. I think my idea was to you could compose sentence parser that only supports only handful of sentences (some provided by this library).

package nmea_test

import (
    "github.com/adrianmo/go-nmea"
    "github.com/stretchr/testify/assert"
    "testing"
)

func TestCustomVDMParser(t *testing.T) {
    parser := nmea.SentenceParser{
        CustomParsers: map[string]nmea.ParserFunc{
            "VDM": func(s nmea.BaseSentence) (nmea.Sentence, error) {
                return nmea.NewVDMVDO(s) // <-- `nmea.newVDMVDO` is currently private
            },
        },
    }

    result, err := parser.Parse("\\g:1-2-73874,n:157036,s:r003669945,c:1241544035*4A\\$AIVDM,1,1,,B,15N4cJ`005Jrek0H@9n`DW5608EP,0*13")

    assert.NoError(t, err)
    assert.Equal(
        t,
        nmea.VDMVDO{
            BaseSentence: nmea.BaseSentence{
                Talker:   "AI",
                Type:     "VDM",
                Fields:   []string{"1", "1", "", "B", "15N4cJ`005Jrek0H@9n`DW5608EP", "0"},
                Checksum: "13",
                Raw:      "$AIVDM,1,1,,B,15N4cJ`005Jrek0H@9n`DW5608EP,0*13",
                TagBlock: nmea.TagBlock{
                    Time:         1241544035,
                    RelativeTime: 0,
                    Destination:  "",
                    Grouping:     "1-2-73874",
                    LineCount:    157036,
                    Source:       "r003669945",
                    Text:         "",
                },
            },
            NumFragments:   1,
            FragmentNumber: 1,
            MessageID:      0,
            Channel:        "B",
            Payload:        []uint8{0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x1, 0x0, 0x1, 0x0, 0x1, 0x1, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x1, 0x0, 0x1, 0x0, 0x1, 0x1, 0x0, 0x1, 0x1, 0x0, 0x1, 0x0, 0x1, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x1, 0x0, 0x1, 0x1, 0x0, 0x1, 0x0, 0x1, 0x1, 0x1, 0x0, 0x1, 0x0, 0x1, 0x0, 0x1, 0x1, 0x0, 0x1, 0x1, 0x1, 0x0, 0x0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x1, 0x1, 0x1, 0x0, 0x1, 0x1, 0x0, 0x1, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x1, 0x0, 0x0, 0x1, 0x0, 0x0, 0x1, 0x1, 0x1, 0x0, 0x0, 0x0, 0x1, 0x0, 0x1, 0x0, 0x0, 0x0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x1, 0x0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0},
        },
        result,
    )
}

icholy commented 1 month ago

@kimgr I changed ParseTagBlock to return the sentence index. You can use it like this:

func TestPatchSentenceStart(t *testing.T) {
    raw := "\\s:somewhere,c:1720289719*4D\\$AIVDM,1,1,,A,13Ho=R7000Of7DbKI<etrRbT1000,0*6B"

    // fix sentence
    _, n, err := ParseTagBlock(raw)
    assert.NoError(t, err)
    if strings.HasPrefix(raw[n:], "$AIVDM") {
        raw = raw[:n] + "!" + raw[n+1:]
    }

    // parse
    m, err := Parse(raw)
    assert.NoError(t, err)
    vdm, ok := m.(VDMVDO)
    assert.Equal(t, true, ok)
    assert.Equal(t, "A", vdm.Channel)
    assert.Equal(t, "somewhere", vdm.TagBlock.Source)
    assert.Equal(t, int64(1720289719), vdm.TagBlock.Time)
}

icholy commented 1 month ago

@aldas I'd prefer to keep those parse function private if we can.

kimgr commented 1 month ago

@icholy I'd like to avoid parsing the tags twice here. It's been good to see some examples of how you guys would approach the problem. Let's mull on this for a bit, and I'll see if I can find inspiration for something principled. Thanks for being so forthcoming!

aldas commented 1 month ago

that double tag parsing note is valid.

When it seems to come down to "fixing" (base)sentences that are somewhat "OK" but need a little bit tweaking to get over the finish line. What about something akin to middleware function that is run before customparsers and default parsers are run and allows you to tweak BaseSentence before it is used. SentenceParser has already handler and "On*" functions to hook into/change the flow. http.Server has functions like that.

    parser := nmea.SentenceParser{
        BaseSentence: func(sentence nmea.BaseSentence) nmea.BaseSentence {
            if sentence.Raw[0] != '!' && sentence.Type == nmea.TypeVDM {
                sentence.Raw = "!" + sentence.Raw[1:]
            }
            return sentence
        },
    }
    result, err := parser.Parse("\\g:1-2-73874,n:157036,s:r003669945,c:1241544035*4A\\$AIVDM,1,1,,B,15N4cJ`005Jrek0H@9n`DW5608EP,0*13")

but this would not help with cases when incorrectly implemented NMEA0183 spec fails inside SentenceParser.parseBaseSentence logic. I can not imagine what it would be - field separators, sentence start/end symbols? Not everything can or should be fixable.

icholy commented 1 month ago

@kimgr @aldas can you elaborate on the problem with parsing the tagblock twice?

kimgr commented 1 month ago

@icholy It just rubs me the wrong way in general not to use information I've already spent CPU cycles obtaining :)

The frequency of $-prefixed messages is very low compared to total message volume, so it feels bad to make the general case slower for the benefit of the outliers.

kimgr commented 1 month ago

@aldas That parser hook makes a lot of sense to me.

I'm also not interested in fixing all imaginable malformed NMEA, these $-prefixed VDM messages just stood out to me in our metrics as something that was obviously close enough.

kimgr commented 1 month ago

So, any chance for a BaseSentence parser hook? I can prepare a small PR if it sounds like a good idea.

icholy commented 1 month ago

I think the API should look like this:

type SentenceParser struct {
    // ...

    // OnBaseSentence is a callback for accessing/modifying the base sentence
    // before further parsing is done.
    OnBaseSentence func(sentence *BaseSentence) error
}

icholy commented 1 month ago

I implemented the OnBaseSentencecallback in https://github.com/adrianmo/go-nmea/pull/114 but I'm not sure how useful it is.

kimgr commented 1 month ago

Thank you! #114 LGTM, the motivating testcase is great.

adrianmo / go-nmea

$AIVDM, etc #111