paulscherrerinstitute / StreamDevice

EPICS Driver for message based I/O
GNU General Public License v3.0
28 stars 42 forks source link

regex pseudo converter failing misbehaving? #97

Closed hinxx closed 3 months ago

hinxx commented 4 months ago

I have a device sending an ASCII string with fields of different types (strings, chars, decimal integer, hex integer,..), delimited by ':'. I want to parse it out into several records. A sample input might look like this: 66:+44:255:0x122:1:R:190:A03. FWIW, values '66', '+44' and '190' may change from one message to another; and take up 1 - 5 chars each.

If I do not use regex pseudo converter to convert 1 char value to a string all looks fine with this protocol:

in "%*i:%(\$1)d:%*i:0x%(\$2)03x:%(\$3)d:%(\$4)c:%(\$5)i:%(\$6)s";

$4 is setting a value of a stringin record with a single char.

Then I discovered regex pseudo converter and hoped I could convert a single char into more representative string and set a value of mbbi record, instead. Here is the protocol that does that:

in "%#/:R:/:Idle:/%#/:M:/:Ramping:/%#/:F:/:Fault:/%*i:%(\$1)d:%*i:0x%(\$2)03x:%(\$3)d:%(\$4)[a-zA-Z0-9]:%(\$5)i:%(\$6)s";

It works .. most of the time (no errors reported), but quite often I do see an error:

2024/03/07 10:39:16.783738 A03 Status: Input "A03" does not match format "%s"

Status PV is being processed every 2 s and sometimes the error happens and sometimes not. The error is always about the last token that needs to be parsed.

I've attached two dumps of debug messages, first one for successfully parsed string and the second one for the failed. stream-parsing.txt

From what I can tell the buffer that regex touches seems to overflow (sometimes?). For example this would be reported in debug for failing case:

buffer after: A03:+11:255:0x122:1:F:12585:A033<00><00><00><00><00><00><00><00>

while this is seen in case of success:

buffer after: A03<00><00><00><00><00><00><00><00>
dirk-zimoch commented 4 months ago

If you simply want to have more meaningful strings in an mbbi record, I suggest to use the %{} enum converter. It reads (the ugly) strings, and converts them to integers. Then the mbbi can take that integer and convert it to "nice" strings. E.g. %{R|M|F} together with an mbbi with ZRST="Idle", ONST="Ramping" and TWST="Fault" should give you the conversion you need. When using regular expression substitution, keep in mind that it converts the whole rest of the input. That means, if you have :R: anywhere in your input, it will be replaced with :Idle:. Make sure that this has no unintended side effects. The error confuses me, because "A03" should match %s. I will have a closer look at the debug output...

dirk-zimoch commented 4 months ago

The buffer you see is the 40 char string that should store the value "A03" before passing it to the record with dbPutField. It looks as if during conversion new data is appended. After the regex substitutions, the input buffer was 0:+11:255:0x122:1:Fault:12585:A03, which looks correct. Then the initial 7 conversions have been successful: %*i ignores the 0, %d reads the +11, %*i ignores the 255, %03x reads the 0x122 (which looks like a bug: It should read only 3 characters: '0x1'), %d reads the 1, %[a-zA-Z0-9] reads the Fault and finally %d reads the 12585. Finally, %s should read the A03. However it seems that meanwhile new data has been added to the input buffer. Now %s reads A03:+11:255:0x122:1:F:12585:A033, i.e. the remaining A03 plus new data (If it was new input, it should start with 0:+11. Where did the initial 0 go? And where does the final 3 come from?). At the end, %s consumes 32 characters but was originally called when only 3 characters were left. That inconsistency causes the error.

I need to check what happens here. It does not seem to be new input, because I see no second readHandlercall in between. Maybe something wrong happens with the buffer?

My string buffers work like a circular buffer. New data is appended at the end and consumed data simply moves the start index without moving actual content. Only when the buffer would overflow when new data is appended, remaining valid content is copied from the end to the beginning. It seems that it is not properly terminated here but merges with old content. I need to check this. Maybe related to #96.

I need to dig through my code. Anyway, thanks for spotting and reporting this!

hinxx commented 4 months ago

Thanks for the input! I'll look at the enum converter. I did a quick test by doing regex pseudo conversion from R|M|F into X|Y|Z (aka single char conversion) and I do not see any errors.

When using regular expression substitution, keep in mind that it converts the whole rest of the input.

Right, I did notice that hence the :<char>: in the regex. I also know what can be in the whole string so regex part is solid, IMHO.

However it seems that meanwhile new data has been added to the input buffer.

My guess is that it is just some junk at the end of the buffer that was there before. The Status is processed at a very slow rate (2 s) and I would not expect new string to arrive until this one was processed. In fact I think I can provoke this behavior even by doing manual / passive scan. I'll need to look in more closely what follows the A03, but I can tell right now it is not always an ASCII string (ie. I've seen nonASCII chars (and not \0 following the A03 which makes me think would break %s.

It seems that it is not properly terminated here but merges with old content.

Sounds about right. Been digging a little bit myself..

dirk-zimoch commented 4 months ago

Maybe StreamBuffer::replace has a bug if the inserted string is longer than the removed one. I am checking this right now...

dirk-zimoch commented 4 months ago

Can't find anything wrong. So this may take a while...

hinxx commented 4 months ago

As demonstrated in the PR #98 I think there are two issues at play, AFAICT. The PR is too dirty for the merge and I think you can come up with better/cleaner more complete fix.

dirk-zimoch commented 4 months ago

Thanks a lot! I will have a look.

dirk-zimoch commented 4 months ago

I think I found the actual bug.https://github.com/paulscherrerinstitute/StreamDevice/blob/211f689cdf8acaf45a53fe035a1b1d0fac8a40d7/src/StreamBuffer.cc#L277 This needs to be a < instead of <=. A <= may not leave the terminating 0x00 intact if the new content would just fit the buffer capacity, because the terminating 0x00 is not included in len/newlen. Can you test if that fixes your problem? (And the following // comment should be in the else block. I will correct that, too.) I will push an update once you confirm that is fixes the problem. Thank you for your debugging and analysis which lead me to this!

dirk-zimoch commented 3 months ago

Found another, related bug in https://github.com/paulscherrerinstitute/StreamDevice/blob/211f689cdf8acaf45a53fe035a1b1d0fac8a40d7/src/StreamBuffer.cc#L289 This needs to use len+offs instead of len because content is moved from offs to 0. This may be the actual bug that you have seen. If content grows (newlen>len) and is moved to buffer start, there is still old content to clear up to (old) offs+len if newlen is shorter than off+len.

hinxx commented 3 months ago

This needs to use len+offs instead of len because content is moved from offs to 0.

You mean like this?

if (newlen<len) memset(buffer+newlen, 0, len+offs-newlen);
dirk-zimoch commented 3 months ago

if (newlen<len+offs) memset(buffer+newlen, 0, len+offs-newlen);

dirk-zimoch commented 3 months ago

How the buffer looks like before and after:

|---junk---|##content_start##|/////////remove_this////////|##content_end##|0000|
|<- offs ->|<-- remstart --->|<--------- remlen --------->|               |    |
|          |<--------------------- len ---------------------------------->|    |
0         offs           offs+remstart                offs+remend    offs+len cap

                                                                  |<clear>|
|##content_start##|++++++++inserted_text++++++++++|##content_end##|0000000|0000|
|<--------------------- newlen ---------------------------------->|       |
hinxx commented 3 months ago

Got it. Changed and running the test. Preliminary results look good, let me run this for a couple of hours.

hinxx commented 3 months ago

I've been with running the fix over the night and I can confirm that the above two changes solve the issue for me. Thank you!

dirk-zimoch commented 3 months ago

Fixed in commit 1496089 tagged 2.8.25.