would it be possible to have a zero-length delimiter?

2bndy5 commented 2 years ago

Using the UART protocol entails a lot more performance overhead (per byte) than other data bus protocols. I was wondering if this lib could conceivably be used to parse each byte (signed or unsigned) from a stream (without using a delimiter) as an arg.

To be clear, I'm thinking in terms of raw bytes, thus no null byte terminator for the delimiter.

dstroy0 commented 2 years ago

If there were no delimiters it would be treated as a single arg, we could make it so that something like that passes through and ends up accessible by a class pointer? It would be your_command whatever_the_argument_is eol so only 3 extra char on top of the command+argument.

2bndy5 commented 2 years ago

If I understand this correctly, then it could be done if I roll the callback parser properly.

> so only 3 extra char on top of the command+argument.

Are you trying to say

<CMD><ByteArrayArg>\r\n\0

The line endings are typically configurable to suit the OS behavior. Most assume Unix line endings (\n aka "new line"). This is why the Serial Monitor in the Arduino IDE provides the drop-down options. The \r (aka "carriage return") is an artifact from the days before GUI-based OSs (dominated by IBM at the time). Unfortunately, Windows still defaults to using \r\n for line endings while Linux (& macOS I think) use \n.

Technically, my scenario could be using the binary representation of \r (aka 0xD) and/or \n (aka 0xA) within the bytearray arg.
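To illustrate the concern above, here is a minimal sketch (not part of InputHandler; `bytesBeforeNewline` is a hypothetical helper) of how a terminator scan can cut a binary payload short when one of its bytes happens to equal 0x0A:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper, for illustration only.
// Returns how many bytes precede the first '\n' (0x0A) in data,
// or len if no such byte is present.
std::size_t bytesBeforeNewline(const std::uint8_t* data, std::size_t len)
{
    const void* hit = std::memchr(data, '\n', len);
    if (hit == nullptr) {
        return len;
    }
    return static_cast<std::size_t>(static_cast<const std::uint8_t*>(hit) - data);
}
```

With a frame such as `{'P', 0x0A, 0x00, 0x00, 0x00, '\n'}` (a one-byte command followed by the little-endian bytes of the 32-bit value 10, then the terminator), the scan stops after the first byte instead of after the fifth, so the payload is truncated.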

dstroy0 commented 2 years ago

Yes, but I think it would still need the c-string delimiters: <CMD><\"ByteArrayArg\">\r\n\0. Really, you can set your term to whatever; I set it to \r\n by default because that's what lots of Arduino examples use. Another thing we could do is tell it to read a certain number of bytes and then try to get tokens regardless of whether it received a terminating character. This behavior would only be feasible for GetCommandFromStream, since ReadCommandFromBuffer gets the entire buffer from whatever interface, not bytewise. We can add a new parameter, before the buffer size, with a default of 0 to read until term or n to read n bytes, and then we can enforce that the rx buffer be at least that size.
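A rough sketch of the "read until term, or read n bytes" idea, assuming a stand-in byte source rather than the library's real stream handling. `ByteSource` and `readFrame` are hypothetical names and do not reflect the actual GetCommandFromStream signature:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a byte stream such as a serial port.
struct ByteSource {
    const std::uint8_t* data;
    std::size_t len;
    std::size_t pos;
    bool available() const { return pos < len; }
    std::uint8_t read() { return data[pos++]; }
};

// n == 0: read until `term` is seen (the terminator is consumed, not stored).
// n  > 0: read exactly n bytes, ignoring any terminator values in the data.
// Never stores more than bufSize bytes. Returns the number of bytes stored.
std::size_t readFrame(ByteSource& src, std::uint8_t* buf, std::size_t bufSize,
                      std::size_t n = 0, std::uint8_t term = '\n')
{
    const std::size_t limit = (n > 0 && n < bufSize) ? n : bufSize;
    std::size_t count = 0;
    while (count < limit && src.available()) {
        const std::uint8_t b = src.read();
        if (n == 0 && b == term) {
            break;
        }
        buf[count++] = b;
    }
    return count;
}
```

Capping the read at the buffer size is one way to express the "enforce the rx buffer be that size at a minimum" constraint; a real implementation might reject the configuration up front instead.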

2bndy5 commented 2 years ago

Would it be possible to do away with the quotes by adding in a new UITYPE? Each quote is a byte and each EoL term is 1-2 bytes.

I like the read n bytes idea, but I think the user would have to flush the RX buffer when desirable.

2bndy5 commented 2 years ago

I seem to have proposed a bit of a quagmire here. I'll take some time to familiarize myself with the src better (might take a week or so - I've got other projects in motion).

You got me hooked... I'm unreasonably excited to parse GPS sensors' NMEA sentences with this lib! 🤓

dstroy0 commented 2 years ago

We could make it work for sure; all that you need to know beforehand is the command length. Maybe with an overloaded GetCommandFromStream constructor we can add a switch to read the command's strlen bytes off the stream into the rx_buffer, append a null, then append the rest of the message followed by a null, and then use the existing methods on rx_buffer.

2bndy5 commented 2 years ago

Took a close look at the src last night. This lib is heavily reliant on the input as a string. So I don't see a way to not use a null byte. ~The EoL term(s) are hard coded const, so users would have to go into lib source to change that.~

I have an objection to 100+ line function definitions, so it seems some refactoring is very possible. I had a vague idea to convert the UITYPE enum into a set of inherited classes that would house the methods for extracting the number of bytes (according to datatype) from the input. I may not have worded that idea right.

I'm used to thinking in terms of raw bytes where a string is just a bytearray with extra implementation details. There are more rudimentary alternatives to c-str functions (strcmp vs memcmp, strcat vs memcpy or memmove), so I think it's possible to factor out the dependence on c-strings (except where the user explicitly expects them). Although, I do understand the initial approach to use c-strings because they're so widely used in C.
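A small sketch of the strcmp vs memcmp difference mentioned above, using only the standard C library; the wrapper names are made up for illustration:

```cpp
#include <cstddef>
#include <cstring>

// strcmp stops at the first null byte, so anything after it is ignored.
bool sameAsCString(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}

// memcmp compares exactly n bytes, including any embedded null bytes.
bool sameAsBytes(const void* a, const void* b, std::size_t n)
{
    return std::memcmp(a, b, n) == 0;
}
```

For two 4-byte buffers `{'A', '\0', 'B', 'C'}` and `{'A', '\0', 'X', 'Y'}`, the c-string comparison reports a match because both "strings" end after `'A'`, while the byte-wise comparison sees the difference in the third byte.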

I'll have another look at the src soon...

dstroy0 commented 2 years ago

I've been thinking about this more. I think the most understandable approach would be to compare the first n bytes of the input where n is the zero-delim command length using . If there's a zero-delim command match then pop it into token_buffer separating the command and data with null, then we can use existing methods to process it or pass it through. The major downside of this method is that it reintroduces a lot of scanning. Thoughts?

2bndy5 commented 2 years ago

I'm not sure I completely understand everything you said. To identify the cmd, it uses memcmp(), right?

> compare the first n bytes of the input where n is the zero-delim command length using .

I think this sentence was meant to end with a word or code snippet, but it isn't rendered.

> separating the command and data with null

Are you suggesting that the input should have a null byte delimiter? This would require 1 byte which is not what the original question of this thread asked.

dstroy0 commented 2 years ago

Right, we would memcmp the first command-length bytes of the input against each zero-delimiter command and, on a match, copy the command and data into the token_buffer. So the original input can have zero delimiters, because we know exactly where the data starts; we chop it into two pieces ourselves and copy those into the token buffer.
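The approach described here could look roughly like the following. This is a sketch under assumptions, not the library's actual code: `splitZeroDelim` and its parameters are hypothetical, and the layout of the token buffer (command, null, data, null) is inferred from the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: if input begins with cmd (cmd_len bytes), write
// "<cmd>\0<data>\0" into token_buffer and return the number of bytes written.
// Returns 0 if the command does not match or token_buffer is too small.
std::size_t splitZeroDelim(const std::uint8_t* input, std::size_t input_len,
                           const char* cmd, std::size_t cmd_len,
                           std::uint8_t* token_buffer, std::size_t token_size)
{
    if (input_len < cmd_len || std::memcmp(input, cmd, cmd_len) != 0) {
        return 0; // not this command
    }
    const std::size_t data_len = input_len - cmd_len;
    const std::size_t needed = cmd_len + 1 + data_len + 1;
    if (needed > token_size) {
        return 0; // would overflow the token buffer
    }
    std::memcpy(token_buffer, input, cmd_len);           // command token
    token_buffer[cmd_len] = '\0';                        // separator we insert
    std::memcpy(token_buffer + cmd_len + 1, input + cmd_len, data_len); // data
    token_buffer[cmd_len + 1 + data_len] = '\0';         // trailing null
    return needed;
}
```

The sender transmits no delimiter at all; the two null bytes exist only in the receiver's token buffer. Because the data may itself contain null bytes, a consumer would need the data length (here `needed - cmd_len - 2`) rather than relying on `strlen`.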

2bndy5 commented 2 years ago

I don't see a problem with that. It would allow a bytearray to be the entire contents of a struct.

struct MyPayload {
    int myInt;
};
MyPayload payload;

// Later in the parser callback

memcpy(&payload, token, sizeof(payload));
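A slightly fuller sketch of the same idea with a length check added. `unpackPayload` is a hypothetical helper, and the fixed-width `std::int32_t` is used here only so the payload size does not vary between platforms. It assumes the sender and receiver share the same byte order and struct layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct MyPayload {
    std::int32_t myInt;
};

// Hypothetical helper: copies sizeof(MyPayload) raw bytes from token into out.
// Returns false (leaving out untouched) if the token is too short.
bool unpackPayload(const std::uint8_t* token, std::size_t token_len, MyPayload& out)
{
    if (token_len < sizeof(MyPayload)) {
        return false;
    }
    std::memcpy(&out, token, sizeof(MyPayload));
    return true;
}
```

Checking the length first avoids reading past the end of a short token, which the bare `memcpy` would do silently.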

dstroy0 commented 2 years ago

Yes, exactly!

dstroy0 commented 2 years ago

I started on this today, pushed out a first commit for this and for the beginnings of showing users how to use the public methods to decompose their own c-strings (NMEA sentences).

dstroy0 commented 2 years ago

The functionality is available. It adds one input scan of command_length bytes for each zero-delimiter command. The private method UserInput::_splitZDC is what does the work. I haven't tested it yet, but it is an integral part of parsing NMEA sentences, so it will be working once that example is filled in enough to test, if it doesn't work already.

dstroy0 commented 2 years ago

This is confirmed working in the NMEAparse example.

dstroy0 / InputHandler

would it be possible to have a zero-length delimiter? #7