dstroy0 closed this 2 years ago
We can just use memchr and memcmp.
// pseudo
// look for interesting "char"
ptr = memchr(input, _delim_[0], input_len)
if (ptr != NULL) // found interesting "char"
    result = memcmp(_delim_, ptr, _delim_len_) // if result == 0, match
Find a delimiter in the input and null-terminate the token string. Get the difference between (ptr + delimiter length) and the next delimiter or the end of input, memcpy that token to the token string, and add a null.
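// strchr assumes data is null-terminated
char *ptr = strchr((char *)data, _delim_[0]);
size_t pos = 0;
while (ptr != NULL && pos < len)
{
    pos = ptr - (char *)data;
    // the bounds check keeps memcmp from reading past the end of data
    if (pos + _delim_len_ <= len && memcmp(_delim_, ptr, _delim_len_) == 0)
    {
        Serial.print(F("found delim "));
        Serial.println(pos);
        ptr += _delim_len_; // resume the search after the delimiter
    }
    else
    {
        ptr++; // only the first char matched; step past it so strchr cannot return it again
    }
    ptr = strchr(ptr, _delim_[0]);
}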
This works. We can break this out into its own function and pattern-match whatever we want now.
char *ptr = (char *)memchr((char *)data, _delim_[0], len);
size_t pos = 0;       // index of the current candidate in data
size_t prev_pos = 0;  // index just past the previous delimiter, i.e. where the current token starts
size_t token_len = 0;
while (ptr != NULL)
{
    pos = ptr - (char *)data;
    // delim test; the bounds check keeps memcmp inside data
    if (pos + _delim_len_ <= len && memcmp(_delim_, ptr, _delim_len_) == 0)
    {
        Serial.print(F("found delim "));
        token_len = pos - prev_pos;
        Serial.println(pos);
        Serial.print(F("token_len "));
        Serial.println(token_len);
        //memcpy(_token_buffer_ + token_buffer_index, (ptr - token_len), token_len);
        //_data_pointers_[_data_pointers_index_] = &_token_buffer_[token_buffer_index];
        //_data_pointers_index_++;
        //token_buffer_index += token_len + 1U; // null sep
        ptr += _delim_len_; // resume the search after the delimiter
        prev_pos = pos + _delim_len_;
    }
    else
    {
        ptr++; // only the first char matched; step past it so memchr cannot return it again
    }
    // search only the bytes that remain after ptr
    ptr = (char *)memchr(ptr, _delim_[0], len - (size_t)(ptr - (char *)data));
}
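The steps described earlier (find a delimiter, copy the token, add a null, store a pointer) could be pulled into a standalone function along these lines. This is only a rough sketch: `find_delim` and `split_tokens` are hypothetical names that do not come from the library, and the caller is assumed to supply the output buffer and pointer array.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns a pointer to the first full occurrence of
// delim in [from, end), or NULL if there is none.
static const char *find_delim(const char *from, const char *end,
                              const char *delim, size_t delim_len)
{
    while (from < end)
    {
        // cheap scan for the first byte of the delimiter
        const char *p = (const char *)memchr(from, delim[0], (size_t)(end - from));
        if (p == NULL)
            return NULL;
        // confirm the whole delimiter, without reading past end
        if ((size_t)(end - p) >= delim_len && memcmp(p, delim, delim_len) == 0)
            return p;
        from = p + 1; // false positive, keep scanning
    }
    return NULL;
}

// Hypothetical helper: copies each token of data into out, each followed by
// '\0', and stores a pointer to it in tokens. Returns the number of tokens.
size_t split_tokens(const char *data, size_t len,
                    const char *delim, size_t delim_len,
                    char *out, size_t out_size,
                    char **tokens, size_t max_tokens)
{
    const char *end = data + len;
    const char *start = data; // where the current token begins
    size_t out_idx = 0;       // next free byte in out
    size_t count = 0;
    if (delim_len == 0)
        return 0;

    while (count < max_tokens)
    {
        const char *d = find_delim(start, end, delim, delim_len);
        const char *tok_end = (d != NULL) ? d : end;
        size_t tok_len = (size_t)(tok_end - start);
        if (out_idx + tok_len + 1 > out_size)
            break; // no room left for the token plus its null
        memcpy(out + out_idx, start, tok_len);
        out[out_idx + tok_len] = '\0';
        tokens[count++] = out + out_idx;
        out_idx += tok_len + 1;
        if (d == NULL)
            break; // that was the last token
        start = d + delim_len;
    }
    return count;
}
```

Called with the input `a::b::cd` and the delimiter `::`, this would leave three null-terminated tokens in `out`, with one pointer to each in `tokens`.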
First I will scan for both c-string delimiters and regular delimiters. If a regular delimiter comes before the c-string, get tokens up to the c-string, then take the whole c-string as one token; otherwise do the reverse, and continue...
I pushed a commit with the beginnings of this.
I think that I need to just combine the bytewise scan and memcmp. I want to avoid repeat scans.
OK, I refactored getToken so that users have the option of using it to parse input into a token string, separated as specified (CSV or whatever), with pointers to each token.
We scan bytewise until we encounter an interesting character, then memcmp if the interesting character sequence is longer than 1. If there's a delimiter match, we put a separator in the token buffer.
It's working; it decomposes the input into the token_buffer with a char separator.
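A minimal sketch of that single-pass idea, assuming the separator simply replaces each delimiter in the output. `decompose` and its parameters are hypothetical names for illustration, not the actual getToken signature.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical sketch: copies input into token_buffer, replacing every
// occurrence of delim with the single character sep, and null-terminates
// the result. Returns the number of bytes written (not counting the '\0'),
// or 0 if token_buffer is too small.
size_t decompose(const char *input, size_t len,
                 const char *delim, size_t delim_len,
                 char sep, char *token_buffer, size_t buffer_size)
{
    size_t out = 0; // write index into token_buffer
    size_t i = 0;   // read index into input
    if (delim_len == 0 || buffer_size == 0)
        return 0;

    while (i < len)
    {
        if (out + 1 >= buffer_size)
            return 0; // need room for this byte and the final '\0'
        // one-byte test first; memcmp only when the delimiter is longer than 1
        if (input[i] == delim[0] && i + delim_len <= len &&
            (delim_len == 1 || memcmp(input + i, delim, delim_len) == 0))
        {
            token_buffer[out++] = sep; // delimiter becomes a single separator
            i += delim_len;
        }
        else
        {
            token_buffer[out++] = input[i++]; // ordinary byte, copy through
        }
    }
    token_buffer[out] = '\0';
    return out;
}
```

Each input byte is visited once, so there is no repeat scan; the only extra work is the memcmp on bytes that match the first character of the delimiter.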
For any delimiter longer than 1 character, looping strstr will be faster than a bytewise scan.
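A looping strstr might look roughly like the sketch below. `find_delims` is a hypothetical helper, not library code. Note that strstr needs both the input and the delimiter to be null-terminated, and whether it actually beats a bytewise scan depends on the C library's implementation, which has not been measured here.

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical helper: records the offset of each occurrence of delim in the
// null-terminated string input, up to max_positions entries.
// Returns the number of offsets recorded.
size_t find_delims(const char *input, const char *delim,
                   size_t *positions, size_t max_positions)
{
    size_t delim_len = strlen(delim);
    size_t count = 0;
    if (delim_len == 0)
        return 0; // strstr matches an empty needle everywhere

    const char *p = strstr(input, delim);
    while (p != NULL && count < max_positions)
    {
        positions[count++] = (size_t)(p - input);
        p = strstr(p + delim_len, delim); // resume after this delimiter
    }
    return count;
}
```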