Add (nonblocking) read(buf, len) to Stream class

arduino / ArduinoCore-API

Hardware independent layer of the Arduino cores defining the official API

https://www.arduino.cc/reference/en/

GNU Lesser General Public License v2.1

208 stars 117 forks source link

Add (nonblocking) read(buf, len) to Stream class #54

Open matthijskooijman opened 10 years ago

matthijskooijman commented 10 years ago

Currently, the Stream class only has a (virtual) read() method to read one character at a time. It does have a readBytes() method to read multiple bytes at a time, but hat one blocks (with a timeout) and is not virtual, so it resorts to single-character reads.

It would be good to have a virtual multiple-byte reading method, which allows:

Callers to easily read multiple bytes into a buffer
Stream implementations to do efficient multiple-byte reads (e.g. prevent one SPI transfer per byte)

The Client class already has exactly this method, probably for the above reasons. Adding a virtual version of it to Stream means that Client implementations will support it out of the box.

We can also add a default implementation to the Stream class (using the existing read() method), that should keep all existing Stream implementations working. The default implementation would be something like:

Something like this (untested):

int read(uint8_t *buf, size_t size) {
    if (!available())
        return -1;
    int res = 0;
    while((int c = read()) >= 0 && size--)
        buf[res] = c;
    return res;
}

Note that this has a slightly peculiar return value: it returns -1 when there is no data, to match the (Ethernet)Client implementation (which returns 0 when the connection is closed). I guess it would make sense swapping these, but that would break backward compatibility for the Ethernet and (to a lesser degree, see arduino/Arduino#2251) the Wifi library.

How does all this sound?

PaulStoffregen commented 10 years ago

Why not just make readBytes() virtual? When the timeout is set to zero, it's the same as this proposed function.

millerabel commented 10 years ago

The proposed implementation above could be improved in a couple of small ways: (by replacing the while loop in the above code...)

// Don't read if there is no place to put the byte:
while (size-- && (int c = read()) >= 0) {

    // Increment res after each byte is stored:
    buf[res++] = c;

    // Don't block when we run out of bytes:
    if (!available())
        break;
}

With these changes,

If no bytes available before calling, returns -1.
Otherwise, returns the actual number of bytes that were available(), read() and stored in buf.
If called with size == 0, returns zero without calling read().

Improve readBytes() by making it virtual, to allow redefinition, and with the above as the zero timeout behavior. There should remain a blocking variant of readBytes() that does not accept a timeout parameter.

matthijskooijman commented 9 years ago

Why not just make readBytes() virtual? When the timeout is set to zero, it's the same as this proposed function.

I don't really like this approach, but hadn't thought about why until now. Two problems I can see:

If you want to read multiple bytes from within a library, you have to modify the timeout that the user may have set. I don't really like a "change, read, change back" approach in the first place, but I don't think there is even a way to get the current timeout.
Every subclass that implements the efficient multiple-read thing, will also have to re-implement the timeout functionality. Likely to cause small differences in implementation as well.

Hence I'd vote for adding a new virtual method instead.

If no bytes available before calling, returns -1.

I don't really like this - there's no point to it except backward compatibility, which I'd be willing to break for 1.6.0. For ethernetclient, I don't think there is any point in distinguishing between "no data" and "connection closed" (if an application cares, it can just check EthernetClient::connected()). Applications can just check for <= 0 if they want to support 1.6 and pre-1.6, only problem would be programs that check for == -1 or < 0 now.

 // Don't block when we run out of bytes:
if (!available())
    break;

I don't think this is needed, calling read() doesn't block either.

while (size-- && (int c = read()) >= 0) {

I don't really like decrementing size before we're sure we actually have something to read, since the value of size will be wrong after the loop. IT doesn't really matter since it is not used there, but for future-proofness, how is this?

size_t read(uint8_t *buf, size_t size) {
    size_t res = 0;
    while (res < size && (int c = read()) >= 0)
        buf[res++] = c;
    return res;
}

I think this has all the points you mentioned, except the -1 return value.

I think

millerabel commented 9 years ago

I think this proposal improves on the prior proposals. I agree with your reasoning, and it is tighter for sure. Thank you.

However, none of the proposals that rely on repeatedly calling read() are appropriate for bus-based systems. In SPI for example, it is inefficient to establish and release bus select, and to re-address each byte transfer individually.

For the NFC library and application I'm developing, I must maintain > 1 Mb/s transfer rate between the Arduino and target. And the target relies on chip select continuity to define the boundaries of continuous addressing. The hardware is fully capable of supporting this (I've run it up to 4 Mb/s). The standard io library ought to support a virtual bulk transfer method that can be overridden so it's not dependent on repeated read() calls.

In overridden virtual readBytes() / writeBytes() / transferBytes() methods I would:

Establish bus master, then transfer all buffered bytes, (bidirectionally in the case of transferBytes, optionally using the same buffer for outgoing and incoming byte flow), then release the bus.

Overwriting readBytes(), writeBytes(), or transferBytes() would permit this implementation. And in my library, I would redefine read() in terms of readBytes as a special case, reversing the implementation dependency for this one type of bus. Same for write() in terms of writeBytes().

There would be no externally visible API difference, but performance would be greatly improved. So higher level libraries that depend on read() and readBytes() continue to work.

A reach goal would be to support the bulk transferBytes() also asynchronously, if a standard Arduino library pattern emerges for that (direct call to setup transfer, then with IRQ continuation support). I'll leave that to a separate discussion.

matthijskooijman commented 9 years ago

I'm not sure I understood everything you said on first reading (no more time now), but the idea was to make the new read function virtual, so it can be overridden. The above implementation is just the default implementation for existing subclasses / subclasses that do not care. Would that sufficiently address your usecase?

millerabel commented 9 years ago

Yes, if the new "size_t read(uint8_t *buf, size_t size)" is virtual, and there is a similar virtual writeBytes function, I can support my use case. (I write readBytes as a shorthand for this method signature).

I would then override the single byte read(uint8_t c) in terms of the overridden readBytes function. Same for write and writeBytes.

I will have to further study how to correctly implement the timeout behavior in a bus transfer.

PaulStoffregen commented 9 years ago

If you want to read multiple bytes from within a library, you have to modify the timeout that the user may have set. I don't really like a "change, read, change back" approach in the first place, but I don't think there is even a way to get the current timeout.

Are there really compelling use cases where both a library and the main sketch would need to use readBytes(), with different timeout settings?

Nearly all the libraries that use a Stream pointer or reference are designed to manage all the communication on that port. Bridge, Xbee, Firmata, are just a few examples. Are there any existing and widely used libraries that realistically need readBytes() with a configure, read, reconfigure approach to allow the sketch to also read from the same Stream?

Every subclass that implements the efficient multiple-read thing, will also have to re-implement the timeout functionality. Likely to cause small differences in implementation as well.

This is equally true for read() or readBytes(). Today, Print class objects that implement block write have their own implementations, if they don't use the fallback one.

lasselukkari commented 3 years ago

I really would like to see this fixed. I have a class that takes Stream as a input. My own class is also derived from the Stream. Most of the the time the the wrapped Stream given as input is actually a Client but now I can not use the version of read that would be using the buffer and be a lot more performant. I can go around the problem by wrapping the input Stream to a Client class that has all the Client specific functions such as connect stubbed. This way the input could be a Stream or a Client but this feels stupid.

In my opinion it does not make any sense that the Client and Stream class would have any differences in the read and write functions. The Client should just extend the Stream with the IP related functions. I should not need to use the Client instead Stream to be able to read properly.

lasselukkari commented 3 years ago

@aentinger and @facchinm what do you think? Could this be fixed. The changes would be completely backwards compatible.