Open sliekens opened 9 years ago
The suggested fix has its own issues...
- Adding `Read` methods to the `ITextScanner` interface does not make it compatible with APIs that operate on a `System.IO.Stream`.

Suggested alternative fix: add a property `ITextScanner.BaseStream`. The implementation of that property should return a wrapper around the underlying stream. The wrapper should implement the following behavior:
- `Write()`/`Seek()` should throw `NotSupportedException`.
- `Read()` & friends should notify the scanner of what was read (using callbacks or events).
- The `Dispose()` method should remove its own references to the underlying stream, but not close it.
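
The wrapper behavior listed above could be sketched roughly as follows. This is only an illustration: the class name `ScannerStreamWrapper` and the `Action<byte[], int, int>` callback are assumptions made for the example, not part of the existing code base, and a real implementation might prefer events over a callback.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: the name and the callback shape are assumptions.
public sealed class ScannerStreamWrapper : Stream
{
    private Stream inner; // underlying stream; this class never closes it
    private readonly Action<byte[], int, int> onRead; // tells the scanner what was read

    public ScannerStreamWrapper(Stream inner, Action<byte[], int, int> onRead)
    {
        this.inner = inner ?? throw new ArgumentNullException(nameof(inner));
        this.onRead = onRead ?? throw new ArgumentNullException(nameof(onRead));
    }

    public override bool CanRead => inner != null && inner.CanRead;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();

    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (inner == null) throw new ObjectDisposedException(nameof(ScannerStreamWrapper));
        int read = inner.Read(buffer, offset, count);
        if (read > 0) onRead(buffer, offset, read); // notify the scanner
        return read;
    }

    public override void Flush() { } // nothing to flush on a read-only wrapper

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        inner = null; // drop the reference, but leave the underlying stream open
        base.Dispose(disposing);
    }
}
```

Because `Stream.ReadByte()` falls back to `Read(byte[], int, int)` by default, the callback also fires for single-byte reads in this sketch.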
More ideas...
Instead of the `BaseStream` property, add a method `ITextScanner.ReadRaw(Action<Stream>)`. The implementation of this method is responsible for managing the lifetime of objects:
```csharp
void ReadRaw(Action<Stream> callback)
{
    // The scanner creates the wrapper and disposes it as soon as the callback returns.
    using (Stream wrapper = new WrapStream(inputstream, this))
    {
        callback(wrapper);
    }
}
```
This way, the scanner object has full control over all objects: the wrapper stream becomes unusable after the callback method returns.
TODO: figure out which pattern the `WrapStream` should use to notify the scanner object when it is read.
Since `ReadRaw(callback)` lets the caller read data out of context, which may be binary data or character data in a different encoding, is there any meaningful way to maintain the scanner's internal state?
I think that a `ReadRaw` action should always force the scanner back to its pre-initialized state.
Discarding internal state is the only way to prevent integer overflow when the number of raw bytes read is greater than `int.MaxValue` (2 GB).
I added two members to `ITextScanner`:
- `ITextScanner.Reset()`
- `ITextScanner.BaseStream`

The `Reset()` method sets the internal state to the pre-initialized state and releases any internal buffers that it may hold. This method should always be safe to call in between reads.
The `BaseStream` property returns a direct reference to the underlying stream. Callers should take care not to dispose of this stream, and should call `Reset()` before attempting to read from it.
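
The calling convention described above (call `Reset()`, then read from `BaseStream` without disposing it) might look something like the sketch below. The `IRawAccessScanner` interface, the `DemoScanner` class and the `ReadPayload` helper are hypothetical stand-ins invented for this example. The real `ITextScanner` has more members and a real implementation of `Reset()`.

```csharp
using System;
using System.IO;

// Hypothetical, pared-down stand-in for the two members discussed above.
public interface IRawAccessScanner
{
    void Reset();
    Stream BaseStream { get; }
}

// Toy implementation: holds a pretend look-ahead buffer that Reset() releases.
public sealed class DemoScanner : IRawAccessScanner
{
    private readonly Stream input;
    private char[] lookAhead = new char[16];

    public DemoScanner(Stream input)
    {
        this.input = input ?? throw new ArgumentNullException(nameof(input));
    }

    public Stream BaseStream => input;

    public bool HasBuffer => lookAhead != null;

    public void Reset()
    {
        lookAhead = null; // back to the pre-initialized state
    }
}

public static class RawReadExample
{
    // Reads exactly 'length' raw bytes, following the Reset-then-read rule.
    public static byte[] ReadPayload(IRawAccessScanner scanner, int length)
    {
        scanner.Reset();                    // discard scanner state first
        Stream stream = scanner.BaseStream; // borrowed: do not dispose
        byte[] payload = new byte[length];
        int total = 0;
        while (total < length)
        {
            int read = stream.Read(payload, total, length - total);
            if (read == 0) throw new EndOfStreamException();
            total += read;
        }
        return payload;
    }
}
```

The loop is there because `Stream.Read` may return fewer bytes than requested, so a single call is not guaranteed to fill the buffer.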
At this time, the `ITextScanner` interface does not provide APIs for binary data. The workarounds are:
- Use the `OctetLexer` to read any number of bytes as instances of `Element`

The first workaround is nasty for a number of reasons:
- The `OctetLexer` class creates an instance of the `Element` class for every byte.
- The `OctetLexer` class converts each byte to `System.String`; you have to convert it back to a byte using `ITextScanner.Encoding`.

The second workaround isn't much better:

Suggested fix: copy the `Read` and `ReadByte` methods + async variants from `System.IO.Stream` to the `ITextScanner` interface. Implement these methods in a way that updates the scanner's internal state.
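
As a rough illustration of the suggested fix, the sketch below mirrors the signatures of `Stream.Read`, `Stream.ReadByte` and `Stream.ReadAsync` on a scanner-like type. The `IBinaryReadable` interface, the `CountingScanner` class and its `Offset` counter are assumptions made for the example. The scanner's real internal state is not shown in this issue, so a simple byte count stands in for it.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical interface slice: signatures copied from System.IO.Stream.
public interface IBinaryReadable
{
    int Read(byte[] buffer, int offset, int count);
    int ReadByte();
    Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken);
}

// Toy scanner whose only "internal state" is the number of bytes consumed.
public sealed class CountingScanner : IBinaryReadable
{
    private readonly Stream input;

    public CountingScanner(Stream input)
    {
        this.input = input ?? throw new ArgumentNullException(nameof(input));
    }

    // Assumed state for the sketch; a long avoids overflow past int.MaxValue.
    public long Offset { get; private set; }

    public int Read(byte[] buffer, int offset, int count)
    {
        int read = input.Read(buffer, offset, count);
        Offset += read; // keep the scanner state in sync with the stream
        return read;
    }

    public int ReadByte()
    {
        int value = input.ReadByte();
        if (value >= 0) Offset++; // -1 signals end of stream, nothing consumed
        return value;
    }

    public async Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
    {
        int read = await input.ReadAsync(buffer, offset, count, cancellationToken).ConfigureAwait(false);
        Offset += read;
        return read;
    }
}
```

Every read path goes through the scanner in this sketch, so the state update cannot be bypassed. The cost, as noted in the comments on this issue, is that the type is still not a `Stream` and cannot be handed to APIs that expect one.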