JuliaIO / BufferedStreams.jl

Fast composable IO streams
MIT License

peek resets the anchor #63

Open feanor12 opened 2 years ago

feanor12 commented 2 years ago

I tried the anchor example found in the documentation and it does not output anything.

julia> t = join(rand([collect('a':'z')... collect('0':'9')...],100))
"oxfj939xjpifeaa0ngk97yu6tywg3syu066ynxfsfnnsh3fhxc0osv7zih8ag3k08cp59upjxpyb8ibdpyx620wbapppmiqng9c1"

julia> stream = BufferedInputStream(IOBuffer(Vector{UInt8}(t)),6)
BufferedInputStream{IOBuffer}(<6 B buffer, 0% filled>)

julia> while !eof(stream)
           b = peek(stream)
           if '1' <= Char(b) <= '9'
               if !isanchored(stream)
                   anchor!(stream)
               end;
           elseif isanchored(stream)
               println(takeanchored!(stream))
           end
           read(stream, UInt8)
       end

To make the comparison work at all, I also had to wrap b in Char(...) in the if condition, since peek returns a UInt8.

In addition, I noticed that peek resets the anchor.

julia> t = join(rand([collect('a':'z')... collect('0':'9')...],100))
"osiaanxkireq3mknd8gakx3g5uwnu2mkxdw6h6tyc6s5m5nhitgle6nb0iq7jyeksbj527wmp0dtlq0mj9kn3zbvlw49u92eeqhh"

julia> stream = BufferedInputStream(IOBuffer(Vector{UInt8}(t)),6)
BufferedInputStream{IOBuffer}(<6 B buffer, 0% filled>)

julia> peek(stream)
0x6f

julia> stream.buffer
6-element Vector{UInt8}:
 0x6f
 0x73
 0x69
 0x61
 0x61
 0x6e

julia> anchor!(stream)
1

julia> isanchored(stream)
true

julia> peek(stream)
0x6f

julia> isanchored(stream)
false

Version: [e1450e63] BufferedStreams v1.0.0

stevengj commented 1 year ago

I can't reproduce your test case. The final line gives

julia> isanchored(stream)
true

for me.

peek only resets the anchor if the buffer needs to be refilled. As I understand it, any function that reads from the stream might need to refill the buffer, and hence might reset the anchor.

Why is this a problem?

feanor12 commented 1 year ago

It breaks the use case where takeanchored! is used to extract tokens from a stream, which relies on the data from the anchor position onward always being kept in the buffer. Does a refilled buffer still contain the data after the anchor? Maybe I misunderstood the intended use of the anchor mechanism.

stevengj commented 1 year ago

Actually, I take it back. peek and other read operations should never remove the anchor or the anchored data, even when the buffer needs refilling. The stream will resize the buffer if needed in order to preserve the data after the anchor. (The anchor and corresponding data may be moved to the beginning of the new buffer, however.)

As I said, I can't reproduce the problem you reported above. I also don't have any problem in cases where the buffer needs to be refilled:

julia> io = BufferedInputStream(IOBuffer("foobarbaz"), 3)
BufferedInputStream{IOBuffer}(<3 B buffer, 0% filled>)

julia> peek(io)
0x66

julia> anchor!(io)
1

julia> b = Vector{UInt8}(undef, 3); readbytes!(io, b); String(b)
"foo"

julia> peek(io)
0x62

julia> isanchored(io)
true

julia> b = Vector{UInt8}(undef, 3); readbytes!(io, b); String(b)
"bar"

julia> String(copy(io.buffer)) # buffer was enlarged, still contains anchored data "foo"
"foobar"

Can you provide a reproducible example of your problem?

feanor12 commented 1 year ago

I tried with Julia 1.7 and 1.9.2 on Linux and could no longer reproduce this issue. The only difference might have been the operating system. The original issue occurred on a Windows machine.

pkg> activate --temp
pkg> add BufferedStreams@v1.0.0
julia> begin
using BufferedStreams
t = "1asd22asdsad333adsadsad4444asdasdasd55555asdasd999999999a";
stream = BufferedInputStream(IOBuffer(Vector{UInt8}(t)),6);
numbers = []
while !eof(stream)
    b = BufferedStreams.peek(stream)
    if '1' <= Char(b) <= '9'
        if !isanchored(stream)
            anchor!(stream)
        end
    elseif isanchored(stream)
        push!(numbers, String(takeanchored!(stream)))
    end
    read(stream, UInt8)
end
numbers
end

# 6-element Vector{Any}:
#  "1"
#  "22"
#  "333"
#  "4444"
#  "55555"
#  "999999999"
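
For anyone reusing this pattern, here is a minimal sketch of the same loop wrapped in a function. The name extract_digit_runs and the bufsize keyword are hypothetical and not part of BufferedStreams. It differs from the snippet above in two ways: it matches '0' through '9' via isdigit, and it flushes a run that is still anchored when the stream ends, which the loop above would silently drop if the input ended in a digit. It assumes that takeanchored! returns the bytes from the anchor up to (but not including) the current position, as the output above suggests.

```julia
using BufferedStreams

# Hypothetical helper (not part of BufferedStreams): collect every run of
# ASCII digits in `text`, using the anchor to mark where each run starts.
function extract_digit_runs(text::String; bufsize::Integer = 6)
    stream = BufferedInputStream(IOBuffer(text), bufsize)
    runs = String[]
    while !eof(stream)
        b = BufferedStreams.peek(stream)   # look at the next byte without consuming it
        if isdigit(Char(b))
            # Start of a run: remember the position unless already inside a run.
            isanchored(stream) || anchor!(stream)
        elseif isanchored(stream)
            # First non-digit after a run: take the anchored bytes as one token.
            push!(runs, String(takeanchored!(stream)))
        end
        read(stream, UInt8)                # consume the byte that was peeked
    end
    # Assumption: the anchor is still valid at EOF, so a trailing run can be flushed here.
    if isanchored(stream)
        push!(runs, String(takeanchored!(stream)))
    end
    return runs
end

extract_digit_runs("ab12cd345")
```

The small default buffer size is deliberate: it forces several refills while a run is anchored, which is exactly the situation this issue is about.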