lmmx / range-streams

Streaming range requests in Python
https://range-streams.readthedocs.io/en/latest/
MIT License

Negative seek value not supported #1

Closed lmmx closed 3 years ago

lmmx commented 3 years ago

from stream_response import ResponseStream
import requests

#url = "https://raw.githubusercontent.com/lmmx/range-streams/master/example_text_file.txt"
url = "https://github.com/lmmx/range-streams/raw/bb5e0cc2e6980ea9e716a569ab0322587d3aa785/example_text_file.txt"

r = requests.get(url, stream=True)
it = r.iter_content(chunk_size=4)
s = ResponseStream(it)
print(s.read(2))
print(s.read(2))
s.seek(-3)
print(s.read(2))

This won’t work as written, but a range request could be used to support it.

After tinkering, I’m not sure range requests would actually be better than the approach already in use here, but I still need negative seek values to work, and may comment about this on the original gist.

lmmx commented 3 years ago

After trying some more, a negative seek from the end doesn’t work either:

import io
from io import SEEK_END

import requests
from stream_response import ResponseStream

url = "https://raw.githubusercontent.com/lmmx/range-streams/master/example_text_file.txt"
url2 = "https://httpbin.org/stream/20" # gives `"Transfer-Encoding": "chunked"` in headers

# Reference behaviour: an in-memory BytesIO over a local copy of the same file
with open("example_text_file.txt", "rb") as f:
    b = f.read()

i = io.BytesIO(b)
i.seek(0, SEEK_END)
endpos = i.tell()
print(f"{endpos=}")

i.seek(-4, SEEK_END)
print(i.read(2))

# Case 1: read from the start, then seek 3 bytes back from the end
r = requests.get(url, stream=True)
it = r.iter_content(chunk_size=4)
s = ResponseStream(it)
print(s.read(2))
s.seek(-3, SEEK_END)
print(s.read(2))

# Case 2: seek to the end and report the position
r = requests.get(url, stream=True)
it = r.iter_content(chunk_size=4)
s = ResponseStream(it)
s.seek(0, SEEK_END)
print(s.tell())

# Case 3: seek 4 bytes back from the end (compare with the BytesIO result)
r = requests.get(url, stream=True)
it = r.iter_content(chunk_size=4)
s = ResponseStream(it)
s.seek(-4, SEEK_END)
print(s.read(2))

endpos=11
b'\x06\x07'
b'P\x00'
b''
11
b''

lmmx commented 3 years ago

After reviewing what ResponseStream can do, I’m encouraged about the idea of a range-request-based alternative, specifically for nonlinear seeking. I think the existing class can only move forward through the stream, so seeking to the end loads everything, which defeats the purpose.

lmmx commented 3 years ago

A correction to the seek method makes negative seek work: the statement in the else block should also run after the if branch, so it shouldn’t be in an else at all. Dedent it so that it becomes a regular statement that runs either way.
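A minimal sketch of that fix, assuming the seek method has the if/else shape described in this comment. The class below is a reconstruction for illustration only, not a copy of the gist: the attribute and helper names (`_buffer`, `_chunks`, `_load_all`, `_load_until`) are assumptions.

```python
from io import BytesIO, SEEK_END, SEEK_SET


class ResponseStream:
    """File-like wrapper over an iterator of byte chunks (illustrative reconstruction)."""

    def __init__(self, chunks):
        self._buffer = BytesIO()
        self._chunks = iter(chunks)

    def _load_all(self):
        # Append every remaining chunk to the end of the buffer.
        self._buffer.seek(0, SEEK_END)
        for chunk in self._chunks:
            self._buffer.write(chunk)

    def _load_until(self, goal):
        # Append chunks until the buffer holds at least `goal` bytes.
        size = self._buffer.seek(0, SEEK_END)
        while size < goal:
            try:
                size += self._buffer.write(next(self._chunks))
            except StopIteration:
                break

    def tell(self):
        return self._buffer.tell()

    def read(self, size=None):
        start = self._buffer.tell()
        if size is None:
            self._load_all()
        else:
            self._load_until(start + size)
        # Loading moved the cursor to the end, so restore it before reading.
        self._buffer.seek(start)
        return self._buffer.read(size)

    def seek(self, position, whence=SEEK_SET):
        if whence == SEEK_END:
            # The end is only known once everything has been loaded.
            self._load_all()
        # Assumed to have been under `else:` before; dedented so it always runs.
        return self._buffer.seek(position, whence)
```

With the dedent in place, `seek(-3, SEEK_END)` first buffers the whole body and then positions the cursor three bytes before the end, so the following `read(2)` returns data instead of an empty bytes object.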

I don’t think this will be useful, and it won’t be ‘streaming’ at all unless the server sends "Transfer-Encoding": "chunked" in its response headers, whereas range requests seem better supported and equally powerful.

lmmx commented 3 years ago

Left a comment on the gist with the fix: https://gist.github.com/obskyr/b9d4b4223e7eaf4eedcd9defabb34f13#gistcomment-3799918

lmmx commented 3 years ago

The above sample code now works as expected:

endpos=11
b'\x06\x07'
b'P\x00'
b'\x07\x08'
11
b'\x06\x07'

lmmx commented 3 years ago

To clarify: the code works (negative seeks and the reads that follow them behave correctly), but the request is not actually streamed in chunks, and all bytes are loaded into memory, not just the parts selected by seeking. Loading only the desired parts when the server does not send "Transfer-Encoding": "chunked" in its headers will require range requests.
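As a rough sketch of what the range-request alternative might look like: a seek from the end maps onto an HTTP suffix range (`Range: bytes=-N`), and a read at a known position maps onto an explicit byte span, so only the requested bytes are transferred. The helper names below (`range_header`, `suffix_range_header`, `fetch_range`) are hypothetical and are not part of range-streams or `stream_response`. It only saves anything if the server honours range requests and answers with `206 Partial Content`.

```python
import urllib.request


def range_header(start, length):
    """Range header value for `length` bytes starting at absolute offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends.
    return f"bytes={start}-{start + length - 1}"


def suffix_range_header(n):
    """Range header value for the last `n` bytes, like seek(-n, SEEK_END) then read()."""
    if n <= 0:
        raise ValueError("n must be > 0")
    return f"bytes=-{n}"


def fetch_range(url, header_value):
    """Fetch only the requested byte range (hypothetical helper, stdlib only)."""
    request = urllib.request.Request(url, headers={"Range": header_value})
    with urllib.request.urlopen(request) as response:
        # 206 means the range was honoured; 200 means the full body was sent.
        if response.status != 206:
            raise RuntimeError("server did not honour the range request")
        return response.read()
```

Under these assumptions, `fetch_range(url, suffix_range_header(3))[:2]` would play the role of `s.seek(-3, 2)` followed by `s.read(2)`, without loading the rest of the file.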