Open Daniel-Worrall opened 4 years ago
What's your target for the body? Where do you write or process it? Maybe there's a way to better calculate percentage in there.
For the full body, writing to a file. For the partial body, a progress bar on STDOUT
I'm sorry, I don't quite follow. What's a full and a partial body? What happens to the data of a "partial body", why would you only show some progress for it?
Sorry. By "full", I mean fully downloaded. When the pipe has streamed all the file into memory. Then I write it to a file.
By partial, I mean mid-stream. As it is collecting parts of the body.
My case is I am pulling very large files, 100s of MB, maybe GBs, so there is a bottleneck on pulling those files and I'd like to give feedback on download status.
Mmh, I think this should be solved with something like a yielding overload of IO.copy
, something like File.open(target) {|file| IO.copy(response.body_io, file) {|pos| report_progress(pos, response.content_length) } }
. Could be a shard as well.
Didn't this idea surface somewhere else recently? Does somebody remember?
If you want to track downloaded progress you can do something this:
require "http/client"
url = "http://example.com"
HTTP::Client.get(url) do |response|
length = response.headers.get("content-length").first.to_i64
downloaded = 0_i64
body = String.build(length) do |str|
while gets = response.body_io.gets(chomp: false)
str << gets
downloaded += gets.bytesize
remaining = length - downloaded
percentage = ((length - remaining).fdiv length) * 100
end
end
HTTP::Client::Response.new(response.status, body, response.headers, nil, response.version, nil)
end
Instead of gets
you can also use read(Slice)
to read a fixed amount from a slice so you get consistent progress, but that's not needed.
Is the summary of this issue "How do I track downloaded progress?". Then the above is the answer.
Also, there's really no need to open two issues for this. And the forums are probably a better place for these questions... there's nothing buggy or wrong with the standard library.
I'm already tracking download progress in a roundabout way. The issue is more geared towards that I feel there should be a feature in Client to handle yielding to a block as the body is received, and my use case is simply how I came across it. Streaming is supplied and supported, but it doesn't feel like a complete solution.
E: The two issues are distinct. They are feature requests, and this is the platform I thought would be appropriate for them. Let me know if I'm mistaken.
I disagree. The solution above, or something similar to IO.copy, is just 3~5 lines. You may want to use different buffer sizes. You may want different things. This is not something for the standard library. You can already do what you want by composing existing APIs.
Didn't this idea surface somewhere else recently? Does somebody remember?
yeah jhass, this was discussed on gitter iirc.
Also, inspired by this answer for Go, you can do it for any IO like this (note that this is for reading files, but you can create Progress.new(response.body_io)
in your use case):
class Progress < IO
def initialize(@io : IO, @total : Int64, &@block : Int64, Int64 ->)
@current = 0_i64
end
def read(slice : Bytes)
read_bytesize = @io.read(slice)
@current += read_bytesize
@block.call(@current, @total)
read_bytesize
end
def write(slice : Bytes) : Nil
raise "Can't write"
end
end
filename = "some filename"
total = File.size(filename).to_i64
File.open(filename) do |file|
progress = Progress.new(file, total) do |current, total|
puts "Downloaded #{current} of #{total}"
end
progress.gets_to_end
end
Maybe IO::Progress
can be part of the standard library, I don't know. But as you can see, it's really easy to compose existing APIs to achieve what you want.
Client streaming by providing a block is difficult to work with since regular body consuming calls gets_to_end. I wished to get a percentage received during a request for progress, and this is what I ended up with.
There's also the issue that the body cannot be set if you have a Response instance without making a new one
HTTP::Client::Response has
#consume_body_io
, so I thought it may be possible to provide a similar method that takes a block. This below works in my use-case, but I'm unfamiliar with if this would be appropriate.