Closed jrochkind closed 7 months ago
Hi Jonathan,
Thanks for submitting this issue.
However, looking at that code and seeing it's writing the stream to a tempfile on disk.... may defeat the purpose of what I'm trying to do anyway. I was trying to improve performance by avoiding the write to disk, and streaming the bytes from network directly to the process. (Which is not cat in my real use case of course, that was just a simple reproduction).
But if TTY::Command is going to write whatever in: stream I give it to disk in a temporary file anyway, I'm not going to get the performance increase I was hoping for. So this may still be a bug that should be fixed, but maybe won't actually help me out.
That's not how tty-command
works. I believe you have jumped to the wrong conclusion based on spurious understanding.
The tty-command
is a wrapper around Process.spawn
. It doesn't aim to reinvent the wheel and instead piggybacks on the spawn
method and its options. This is key. The spawn
method provides the following redirection options and allowed values:
redirection:
key:
FD : single file descriptor in child process
[FD, FD, ...] : multiple file descriptor in child process
value:
FD : redirect to the file descriptor in parent process
string : redirect to file with open(string, "r" or "w")
[string] : redirect to file with open(string, File::RDONLY)
[string, open_mode] : redirect to file with open(string, open_mode, 0644)
[string, open_mode, perm] : redirect to file with open(string, open_mode, perm)
[:child, FD] : redirect to the redirected file descriptor
:close : close the file descriptor in child process
Further, the file descriptor can be given as:
:in : the file descriptor 0 which is the standard input
:out : the file descriptor 1 which is the standard output
:err : the file descriptor 2 which is the standard error
integer : the file descriptor of specified the integer
io : the file descriptor specified as io.fileno
The tty-command
examines these redirect options. In almost all cases, the TTY::Command
passes these through to spawn
unchanged which is the first line in the convert_to_fd
method:
return object if fd?(object)
However, when the object provided is a string it intercepts it and first checks if it is a file, and if so lets it through:
if object.is_a?(::String) && ::File.exist?(object)
return object
end
Finally, in an exceptional circumstance when the redirect is a string that doesn't reference a file, it converts it to a Tempfile
so that it can conform to the spawn
interface. It then can be passed to spawn
, for example, as a value for :in
option. Why have I made this exception? I wanted to be able to pass a keyboard input as a string like "input\ninput\ninput\n"
for the benefits of tools like tty-prompt
.
Now back to your original issue of providing Down::ChunkedIO
object as a value for :in
option. As shown above, the spawn
method can accept FD(:in, :out, :err, integer, IO)
,string
and :close
as values for :in
option. However, your expectation is for :in
to accept what is essentially a random object called IO but not an IO for all intents and purposes and stream the content down to process. Trying this out with spawn
:
down_io = Down.open("https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg")
Process.spawn("cat > out.jpg", in: down_io)
You get an error:
`spawn': wrong exec redirect action (ArgumentError)
Therefore, I don't believe this to be a bug. If anything, the TTY::Command
should raise an error for any value that doesn't conform to the spawn
interface. The Down::ChunkedIO
object certainly doesn't conform to any of the expected values.
As for the encoding change in the Tempfile
, this will be useful in cases when the input string contains ASCII 8-bit characters. However, the file should probably be written as binary to avoid any encoding issues.
Not sure what the best solution for your case is. Ideally, TTY::Command
instance would expose the child process streams and make them available to down_io
to write in chunks. However, this is not a feature that this gem offers yet.
Thank you that was very helpful!
However, your expectation is for :in to accept what is essentially a random object called IO but not an IO for all intents and purposes and stream the content down to process
The basic problem is that ruby provides no actual specification of what an "IO" is! Down::ChunkedIO is trying it's best to provide something that is an IO for all intents and purposes -- and it does succesfully serve as an IO in many contexts -- but ruby just doesn't make it easy (or possibly even feasible at all) for anything other than the stdlib objects to be IO "for all intents and purposes". :(
ruby: 3.2.2 TTY::Command: 0.10.1
I am trying to stream a
Down::ChunkedIO
object from down to a TTY::Command.Where the content could be arbitrary binary content. But binary content streamed
in:
raises aEncoding::UndefinedConversionError
A simple reproduction:
But this raises
Encoding::UndefinedConversionError
I expect this to instead properly stream to process, in this case (for reproduction purposes only) resulting in writing output to
out.jpg
.I think I can fix the problem by changing this line:
https://github.com/piotrmurach/tty-command/blob/6aeb499dc323aebca7450f95f3fc8360a0d1ee83/lib/tty/command/child_process.rb#L198
To:
If I make that change, the reproduction script no longer raises, and has the outcome I expect -- writing the bytes to local
out.jpg
.I'm not sure if that change might break anything else, or if this is simply a bug that can be fixed?
But wait, streaming
in:
writes a a tempfile?However, looking at that code and seeing it's writing the stream to a tempfile on disk.... may defeat the purpose of what I'm trying to do anyway. I was trying to improve performance by avoiding the write to disk, and streaming the bytes from network directly to the process. (Which is not
cat
in my real use case of course, that was just a simple reproduction).But if
TTY::Command
is going to write whateverin:
stream I give it to disk in a temporary file anyway, I'm not going to get the performance increase I was hoping for. So this may still be a bug that should be fixed, but maybe won't actually help me out.Is is not feasible to stream the bytes directly to the process without creating the intermediate temp file? In my real use case, these are very large files from the network, and I was hoping to stream them directly to a spawned process stdin without first buffering them all on disk and taking the performance hit of the write-to-disk then read-from-disk.