jovanbulck / jsh

A basic UNIX shell implementation in C
GNU General Public License v3.0

Piping built_ins may fill up a pipe buffer #52

Open jovanbulck opened 9 years ago

jovanbulck commented 9 years ago

This is a nasty bug that took me some time ;-)

The problem is that the OS limits the size of a pipe buffer, and jsh doesn't run a built_in producer in parallel with the consumer that follows it in the pipeline (since jsh is single-threaded).

context: jsh pipeline implementation

To clarify: let's take an easy non-built_in example, cat some_file | grep some_string. This is more or less what happens:

The point is that jsh doesn't block after one process in the pipeline is started (forked). Instead, it directly creates the next one. This means the consumer process (e.g. grep) can start consuming data from the pipe while the producer (e.g. cat) is still filling it. The OS limits the size of a pipe buffer and accepts no more data once the buffer is full; a blocking write then simply waits. The rationale here is that the consumer should consume some data, freeing up space, before the producer can continue producing.
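As an illustration of the fork-without-blocking idea, here is a minimal sketch of a two-stage pipeline in C. It is not taken from the jsh sources: run_pipeline is a hypothetical helper, and the only point it makes is that the parent forks the consumer right after the producer, without a waitpid in between, so both ends of the pipe are alive at the same time.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not jsh code): run `producer | consumer` and
 * return the consumer's exit status, or -1 on error. */
int run_pipeline(char *const producer[], char *const consumer[])
{
    assert(producer && producer[0] && consumer && consumer[0]);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t p1 = fork();
    if (p1 == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (p1 == 0) {                      /* producer child */
        dup2(fds[1], STDOUT_FILENO);    /* stdout -> write end */
        close(fds[0]);
        close(fds[1]);
        execvp(producer[0], producer);
        _exit(127);                     /* exec failed */
    }

    /* No waitpid() here: the consumer is forked immediately, so it can
     * drain the pipe while the producer is still writing. */
    pid_t p2 = fork();
    if (p2 == -1) {
        close(fds[0]);
        close(fds[1]);
        waitpid(p1, NULL, 0);
        return -1;
    }
    if (p2 == 0) {                      /* consumer child */
        dup2(fds[0], STDIN_FILENO);     /* stdin <- read end */
        close(fds[0]);
        close(fds[1]);
        execvp(consumer[0], consumer);
        _exit(127);                     /* exec failed */
    }

    /* The parent must close both ends, or the consumer never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0;
    waitpid(p1, NULL, 0);
    if (waitpid(p2, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with {"cat", "some_file", NULL} and {"grep", "some_string", NULL} would correspond to the example above. Because both children exist before the parent waits, the amount of data the producer emits is not bounded by the pipe buffer size.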

problem: built_ins in pipelines

Now the problem: jsh deals with built_ins in a pipeline differently than with normal "forkable" commands. For example, consider history | grep some_string.

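Of course this is the problem: a built_in will write all of its data to the pipe before the consumer process is started (since jsh executes in a single thread). When the history file is very large (e.g. more than 60,000 bytes of history; more than 5,000 lines), the pipe buffer fills up. At that point the built_in's write blocks, and because jsh itself has not yet started the consumer, nothing ever drains the pipe.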
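To make the limit concrete, the following sketch measures how many bytes a single process can put into a fresh pipe when nobody is reading from it. It is an illustration only, not jsh code: pipe_capacity is a hypothetical helper, and it switches the write end to non-blocking mode so that a full pipe shows up as EAGAIN instead of a hang. On Linux, pipe(7) documents a default capacity of 65,536 bytes since kernel 2.6.11, which is in line with the roughly 60,000 bytes mentioned above; other systems may use a different size.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not jsh code): return the number of bytes that
 * fit into a fresh pipe with no reader draining it, or -1 on error. */
long pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    /* Non-blocking write end: a full pipe yields EAGAIN, not a hang. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char chunk[512] = {0};   /* 512 <= PIPE_BUF on any POSIX system */
    long total = 0;
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            /* Non-blocking writes of <= PIPE_BUF bytes are all-or-nothing. */
            assert((size_t)n == sizeof chunk);
            total += n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;        /* interrupted, retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;           /* pipe is full */
        total = -1;          /* unexpected error */
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

With a blocking descriptor, as a built_in writing to the pipe would normally have, the write that hits this limit does not return at all, which is the situation described above.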

possible solutions

Some solution sketches:

Comments? Feedback? Ideas?

If anyone feels like it, claim! This is a nice bugfix for getting to know the jsh internals :-)