As beautiful as a shell 🐚
This project is all about recreating your very own (mini)shell, taking bash (Bourne Again SHell) as reference. This was our first group project, and I was honored to do it with @mbueno-g :)
As we just said, we are asked to implement our own shell, but what is a shell to begin with? If we think of (for example) Linux as a nut or a seashell, the kernel/seed is the core of the nut and has to be surrounded by a cover or shell. Likewise, the shell we are implementing works as a command interpreter communicating with the OS kernel in a secure way, and allows us to perform a number of tasks from a command line, namely executing commands, creating or deleting files or directories, or reading and writing the content of files, among (many) other things.
The general idea for this shell is reading a string of commands in a prompt using readline. Before anything, it is highly recommended to take a deep dive into the bash manual, as it goes over every detail we had to have in mind when doing this project. Minishell involves heavy parsing of the string read by readline, thus it is crucial to divide the code of the project into different parts: the lexer, the expander, the parser, and lastly the executor.
This first part covers the part of our code in charge of expanding environment variables with $ followed by characters, as well as the expansion of ~ to the user's home directory. Here we also split the input string into small chunks or tokens to better handle pipes, redirections, and expansions.
After reading from stdin, we use a function we named cmdtrim, which separates the string taking spaces and quotes into account. For example:
string: echo "hello there" how are 'you 'doing? $USER |wc -l >outfile
output: {echo, "hello there", how, are, 'you 'doing?, $USER, |wc, -l, >outfile, NULL}
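A quote-aware split like this can be sketched in a few lines. The snippet below is only an illustration, not our actual cmdtrim: the names split_quoted and free_words are made up for this example, and it assumes that an unclosed quote simply runs to the end of the string.

```c
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array of strings. */
void free_words(char **words)
{
    for (size_t i = 0; words && words[i]; i++)
        free(words[i]);
    free(words);
}

/* Hypothetical helper (not the project's cmdtrim): splits s on spaces and
 * tabs, keeping anything between matching quotes inside the current word.
 * Quotes are left in place. Returns a NULL-terminated array, or NULL. */
char **split_quoted(const char *s)
{
    size_t n = 0;
    /* Words need a separator between them, so len / 2 + 1 bounds their
     * count; one extra slot holds the final NULL (calloc zero-fills). */
    char **out = calloc(strlen(s) / 2 + 2, sizeof(*out));

    if (!out)
        return NULL;
    while (*s)
    {
        while (*s == ' ' || *s == '\t')
            s++;
        if (!*s)
            break;
        const char *start = s;
        char quote = 0; /* 0 outside quotes, else the opening quote char */
        while (*s && (quote || (*s != ' ' && *s != '\t')))
        {
            if (!quote && (*s == '\'' || *s == '"'))
                quote = *s;
            else if (quote && *s == quote)
                quote = 0;
            s++;
        }
        size_t len = (size_t)(s - start);
        out[n] = malloc(len + 1);
        if (!out[n])
        {
            free_words(out);
            return NULL;
        }
        memcpy(out[n], start, len);
        out[n][len] = '\0';
        n++;
    }
    return out;
}
```

Calling split_quoted on the example string above yields the same nine words, with 'you 'doing? kept as a single token because the space sits between two single quotes.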
Then, we apply the expander functions on top of every substring of the original string, resulting in something similar to this:
output: {echo, "hello there", how, are, 'you 'doing?, pixel, |wc, -l, >outfile, NULL}
Note: if a variable is not found, the $var part of the string will be replaced by an empty string.
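To make the expansion step more concrete, here is a minimal sketch of variable expansion on a single token. It is not our real expander: expand_vars, env_lookup and append are hypothetical names, the sketch assumes (as bash does) that nothing is expanded inside single quotes, and it leaves ~ and quote removal out.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of the variable name[0..len) in envp, or "" if unset. */
static const char *env_lookup(char **envp, const char *name, size_t len)
{
    for (size_t i = 0; envp && envp[i]; i++)
        if (!strncmp(envp[i], name, len) && envp[i][len] == '=')
            return envp[i] + len + 1;
    return "";
}

/* Appends n bytes of src to *dst (current length *dlen), growing it. */
static int append(char **dst, size_t *dlen, const char *src, size_t n)
{
    char *tmp = realloc(*dst, *dlen + n + 1);

    if (!tmp)
        return -1;
    memcpy(tmp + *dlen, src, n);
    *dlen += n;
    tmp[*dlen] = '\0';
    *dst = tmp;
    return 0;
}

/* Hypothetical sketch: returns a malloc'ed copy of s with every $NAME
 * outside single quotes replaced by its value from envp. */
char *expand_vars(const char *s, char **envp)
{
    char *out = NULL;
    size_t len = 0;
    int in_sq = 0, in_dq = 0;

    if (append(&out, &len, "", 0) < 0)
        return NULL;
    while (*s)
    {
        int failed;

        if (*s == '\'' && !in_dq)
            in_sq = !in_sq;
        else if (*s == '"' && !in_sq)
            in_dq = !in_dq;
        if (*s == '$' && !in_sq
            && (isalnum((unsigned char)s[1]) || s[1] == '_'))
        {
            const char *name = ++s;
            while (isalnum((unsigned char)*s) || *s == '_')
                s++;
            const char *val = env_lookup(envp, name, (size_t)(s - name));
            failed = append(&out, &len, val, strlen(val)) < 0;
        }
        else
        {
            failed = append(&out, &len, s, 1) < 0;
            s++;
        }
        if (failed)
        {
            free(out);
            return NULL;
        }
    }
    return out;
}
```

With an environment containing USER=pixel, the token $USER becomes pixel, '$USER' is left untouched, and an unknown variable disappears, which matches the note above.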
Lastly, we have another split function called cmdsubsplit, which separates with <, |, or >, but only if those chars are outside of quotes:
output: {echo, "hello there", how, are, 'you 'doing?, pixel, |, wc, -l, >, outfile, NULL}
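The same idea can be sketched for a single token. Again, this is an illustration and not our cmdsubsplit: split_operators and copy_range are hypothetical names, every operator character becomes its own piece, and error handling is kept to a minimum.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'ed copy of the bytes in [start, end). */
static char *copy_range(const char *start, const char *end)
{
    size_t n = (size_t)(end - start);
    char *p = malloc(n + 1);

    if (p)
    {
        memcpy(p, start, n);
        p[n] = '\0';
    }
    return p;
}

/* Hypothetical sketch: splits one token around every <, | or > found
 * outside quotes. Returns a NULL-terminated array, or NULL on failure.
 * A failed copy_range simply ends the array early in this sketch. */
char **split_operators(const char *s)
{
    /* At most one piece per character, plus the final NULL. */
    char **out = calloc(strlen(s) + 1, sizeof(*out));
    const char *start = s;
    size_t n = 0;
    char quote = 0;

    if (!out)
        return NULL;
    for (; *s; s++)
    {
        if (!quote && (*s == '\'' || *s == '"'))
            quote = *s;
        else if (quote && *s == quote)
            quote = 0;
        else if (!quote && strchr("<|>", *s))
        {
            if (s > start)
                out[n++] = copy_range(start, s); /* text before operator */
            out[n++] = copy_range(s, s + 1);     /* the operator itself */
            start = s + 1;
        }
    }
    if (s > start)
        out[n++] = copy_range(start, s);
    return out;
}
```

Applied to |wc this gives {|, wc, NULL}, and applied to >outfile it gives {>, outfile, NULL}, while a quoted "a|b" stays in one piece.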
The parser is in charge of storing the tokenized string in a form that the executor can use later. Our data structure is managed as follows:
int	g_status;

typedef struct s_prompt
{
	t_list	*cmds;
	char	**envp;
	pid_t	pid;
}		t_prompt;

typedef struct s_mini
{
	char	**full_cmd;
	char	*full_path;
	int	infile;
	int	outfile;
}		t_mini;
Here is a short summary of what every variable is used for:
Parameter | Description |
---|---|
cmds | Linked list containing a t_mini node with all commands separated by pipes |
full_cmd | Equivalent of the typical argv, containing the command name and its parameters when needed |
full_path | If not a builtin, first available path for the executable denoted by argv[0] from the PATH variable |
infile | Which file descriptor to read from when running a command (defaults to stdin) |
outfile | Which file descriptor to write to when running a command (defaults to stdout) |
envp | Up-to-date array containing keys and values for the shell environment |
pid | Process ID of the minishell instance |
g_status | Exit status of the most-recently-executed command |
After running our lexer and expander, we have a two-dimensional array. Following the previous example, it looks like this:
{echo, "hello there", how, are, 'you 'doing?, pixel, |, wc, -l, >, outfile, NULL}
Now, our parser starts building the linked list of commands (t_list *cmds), which is filled in the following way:
- Words that are neither pipes nor redirections are added to the argument list (argv) we call full_cmd
Here's what the variables look like for the example we used before:
cmds:
    cmd 1:
        infile: 0 (default)
        outfile: 1 (redirected to pipe)
        full_path: NULL (because echo is a builtin)
        full_cmd: {echo, hello there, how, are, you doing?, pixel, NULL}
    cmd 2:
        infile: 0 (contains output of previous command)
        outfile: 3 (fd corresponding to the open file 'outfile')
        full_path: /bin/wc
        full_cmd: {wc, -l, NULL}
envp: (envp from main)
pid: process ID of current instance
g_status: 0 (if last command exits normally)
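The filling logic can be sketched roughly as follows. This is a simplified stand-in and not our parser: t_cmd replaces t_mini (fixed-size argv, no full_path, no linked list), parse_tokens is a hypothetical name, only < and > are handled, and wiring the pipes between commands is left out.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_ARGS 32

/* Simplified stand-in for t_mini, used only in this sketch. */
typedef struct s_cmd
{
    char *full_cmd[MAX_ARGS + 1]; /* NULL-terminated argument list */
    int   argc;
    int   infile;
    int   outfile;
} t_cmd;

/* Hypothetical sketch: fills cmds[] from a NULL-terminated token array.
 * Returns the number of commands, or -1 on a syntax or open() error. */
int parse_tokens(char **tok, t_cmd *cmds, int max_cmds)
{
    int n = 0;

    if (max_cmds < 1)
        return -1;
    cmds[0] = (t_cmd){.infile = STDIN_FILENO, .outfile = STDOUT_FILENO};
    for (int i = 0; tok[i]; i++)
    {
        t_cmd *c = &cmds[n];

        if (!strcmp(tok[i], "|"))
        {
            /* A pipe starts a new command with default descriptors. */
            if (++n >= max_cmds)
                return -1;
            cmds[n] = (t_cmd){.infile = STDIN_FILENO,
                              .outfile = STDOUT_FILENO};
        }
        else if (!strcmp(tok[i], ">") || !strcmp(tok[i], "<"))
        {
            int is_out = (tok[i][0] == '>');
            int fd;

            if (!tok[++i]) /* a redirection needs a file name */
                return -1;
            if (is_out)
                fd = open(tok[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);
            else
                fd = open(tok[i], O_RDONLY);
            if (fd == -1)
                return -1;
            if (is_out)
                c->outfile = fd;
            else
                c->infile = fd;
        }
        else if (c->argc < MAX_ARGS)
            c->full_cmd[c->argc++] = tok[i]; /* plain word: goes to argv */
    }
    return n + 1;
}
```

For the tokens {echo, hi, |, wc, -l, NULL} this produces two commands, each with its own argument list and the default descriptors 0 and 1.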
With all our data properly stored in our structs, the executor has all the necessary information to execute commands. For this part we run our builtins and other commands inside separate child processes that redirect stdin and stdout, in the same way we did in our previous pipex project. If we are given a full path (e.g. /bin/ls), we do not need to look for the command and can execute it directly with execve. If we are given only a command name (e.g. ls), we use the PATH environment variable to determine the full_path of the command. After all commands have started running, we retrieve the exit status of the most recently executed command with the help of waitpid. Once all commands have finished running, the allocated memory is freed and a new prompt appears to read the next command.
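The PATH lookup described here can be sketched like this. The function find_in_path is a hypothetical name, not the one in our code, and the sketch makes two simplifying assumptions: any name containing a / is returned as-is without searching, and empty PATH entries are skipped.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch: returns a malloc'ed path to the first executable
 * called cmd in the colon-separated list path, or NULL if none is found.
 * A cmd containing '/' is duplicated and returned without any search. */
char *find_in_path(const char *cmd, const char *path)
{
    if (strchr(cmd, '/'))
    {
        char *copy = malloc(strlen(cmd) + 1);

        if (copy)
            strcpy(copy, cmd);
        return copy;
    }
    while (path && *path)
    {
        size_t dlen = strcspn(path, ":"); /* length of this directory */

        if (dlen > 0)
        {
            size_t need = dlen + 1 + strlen(cmd) + 1;
            char *full = malloc(need);

            if (!full)
                return NULL;
            snprintf(full, need, "%.*s/%s", (int)dlen, path, cmd);
            if (access(full, X_OK) == 0)
                return full; /* first match wins */
            free(full);
        }
        path += dlen;
        if (*path == ':')
            path++;
    }
    return NULL;
}
```

The returned string plays the role of full_path and could be handed to execve together with full_cmd and envp.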
Here is a handy mindmap of our code structure to help you understand everything we mentioned previously
For this project we could use one global variable. At first it seemed we were never going to need one, but later it became obvious that it is required. Specifically, it has to do with signals. When we use signal to capture the SIGINT (from Ctrl-C) and SIGQUIT (from Ctrl-\) signals, we have to change the error status, and a handler installed with the signal function has no obvious way of reaching the exit status that should change when either of these signals is captured. To work around this, we added a global variable g_status that updates the error status when signals are detected.
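Here is a minimal sketch of that idea. It is not our exact handler: install_handlers and handle_sigint are hypothetical names, g_status is declared as volatile sig_atomic_t here (our code uses a plain int), the value 130 follows the 128 + signal number convention bash uses, and ignoring SIGQUIT at the prompt is an assumption of this example.

```c
#include <signal.h>
#include <unistd.h>

/* The one global: written by the handler, read by the main loop. */
volatile sig_atomic_t g_status = 0;

/* Runs when Ctrl-C is pressed. A handler only receives the signal
 * number, so the global is the only way to report the new status. */
static void handle_sigint(int sig)
{
    (void)sig;
    g_status = 130; /* 128 + SIGINT (2) */
    if (write(STDOUT_FILENO, "\n", 1) < 0) /* write is async-signal-safe */
        return;
}

/* Installs the handlers used while waiting at the prompt. */
void install_handlers(void)
{
    signal(SIGINT, handle_sigint);
    signal(SIGQUIT, SIG_IGN); /* assumption: Ctrl-\ does nothing here */
}
```

After install_handlers() has run, pressing Ctrl-C moves to a new line and leaves 130 in g_status for the next $? expansion.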
We were asked to implement some basic builtins with the help of some functions. Here is a brief overview of them:
Builtin | Description | Options | Parameters | Helpful Functions |
---|---|---|---|---|
echo | Prints arguments separated with a space followed by a new line | -n | :heavy_check_mark: | write |
cd | Changes current working directory, updating PWD and OLDPWD | :x: | :heavy_check_mark: | chdir |
pwd | Prints current working directory | :x: | :x: | getcwd |
env | Prints environment | :x: | :x: | write |
export | Adds/replaces variable in environment | :x: | :heavy_check_mark: | :x: |
unset | Removes variable from environment | :x: | :heavy_check_mark: | :x: |
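As an example of how small these builtins can be, here is a sketch of echo with its -n option. It is an illustration only: mini_echo is a hypothetical name, the output descriptor is passed in so that a redirected outfile can be used, and only a literal -n (possibly repeated) is recognized.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch of the echo builtin: writes argv[1..] to fd,
 * separated by single spaces, followed by a newline unless the first
 * arguments are -n. Returns 0 on success, 1 if a write fails. */
int mini_echo(char *const argv[], int fd)
{
    int i = 1;
    int newline = 1;

    while (argv[i] && !strcmp(argv[i], "-n"))
    {
        newline = 0;
        i++;
    }
    for (; argv[i]; i++)
    {
        if (write(fd, argv[i], strlen(argv[i])) < 0)
            return 1;
        if (argv[i + 1] && write(fd, " ", 1) < 0)
            return 1;
    }
    if (newline && write(fd, "\n", 1) < 0)
        return 1;
    return 0;
}
```

With the structs described earlier, a call would look like mini_echo(node->full_cmd, node->outfile), where node is a hypothetical pointer to the current t_mini.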
As mentioned previously, we use readline to read the string containing the shell commands. To make it more interactive, readline receives a string to be used as a prompt. We have heavily tweaked its looks to make it nice to use. The prompt is structured as follows:
$USER@minishell $PWD $
Some remarks:
- If the user name cannot be retrieved, guest is shown in its place
- PWD is colored blue and dynamically replaces the HOME variable with ~ when the variable is set. See below for more details
- The $ at the end is printed blue or red depending on the exit status in the struct
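Leaving the colors aside, building the prompt string can be sketched as follows. The function build_prompt is a hypothetical name, not the one in our code, and the guest fallback for a missing user name is an assumption of this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: writes "user@minishell dir $ " into buf, showing a
 * leading home prefix of pwd as ~. Returns what snprintf returns. */
int build_prompt(char *buf, size_t size, const char *user,
                 const char *pwd, const char *home)
{
    size_t hlen = home ? strlen(home) : 0;

    if (!user || !*user)
        user = "guest"; /* assumed fallback when USER is unavailable */
    /* Only abbreviate when home is a whole leading path component. */
    if (hlen > 0 && !strncmp(pwd, home, hlen)
        && (pwd[hlen] == '/' || pwd[hlen] == '\0'))
        return snprintf(buf, size, "%s@minishell ~%s $ ", user, pwd + hlen);
    return snprintf(buf, size, "%s@minishell %s $ ", user, pwd);
}
```

The check on pwd[hlen] keeps a directory such as /home/pixelated from being shown as ~ated when HOME is /home/pixel.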
These are a few neat extras that were not explicitly mentioned in the subject of the project, but that we thought would make the whole experience nicer.
The $USER@minishell part of the prompt is available in six different colors (based on the first char of the user's username):
Note: the red color is reserved for the root user
We were told to only expand variables of the form $ + alphanumeric chars. We also implemented expansion of $$, which expands to the program's process id (mini_getpid()).
When running new instances of minishell, or minishell without an environment (env -i ./minishell), some environment variables need to be updated manually, namely the shell level (SHLVL) or the _ variable.
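The SHLVL update can be sketched like this. It is only an illustration: next_shlvl and bump_shlvl are hypothetical names, the sketch uses getenv and setenv from the C library instead of our own envp array, and the cap at 999 is an arbitrary choice made for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: returns the SHLVL a new shell should export.
 * A missing, non-numeric, negative or very large value counts as 0. */
int next_shlvl(const char *current)
{
    long lvl = current ? strtol(current, NULL, 10) : 0;

    if (lvl < 0 || lvl > 999) /* arbitrary bounds for this sketch */
        lvl = 0;
    return (int)lvl + 1;
}

/* Stores the incremented level in the process environment.
 * Returns 0 on success, -1 on failure (same as setenv). */
int bump_shlvl(void)
{
    char buf[16];

    snprintf(buf, sizeof(buf), "%d", next_shlvl(getenv("SHLVL")));
    return setenv("SHLVL", buf, 1);
}
```

Started with env -i there is no SHLVL at all, so next_shlvl(NULL) gives 1, which is the value a fresh shell is expected to export.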
Here's the env when minishell is launched without an environment: