UMM-CSci-Systems / Command-line-introduction

An introduction to Unix command line tools
MIT License

Command line introduction and shell scripting

Compile tests Clean tests Shellcheck (These will be marked as "Failing" until you have all the tests passing.)

Overview

First and Foremost: Please read the entire lab before starting. It's quite possible that some questions that arise while reading the lab are answered later in the lab. That said... let's get to it!

This provides a basic introduction to shell programming. If you use Linux much at all, you'll at least occasionally find yourself needing to use the shell/command line (i.e., what you get when you open the terminal program). Having experience with the shell is extremely useful, as you often end up needing to, e.g., ssh into a remote, cloud-based system where you won't have access to the nice GUI tools. This lab provides an introduction to a variety of important shell tools and how programming/scripting is done using shell commands.

:warning: Remember to complete the Command line introduction pre-lab reading and preparation before this lab begins.

Introduction

You will need to write a few different scripts for this lab:

None of these will be very long, but most or all of them will require you to learn new shell commands or tools. We’ll give you hints and pointers as to what commands/tools to be using, but you’ll need to do some digging in the man pages and/or searching on-line to find the details. Don’t bang your head against any piece of this for too long. If you’ve spent more than 10 minutes on a single part or command, you probably need to take a break and ask someone (like your instructor) for some help. On the other side of the coin, however, don’t immediately give up and ask at every step. Learning how to find and use this sort of information is an enormously valuable skill, and will be useful far longer and more often than the details that you’re actually looking up. So make a bit of an effort, but know when to stop.

This is all structured around the bash shell and "standard" command-line tools like find and grep. There are a variety of other shells (e.g., fish or zsh) and tools (e.g., fd and rg), but those won't necessarily be available on the random system you find yourself working on, so we're sticking to the "old standbys".

Setting up

Before you start writing scripts, you’ll need to get a copy of this repository to work on. This is a two-step process:

:warning: IMPORTANT :warning: Because of a bug in GitHub's handling of templates, your copy of the repository will not work as it is. You'll need to run the following two commands to import the submodule files correctly:

    git submodule init
    git submodule update

Why do I need to do this?

[The `bats-core` version of the Bats testing framework](https://github.com/bats-core/bats-core) uses [`git` _submodules_](https://git-scm.com/book/en/v2/Git-Tools-Submodules) to include its dependencies ([libraries like `bats-file`](https://github.com/bats-core/bats-file)) in your project. A `git` submodule is essentially an entire `git` repository nested inside an "outer" `git` repository. In this case your project is the outer, containing repository, and things like `bats-file` are the inner, contained repository. There's a bug in GitHub's new-ish template mechanism (which is what GitHub classroom uses to create copies of repositories for students and groups) that essentially loses the information that a project has submodule dependencies. So until that's fixed you'll need to add those three submodules to every lab that uses Bats.


If you're working in pairs or larger groups, only one of you needs to create your group's copy in GitHub Classroom, but everyone else will need to join that team in GitHub Classroom so they have access to their team's project. Also note that if Pat checks out the project on the first day of lab, and then later Chris is logged in when you sit down to work on it again, Chris will need to check out the project. Similarly, if Pat and Chris are working on different computers, then both of them will need to clone a local copy. It's also crucial that everyone commit and push their work (perhaps to a branch) at the end of each work session so that it will be accessible to everyone on the team.

You’ll “turn in” your work simply by having it committed to the repository and pushed to GitHub. We’ll check it out from there to run and grade it. We'll obviously need to be able to find your repository to grade it, so make sure to submit the URL of your repository using whatever technique is indicated by your instructor.

Be certain to commit often, and frequently trade places as driver and navigator. At a minimum you should probably trade every time you solve a specific problem that comes out of the test script. You should probably consider committing that often as well.

Write clean code

Part of this lab's rubric is readability, and shell scripts are notoriously difficult to read. So remember all the nice habits that you've learned, like using good variable names and commenting non-obvious commands. It's worth noting that almost everything in a shell script is non-obvious when you first see it. While we definitely would not recommend commenting every line of a program in a language like Java or Python, commenting every (or nearly every) line in a shell script isn't a bad idea. Students almost never comment too much on these, so when in doubt comment more rather than less.

You should make sure you run the shellcheck command on your shell scripts, e.g.,

    /snap/bin/shellcheck big_clean.sh

and heed (or at least ask questions about) any warnings that it generates.
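As an illustration of the kind of warning shellcheck raises, one of the most common is SC2086 (an unquoted variable). The sketch below uses a hypothetical helper, `first_line`, that is not part of the lab; it shows the quoted form that shellcheck accepts and notes in a comment what the flagged form would look like.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first line of the file named by $1.
first_line() {
  local file="$1"
  # Written as `head -n 1 $file` (no quotes), shellcheck reports SC2086:
  # a name containing spaces would be split into several arguments.
  head -n 1 "$file"
}

# Try it on a file whose name contains a space.
demo_dir="$(mktemp -d)"
printf 'one\ntwo\n' > "$demo_dir/my notes.txt"
first="$(first_line "$demo_dir/my notes.txt")"
echo "$first"
rm -rf "$demo_dir"
```

Quoting every variable expansion by default is usually the simplest way to keep shellcheck quiet and to keep scripts working with unusual file names.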

Some resources on shell style:


Exercises

You should complete the following exercises for this lab, each of which has tests and relevant files in the indicated sub-directory:

The tests and any relevant files for each part are in the appropriate sub-directory in this repository:

In each case there are tests written using the bats testing tool for bash scripts in a file called bats_tests.sh. You should be able to run the tests with bats bats_tests.sh, and use the testing results as a guide to the development of your scripts. If you ever find that you don't understand what the tests are "telling you", definitely ask; they are there to help you, and if they aren't communicating effectively then they're not doing their job.

You should get all the tests to pass before you "turn in" your work. Having the tests pass doesn't guarantee that your scripts are 100% correct, but it's a strong initial indicator.

First script: Compiling a C program

The tests and data for this problem are in the compiling directory of this project, and the discussion of this problem will all assume that you've cd'ed into that directory. Your goal is to complete the desired script. This includes at a minimum getting the tests in bats_tests.sh (in the compiling directory) to pass.

For this you should write a bash script called extract_and_compile.sh that:

As an example, imagine you are in the directory /home/chris/lab0 and your script is called using:

./extract_and_compile.sh 17

Then it should

:exclamation: When you run the program you compiled (NthPrime) you need to give NthPrime a single command line argument. The value you should pass it is the number your script received as its command line argument.
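Forwarding an argument can be sketched as follows. This is only an illustration: `fake_program` and `run_with_first_argument` are hypothetical names, and a shell function stands in for the script because positional parameters such as `$1` behave the same way in both.

```shell
#!/usr/bin/env bash

# Stand-in for the compiled program: it just reports the argument it got.
fake_program() {
  echo "argument: $1"
}

# Inside a script (or a function, as here), "$1" is the first argument.
# Quoting it passes the value along unchanged as a single argument.
run_with_first_argument() {
  fake_program "$1"
}

output="$(run_with_first_argument 17)"
echo "$output"
```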

The final file structure in the example above (as displayed by the tree program) should be:

$ tree .
.
└── NthPrime
    ├── NthPrime
    ├── main.c
    ├── nth_prime.c
    └── nth_prime.h

1 directory, 4 files

Please note that if you are using bash on Windows (say, perhaps, via the Windows Subsystem for Linux) then you may need to install gcc and/or bats. Talk to your instructor about how to do this if the situation arises (the exact installation command depends upon which distribution of Linux is installed).

Let the tests drive your solution

Let the provided Bats tests drive your solution. Run bats bats_tests.sh and look at the first failure. What's the simplest thing you can do to get that test to pass? Look at the source for the tests (in bats_tests.sh) for hints on what to do if a given test fails.

Some non-obvious assumptions that the compiling test script makes

The tests require that the .tgz version of the tar archive will still be in the specified directory when you’re done. This means that if you first gunzip and then, in a separate step, untar, the test is likely to fail since you’ll end up with a .tar file instead of a .tgz file. So you should use the appropriate tar flags that uncompress and untar in a single step.
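A minimal sketch of single-step extraction, assuming a throwaway archive named `sample.tgz` (a hypothetical name, built here only so the example is self-contained). The `-z` flag tells tar to handle the gzip compression itself, so the original `.tgz` file is left untouched.

```shell
#!/usr/bin/env bash

# Build a small throwaway archive so the example is self-contained.
work_dir="$(mktemp -d)"
mkdir "$work_dir/sample"
echo "hello" > "$work_dir/sample/greeting.txt"
tar -czf "$work_dir/sample.tgz" -C "$work_dir" sample
rm -r "$work_dir/sample"

# -x extracts, -z handles gzip, -f names the archive file.
# -C chooses the directory to extract into. No -v, so nothing is printed.
mkdir "$work_dir/out"
tar -xzf "$work_dir/sample.tgz" -C "$work_dir/out"
```

After this runs, both `sample.tgz` and the extracted `out/sample/greeting.txt` exist, and no intermediate `.tar` file was created.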

The tests also assume that your script generates no "extraneous" output. If, for example, you use the -v flag with tar, you'll generate a bunch of output that will cause some of the tests to fail. You may want to have "extra" output as a debugging tool while you're working on the script, but you'll need to remove all that to get the tests to pass. This is consistent with standard practice in Unix shell programming, where most commands provide little to no output if things went fine, making it much easier for you to chain them together into more complex behaviors.

Running your script by hand

Remember that you can call your script "by hand" as a debugging aid so you can see exactly what it's doing and where. So you could do something like

./extract_and_compile.sh 8

and then go look around and see what your script did. You'll want to clean up after each test like that (e.g., rm -rf NthPrime) to make sure that your script successfully re-creates NthPrime and doesn't "succeed" just because NthPrime was left over from an earlier run. If you lose NthPrime.tgz, the command git restore NthPrime.tgz will bring it back, assuming you haven't committed the deletion.

Some notes on compiling a C program

The C compiler in the lab is the GNU C compiler: gcc.

There are two .c files in this program, both of which will need to be compiled and linked to form an executable. You can do this in a single line (handing gcc both .c files) or you can compile them separately and then link them.

You can tell gcc what you want the executable called, or you can take the default output and rename it.

Most of you have never compiled a C program before, so this might be a good time to ask me to say a little about how that works. Alternately, you might see what you can figure out with man gcc.


Second script: Clean up a big directory

Your goal here is to build a script that removes files that have been marked for deletion; at a minimum, you want to get the tests in bats_tests.sh to pass and make sure shellcheck is happy. For this you should write a bash script named big_clean.sh that:

If we assume that your scratch directory is, for example, /tmp/tmp.eMvVweqb, then after the first step (uncompressing the sample tar file little_dir.tgz) you should end up with:

$ tree /tmp/tmp.eMvVweqb/
/tmp/tmp.eMvVweqb/
└── little_dir
    ├── file_0
    ├── file_1
    ├── file_10
    ├── file_11
    ├── file_12
    ├── file_13
    ├── file_14
    ├── file_15
    ├── file_16
    ├── file_17
    ├── file_18
    ├── file_19
    ├── file_2
    ├── file_3
    ├── file_4
    ├── file_5
    ├── file_6
    ├── file_7
    ├── file_8
    └── file_9

1 directory, 20 files

Then after deleting the appropriate files, you should have:

$ tree /tmp/tmp.eMvVweqb/
/tmp/tmp.eMvVweqb/
└── little_dir
    ├── file_1
    ├── file_10
    ├── file_11
    ├── file_12
    ├── file_15
    ├── file_16
    ├── file_17
    ├── file_18
    ├── file_19
    ├── file_2
    ├── file_3
    ├── file_4
    ├── file_5
    ├── file_6
    ├── file_8
    └── file_9

1 directory, 16 files
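One possible shape for the deletion step is sketched below. The marker string `REMOVE-THIS-FILE` and the sample file names are purely hypothetical; use whatever marking convention the lab instructions actually specify.

```shell
#!/usr/bin/env bash

marker="REMOVE-THIS-FILE"   # hypothetical marker, not the lab's real one
target_dir="$(mktemp -d)"

# Three small sample files; only file_b carries the marker.
echo "keep"          > "$target_dir/file_a"
echo "$marker"       > "$target_dir/file_b"
echo "keep this too" > "$target_dir/file_c"

for file in "$target_dir"/*; do
  # grep -q prints nothing; its exit status says whether the marker was found.
  # -F treats the marker as a fixed string rather than a regular expression.
  if grep -qF -- "$marker" "$file"; then
    rm "$file"
  fi
done

# Collect whatever is left so it can be counted or inspected.
remaining=("$target_dir"/*)
echo "${#remaining[@]} files remain"
```

A loop like this is easy to read but starts one grep per file. For very large directories, tools such as `grep -l` combined with `xargs` may be considerably faster and are worth exploring in the man pages.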

Finally, after creating the new cleaned tar file (cleaned_little_dir.tgz in this case) your project directory should look like:

$ tree cleaning/
cleaning/
├── bats_tests.sh
├── big_dir.tgz
├── cleaned_little_dir.tgz
└── little_dir.tgz

0 directories, 4 files
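The scratch directory and the final archiving step might look roughly like the sketch below. It assumes `mktemp -d` is used to create the scratch directory (which produces names like the `/tmp/tmp.eMvVweqb` shown earlier), and `project_dir` is a hypothetical stand-in for the directory the script was started from.

```shell
#!/usr/bin/env bash

project_dir="$(mktemp -d)"   # stand-in for the directory you started in
scratch="$(mktemp -d)"       # unique scratch directory, e.g. /tmp/tmp.XXXXXXXXXX

# Stand-in for an already extracted and cleaned directory.
mkdir "$scratch/little_dir"
echo "keep me" > "$scratch/little_dir/file_1"

# -c creates an archive, -z gzips it, -f names the output file.
# -C makes tar work from inside the scratch directory, so the archive
# stores "little_dir/..." rather than the full /tmp/... path.
tar -czf "$project_dir/cleaned_little_dir.tgz" -C "$scratch" little_dir

# -t lists an archive's contents without extracting it.
archive_listing="$(tar -tzf "$project_dir/cleaned_little_dir.tgz")"
echo "$archive_listing"

rm -rf "$scratch"
```

Listing the archive with `tar -tzf` is a quick way to confirm that the paths stored inside it are what you expect.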

Some non-obvious assumptions that the cleaning test script makes

The tests assume that the .tgz version of the tar archive will be in the specified directory when you’re done. This means that if you first gunzip and then, in a separate step, untar, the test is likely to fail since you’ll end up with a .tar file instead of a .tgz file. So you should use the appropriate tar flags that uncompress and untar in a single step.

You should also assume that if you untar frogs.tgz that will result in a directory called frogs that contains the files you need to process. (There's nothing magic about tar that requires this to be true – see "Notes on archive structures" for more).

You can assume that the first argument has the form frogs.tgz and not some alternative like frogs.tar.gz.
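Given that assumption, the directory name can be derived from the archive name. The sketch below shows two common ways to do it; the variable names are illustrative, and `frogs.tgz` stands in for the script's first argument.

```shell
#!/usr/bin/env bash

archive="frogs.tgz"   # stand-in for the script's first argument

# basename drops any leading directories and, given a suffix, that suffix too.
dir_name="$(basename "$archive" .tgz)"

# Pure-bash alternative: remove the shortest trailing match of ".tgz".
also_dir_name="${archive%.tgz}"

# Name for the cleaned archive, following the cleaned_little_dir.tgz pattern.
cleaned_name="cleaned_${dir_name}.tgz"

echo "$dir_name $also_dir_name $cleaned_name"
```

The parameter expansion form avoids starting an extra process, while `basename` also strips any directory components if the argument happens to include a path.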

The tests assume that your script generates no "extraneous" output. If, for example, you use the -v flag with tar, you'll generate a bunch of output that will cause some of the tests to fail. You may want to have "extra" output as a debugging tool while you're working on the script, but you'll need to remove all that to get the tests to pass. This is consistent with standard practice in Unix shell programming, where most commands provide little to no output if things went fine, making it much easier for you to chain them together into more complex behaviors.

Final Thoughts

Make sure that all your code passes the appropriate tests. Passing the tests will make up the majority of your grade. A portion of your grade will also take into account how clean your code is. When we said that you should commit often, we meant it. Be professional and informative in your commit messages; we'll be looking at them in the grading. Finally, it is easy to overlook important details. If a test isn’t passing, go back and re-read the directions carefully.

What to turn in

You'll "turn this in" by committing your work to your GitHub Classroom copy of the project. You should also submit the URL of your repository in whatever way indicated by your instructor. Remember to make sure you've completed each of the assigned tasks: