For anyone who might benefit
╗ ~ test-prep : (master) tree
.
├── BG-Final-Study_Guide
│ ├── Code-Snippets
│ │ ├── graph-class.js
│ │ ├── graph.js
│ │ ├── graphs.md
│ │ ├── implement-graph.js
│ │ ├── sept-study-guide-testing.js
│ │ └── sept-study-guide.js
│ └── images
│ ├── 01_1280px-Sorted_binary_tree_breadth-first_traversal.svg.png
│ ├── 01a_graphs.png
│ ├── 01b_adj_matrix_graph.png
│ ├── 01b_graph_a.png
│ ├── 01c_Sorted_binary_tree_ALL.png
│ ├── 02a_bsts.png
│ ├── 02a_graph.png
│ ├── 02b_forest.png
│ ├── 02b_not_bst.png
│ ├── 02c_good_bad_bst.png
│ ├── 02d_Sorted_binary_tree_ALL.png
│ ├── 1604180981115.png
│ ├── BFS.png
│ ├── Binary-search-tree-insertion-When-a-sequence-of-data-f1-3-4-6-5-7-9-8-2-g.png
│ ├── DOM-is-nary-tree.png
│ ├── Headers-Associated-Layers.png
│ ├── IPv4-header.png
│ ├── IPv6-header.png
│ ├── Routers-Switches-Hubs-OSI-Layers.png
│ ├── adj_matrix_graph.png
│ ├── arr1.png
│ ├── array.png
│ ├── basic-graph.png
│ ├── binary tree.png
│ ├── binary-search-tree.png
│ ├── binary-tree.png
│ ├── breadth-first.png
│ ├── bst-constructor.png
│ ├── bsts.png
│ ├── children-nodes.png
│ ├── code.png
│ ├── complete-graph.png
│ ├── complete-or-connected-graph.png
│ ├── console.trace.png
│ ├── console.trace1.png
│ ├── convert-binary.png
│ ├── convert-hexidecimal.png
│ ├── cycle-example-2.png
│ ├── cycle-example.png
│ ├── cycle.png
│ ├── delete-edge.png
│ ├── dense-graph.png
│ ├── dfs.png
│ ├── directed-graph.png
│ ├── directed-or-undirected-cycles.png
│ ├── directory structure is nary tree.png
│ ├── doubly-linked-list-removeFromTail.png
│ ├── dropped-segment.png
│ ├── encapsulation.png
│ ├── find-edge.png
│ ├── forest.png
│ ├── full-cycle.png
│ ├── good_bad_bst.png
│ ├── graph-edge.png
│ ├── graph-md.png
│ ├── graph.png
│ ├── graph1.png
│ ├── graph2.png
│ ├── graph_a.png
│ ├── graphs.png
│ ├── i.png
│ ├── image-ip-dns-domain.png
│ ├── image-ip-dns-resolution-chat.png
│ ├── image-ip-ipv4-headers.png
│ ├── image-ip-ipv6-headers.png
│ ├── image-ip-networking-hub.png
│ ├── image-ip-networking-router.png
│ ├── image-ip-networking-switch.png
│ ├── image-ip-tcp-dropped-segment.png
│ ├── image-ip-tcp-four-way-closing.png
│ ├── image-ip-tcp-header-control-flags.png
│ ├── image-ip-tcp-header-fields.png
│ ├── image-ip-tcp-seq-ack.png
│ ├── image-ip-tcp-slow-connection.png
│ ├── image-ip-tcp-socket-states.png
│ ├── image-network-models-osi.png
│ ├── image-network-models-tcp-ip.png
│ ├── image.png
│ ├── implementation.png
│ ├── in-order-traversal.png
│ ├── insert.png
│ ├── iterative search.png
│ ├── leaf-nodes.png
│ ├── leaves-depth-height.png
│ ├── linked-list-graph.png
│ ├── linked-list.png
│ ├── min-max-nodes-balanced.png
│ ├── min-max-nodes-ll.png
│ ├── n-ary-tree.png
│ ├── non-isolated-cycle.png
│ ├── not_bst.png
│ ├── number-tree.png
│ ├── old-proj.png
│ ├── osi-network-layers.png
│ ├── pasted image 0.png
│ ├── path-sum.png
│ ├── path.png
│ ├── post-order-depth-first.png
│ ├── post-order-traversal.png
│ ├── post-order.png
│ ├── pre-and-in-order-traversal.png
│ ├── pre-order-traversal.png
│ ├── proj2a_wireshark-01.png
│ ├── proj2b_wireshark-02.png
│ ├── proj2c_wireshark-03.png
│ ├── proj2d_wireshark-04.png
│ ├── proj2e_wireshark-05.png
│ ├── proj2f_wireshark-06.png
│ ├── queue.png
│ ├── reverse.png
│ ├── search-iterative.png
│ ├── search-recursively.png
│ ├── siblings.png
│ ├── sparse-graph.png
│ ├── stack.png
│ ├── subtree.png
│ ├── tcp-ip-model.png
│ ├── tcpip.png
│ ├── to-string.png
│ ├── traversals.png
│ ├── tree-Node.png
│ ├── tree-vs-graph-w-cycle.png
│ ├── tree.png
│ ├── trees.png
│ ├── undirected-graph.png
│ ├── weighted-graph.png
│ ├── weighted-or-unweighted.png
│ ├── white-board-approach-reacto.png
│ ├── wireshark-01.png
│ ├── wireshark-02.png
│ ├── wireshark-03.png
│ ├── wireshark-04.png
│ ├── wireshark-05.png
│ └── wireshark-06.png
├── Student-Made-possibly-Imperfect
│ ├── BG
│ │ ├── Data-Structures-Cheat-Sheet.md
│ │ ├── Images
│ │ │ ├── basic-graph.png
│ │ │ ├── complete-graph.png
│ │ │ ├── cycle-example-2.png
│ │ │ ├── cycle-example.png
│ │ │ ├── dense-graph.png
│ │ │ ├── directed-graph.png
│ │ │ ├── non-isolated-cycle.png
│ │ │ ├── sparse-graph.png
│ │ │ ├── undirected-graph.png
│ │ │ ├── w8_learning_objectives.md
│ │ │ └── weighted-graph.png
│ │ ├── README.md
│ │ ├── W8-filled-in-LOs.md
│ │ ├── WEEK-8-NOTES.md
│ │ ├── misc
│ │ │ ├── Data-Structures-Cheat-Sheet.md
│ │ │ ├── README.md
│ │ │ ├── W8-filled-in-LOs.md
│ │ │ ├── WEEK-8-NOTES.md
│ │ │ └── w8-summary.pdf
│ │ ├── w8-summary.pdf
│ │ └── week-8-quiz
│ │ ├── quiz-style.css
│ │ ├── quiz.js
│ │ ├── wk8-images
│ │ │ ├── Headers-Associated-Layers.png
│ │ │ ├── IPv4-header.png
│ │ │ ├── IPv6-header.png
│ │ │ ├── Routers-Switches-Hubs-OSI-Layers.png
│ │ │ ├── binary-search-tree.png
│ │ │ ├── in-order-traversal.png
│ │ │ ├── linked-list-graph.png
│ │ │ ├── osi-network-layers.png
│ │ │ ├── post-order-traversal.png
│ │ │ ├── pre-order-traversal.png
│ │ │ ├── tcp-ip-model.png
│ │ │ └── trees.png
│ │ ├── wk8-quiz.html
│ │ └── wk8-quiz.md
│ ├── Quizlet.pdf
│ ├── binary-tree-reading.md
│ ├── graphs
│ │ └── Images
│ │ ├── basic-graph.png
│ │ ├── complete-graph.png
│ │ ├── cycle-example-2.png
│ │ ├── cycle-example.png
│ │ ├── dense-graph.png
│ │ ├── directed-graph.png
│ │ ├── graphs.md
│ │ ├── non-isolated-cycle.png
│ │ ├── sparse-graph.png
│ │ ├── undirected-graph.png
│ │ └── weighted-graph.png
│ ├── graphs-bst-data-structure-methods-main
│ │ ├── binary-tree-reading (1).md
│ │ ├── binary-trees.js
│ │ └── graph-methods.js
│ ├── networks.md
│ └── readme-and-images
│ ├── README.md
│ ├── delete-edge.png
│ ├── find-edge.png
│ ├── graph-edge.png
│ ├── graph1.png
│ ├── graph2.png
│ ├── reverse.png
│ └── to-string.png
├── Study-Guides
│ ├── W08-study-guide
│ │ ├── W08-study-guide
│ │ │ ├── Binary Trees.apkg
│ │ │ ├── Networking Day Cards
│ │ │ │ ├── Network DNS.apkg
│ │ │ │ ├── Network OSI.apkg
│ │ │ │ └── Network TCP_IP.apkg
│ │ │ ├── W08-LOs-empty.md
│ │ │ ├── W08-LOs-explained.md
│ │ │ └── public
│ │ │ ├── dropped-segment.svg
│ │ │ ├── full-cycle.svg
│ │ │ ├── ipv6-headers.svg
│ │ │ ├── min-max-nodes-balanced.png
│ │ │ ├── min-max-nodes-ll.png
│ │ │ ├── number-tree.png
│ │ │ ├── osi-model.svg
│ │ │ ├── tcp-headers.svg
│ │ │ ├── tcp-ip-model.svg
│ │ │ └── three-way-handshake.svg
│ │ └── tree_order_project
│ │ ├── lib
│ │ │ ├── leet_code_105.js
│ │ │ ├── tree_node.js
│ │ │ └── tree_order.js
│ │ ├── package-lock.json
│ │ ├── package.json
│ │ └── test
│ │ └── test.js
│ ├── Week 8 Review.xlsx
│ ├── study-guide-excel.pdf
│ └── w8-study-guide.pdf
├── materials
│ ├── flash-cards
│ │ ├── Binary-Trees.md
│ │ ├── Network OSI.md
│ │ ├── network-Dns.md
│ │ ├── networkTCP_Ip.md
│ │ └── tcpip.png
│ └── image
│ └── COMBIN-Data-Structures (2)
│ ├── 1604180981115.png
│ ├── Data-Structures-Concepts.md
│ ├── Network DNS.apkg
│ ├── Network OSI.apkg
│ ├── Network TCP_IP.apkg
│ ├── arr1.png
│ ├── array.png
│ ├── binary-tree.png
│ ├── dfs.png
│ ├── directed-or-undirected-cycles.png
│ ├── graph-md.png
│ ├── graph.png
│ ├── leaves-depth-height.png
│ ├── linked-list.png
│ ├── post-order.png
│ ├── pre-and-in-order-traversal.png
│ ├── queue.gif
│ ├── queue.png
│ ├── stack.gif
│ ├── stack.png
│ ├── traversals.png
│ ├── tree.png
│ └── weighted-or-unweighted.png
├── projects
│ ├── d1
│ │ ├── package-lock.json
│ │ ├── package.json
│ │ ├── problems
│ │ │ ├── 01_tree_node.js
│ │ │ ├── 02_tree_order.js
│ │ │ ├── 03_dfs.js
│ │ │ ├── 04_bfs.js
│ │ │ └── 05_leet_code_105.js
│ │ └── test
│ │ └── test.js
│ ├── d2
│ │ ├── package-lock.json
│ │ ├── package.json
│ │ ├── problems
│ │ │ ├── bst.js
│ │ │ ├── findMin.js
│ │ │ ├── getHeight.js
│ │ │ ├── leet_code_108.js
│ │ │ ├── leet_code_110.js
│ │ │ └── leet_code_450.js
│ │ └── test
│ │ └── test.js
│ ├── d3
│ │ ├── eod-graphs-intro
│ │ │ ├── README.md
│ │ │ ├── package-lock.json
│ │ │ ├── package.json
│ │ │ ├── problems
│ │ │ │ └── graph.js
│ │ │ └── test
│ │ │ └── graph-spec.js
│ │ ├── graphs-intro-solution
│ │ │ ├── package-lock.json
│ │ │ ├── package.json
│ │ │ ├── problems
│ │ │ │ └── graph.js
│ │ │ └── test
│ │ │ └── graph-spec.js
│ │ └── graphs-solution
│ │ ├── lib
│ │ │ ├── breadth_first_search.js
│ │ │ ├── friends-of.js
│ │ │ ├── graph_node.js
│ │ │ ├── leet_code_207.js
│ │ │ ├── max_value.js
│ │ │ └── num_regions.js
│ │ ├── package-lock.json
│ │ ├── package.json
│ │ └── test
│ │ ├── 01-test.js
│ │ ├── 02-friends-of-spec.js
│ │ ├── 03-graph-node-algorithms.js
│ │ └── names.js
│ ├── d5
│ │ ├── whiteboarding-problems.md
│ │ ├── whiteboarding-solutions-2.js
│ │ └── whiteboarding-solutions.js
│ └── other
│ └── eod-graphs-intro
│ ├── README.md
│ ├── package-lock.json
│ ├── package.json
│ ├── problems
│ │ └── graph.js
│ └── test
│ └── graph-spec.js
└── weeks-LOs
├── W8-filled-in-LOs.md
└── w8_learning_objectives.md
- [Week-8-Notes](#week-8-notes)
- [Basic Tree Terminology](#basic-tree-terminology)
- [Traversing trees](#traversing-trees)
- [Breadth-first search](#breadth-first-search)
- [Depth-first searches](#depth-first-searches)
- [Pre-order traversal](#pre-order-traversal)
- [In-order traversal](#in-order-traversal)
- [Post-order traversal](#post-order-traversal)
- [Binary Search Trees](#binary-search-trees)
- [BST Definition](#bst-definition)
- [A BST is a Sorted Data Structure](#a-bst-is-a-sorted-data-structure)
- [A special traversal case](#a-special-traversal-case)
- [Binary Tree Project](#binary-tree-project)
- [Instructions](#instructions)
- [Binary Search Tree Project](#binary-search-tree-project)
- [Instructions](#instructions-1)
- [WEEK-08 DAY-3<br>*Graphs*](#week-08-day-3brgraphs)
- [Graphs and Heaps](#graphs-and-heaps)
- [Graphs](#graphs)
- [What is a Graph?](#what-is-a-graph)
- [Graph Implementations](#graph-implementations)
- [GraphNode Class](#graphnode-class)
- [Adjacency Matrix](#adjacency-matrix)
- [Adjacency List](#adjacency-list)
- [Graph Traversal](#graph-traversal)
- [Graph Traversal w/ GraphNode](#graph-traversal-w-graphnode)
- [Graph Traversal w/ Adjacency List](#graph-traversal-w-adjacency-list)
- [Graph Project](#graph-project)
- [Instructions](#instructions-2)
- [Friends of](#friends-of)
- [WEEK-08 DAY-4<br>*Network Knowledge*](#week-08-day-4brnetwork-knowledge)
- [Network Models Objectives](#network-models-objectives)
- [Internet Protocol Suite Objectives](#internet-protocol-suite-objectives)
- [Network Tools](#network-tools)
- [The OSI Network Model](#the-osi-network-model)
- [More layers, ~~more~~ fewer problems?](#more-layers-smores-fewer-problems)
- [The layers of the OSI model](#the-layers-of-the-osi-model)
- [Application](#application)
- [Presentation](#presentation)
- [Session](#session)
- [Transport](#transport)
- [Network](#network)
- [Data Link](#data-link)
- [Physical](#physical)
- [Which model do I use?](#which-model-do-i-use)
- [What we've learned](#what-weve-learned)
- [TCP/IP: Four Layers](#tcpip-four-layers)
- [A layered approach](#a-layered-approach)
- [Layers of the TCP/IP model](#layers-of-the-tcpip-model)
- [Application](#application-1)
- [Transport](#transport-1)
- [Internet](#internet)
- [Link](#link)
- [Translating layers to data](#translating-layers-to-data)
- [What we've learned](#what-weve-learned-1)
- [A Crash Course in Binary and Hexadecimal Notation](#a-crash-course-in-binary-and-hexadecimal-notation)
- [Binary](#binary)
- [Bases](#bases)
- [Base 2](#base-2)
- [Bits and Bytes](#bits-and-bytes)
- [Another useful base](#another-useful-base)
- [The `0x` Notation](#the-0x-notation)
- [In JavaScript](#in-javascript)
- [In conclusion](#in-conclusion)
- [Internet Protocol](#internet-protocol)
- [History of IP](#history-of-ip)
- [The great divide](#the-great-divide)
- [So what is the Internet, exactly?](#so-what-is-the-internet-exactly)
- [Packet-Switching](#packet-switching)
- [IP Versions](#ip-versions)
- [IPv4](#ipv4)
- [IPv4 Addresses](#ipv4-addresses)
- [IPv6](#ipv6)
- [IPv6 Addresses](#ipv6-addresses)
- [Special addresses](#special-addresses)
- [What we've learned](#what-weve-learned-2)
- [Transport Protocols](#transport-protocols)
- [What exactly are we transporting?](#what-exactly-are-we-transporting)
- [Ports](#ports)
- [TCP](#tcp)
- [UDP](#udp)
- [What we've learned](#what-weve-learned-3)
- [Surveying Your Domain](#surveying-your-domain)
- [What is DNS?](#what-is-dns)
- [Domains?](#domains)
- [How The Magic Happens](#how-the-magic-happens)
- [DNS Records](#dns-records)
- [`SOA`](#soa)
- [`NS`](#ns)
- [`A` / `AAAA`](#a--aaaa)
- [`CNAME`](#cname)
- [`MX`](#mx)
- [Metadata](#metadata)
- [What we've learned](#what-weve-learned-4)
- [Networking Hardware: Getting Physical](#networking-hardware-getting-physical)
- [Three levels of control](#three-levels-of-control)
- [Hubs: keeping it simple](#hubs-keeping-it-simple)
- [Switches: traffic control](#switches-traffic-control)
- [Routers: thinking globally](#routers-thinking-globally)
- [A practical example of network hardware](#a-practical-example-of-network-hardware)
- [Integrated devices](#integrated-devices)
- [What we've learned](#what-weve-learned-5)
- [TCP Connections](#tcp-connections)
- [Segments](#segments)
- [Segment Header Fields](#segment-header-fields)
- [TCP Connection Lifecycle](#tcp-connection-lifecycle)
- [Control Flag Options](#control-flag-options)
- [Getting to know each other: the three-way handshake](#getting-to-know-each-other-the-three-way-handshake)
- [Data transmission & error handling](#data-transmission--error-handling)
- [Saying goodbye: closing the connection](#saying-goodbye-closing-the-connection)
- [The TCP Socket State Lifecycle](#the-tcp-socket-state-lifecycle)
- [What we've learned](#what-weve-learned-6)
- [Following The Trail With `traceroute`](#following-the-trail-with-traceroute)
- [Where are we going?](#where-are-we-going)
- [Reading a trace](#reading-a-trace)
- [Metadata](#metadata-1)
- [The Hop](#the-hop)
- [Special cases](#special-cases)
- [When should I run a trace?](#when-should-i-run-a-trace)
- [What we've learned](#what-weve-learned-7)
- [Use Wireshark To Capture Network Traffic](#use-wireshark-to-capture-network-traffic)
- [Installing Wireshark](#installing-wireshark)
- [Capturing packets](#capturing-packets)
- [Color coding](#color-coding)
- [Sample captures](#sample-captures)
- [Filtering packets](#filtering-packets)
- [Inspecting packets](#inspecting-packets)
- [Assignment](#assignment)
- [DIY Flashcards](#diy-flashcards)
____
We're going to need a helper function that solves the first major point from above. How might we merge two sorted arrays? In other words, we want a `merge` function that will behave like so:
```js
let arr1 = [1, 5, 10, 15];
let arr2 = [0, 2, 3, 7, 10];
merge(arr1, arr2); // => [0, 1, 2, 3, 5, 7, 10, 10, 15]
```
Merging two sorted arrays is simple. Since both arrays are sorted, we know the smallest numbers are always at the front of the arrays. We can construct the new array by comparing the first elements of both input arrays. We remove the smaller element from its respective array and add it to our new array, and repeat until both input arrays are empty:
```js
function merge(array1, array2) {
  let merged = [];

  while (array1.length || array2.length) {
    let ele1 = array1.length ? array1[0] : Infinity;
    let ele2 = array2.length ? array2[0] : Infinity;

    let next;
    if (ele1 < ele2) {
      next = array1.shift();
    } else {
      next = array2.shift();
    }

    merged.push(next);
  }

  return merged;
}
```
If your JavaScript is rusty, don't freak out! Here are a few cool JS patterns that we leverage above:

- `0` is considered a falsey value, meaning it acts like `false` when used in boolean expressions. All other numbers are truthy.
- `Infinity` is a value that is guaranteed to be greater than any other quantity.
- `shift` is an array method that removes and returns the first element.

Is the code still hazy? Let's look at an annotated version:
```js
// commented
function merge(array1, array2) {
  let merged = [];

  // keep running while either array still contains elements
  while (array1.length || array2.length) {
    // if array1 is nonempty, take its first element as ele1;
    // otherwise array1 is empty, so take Infinity as ele1
    let ele1 = array1.length ? array1[0] : Infinity;
    // do the same for array2, ele2
    let ele2 = array2.length ? array2[0] : Infinity;

    let next;
    // remove the smaller of the eles from its array
    if (ele1 < ele2) {
      next = array1.shift();
    } else {
      next = array2.shift();
    }

    // and add that ele to the new array
    merged.push(next);
  }

  return merged;
}
```
By using `Infinity` as the default ele when an array is empty, we are able to elegantly handle the scenario where one array empties before the other. We know that any actual element will be less than `Infinity`, so we will continually take the other element into our merged array.

In other words, we can safely handle this edge case:
```js
merge([10, 13, 15, 25], []); // => [10, 13, 15, 25]
```
Nice! We now have a way to merge two sorted arrays into a single sorted array. It's worth mentioning that `merge` will have an `O(n)` runtime, where `n` is the combined length of the two input arrays. This is what we meant when we said it was "easy" to merge two sorted arrays; linear time is fast! We'll find this fact useful later.

Now that we've satisfied the merge idea, let's handle the second point. That is, we say an array of 1 or 0 elements is already sorted. This will be the base case of our recursion. Let's begin adding this code:
```js
function mergeSort(array) {
  if (array.length <= 1) {
    return array;
  }

  //...some code here.
}
```
If our base case pertains to an array of a very small size, then the design of our recursive case should make progress toward hitting this base scenario. In other words, we should recursively call `mergeSort` on smaller and smaller arrays. A logical way to do this is to take the input array and split it into left and right halves.
```js
function mergeSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let midIdx = Math.floor(array.length / 2);
  let leftHalf = array.slice(0, midIdx);
  let rightHalf = array.slice(midIdx);

  let sortedLeft = mergeSort(leftHalf);
  let sortedRight = mergeSort(rightHalf);

  //...some code here
}
```
Here is the part of the recursion where we do a lot of hand waving and we take things on faith. We know that `mergeSort` will take in an array and return the sorted version; we assume that it works. That means the two recursive calls will return the `sortedLeft` and `sortedRight` halves.
Okay, so we have two sorted arrays. We want to return one sorted array. So `merge` them! Using the `merge` function we designed earlier:
```js
function mergeSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let midIdx = Math.floor(array.length / 2);
  let leftHalf = array.slice(0, midIdx);
  let rightHalf = array.slice(midIdx);

  let sortedLeft = mergeSort(leftHalf);
  let sortedRight = mergeSort(rightHalf);

  return merge(sortedLeft, sortedRight);
}
```

Wow, that's it! Notice how light the implementation of `mergeSort` is. Much of the heavy lifting (the actual comparisons) is done by the `merge` helper.
`mergeSort` is a classic example of a "Divide and Conquer" algorithm. In other words, we keep breaking the array into smaller and smaller subarrays. This is the same as saying we take the problem and break it down into smaller and smaller subproblems. We do this until the subproblems are so small that we trivially know the answer to them (an array of length 0 or 1 is already sorted). Once we have those subanswers, we can combine them to reconstruct the larger problems that we previously divided (merge the left and right subarrays).
The process is visualized below. When elements are moved to the bottom of the picture, they are going through the `merge` step:
![source: https://visualgo.net](https://s3-us-west-1.amazonaws.com/appacademy-open-assets/data_structures_algorithms/efficient_sorting_algorithms/merge_sort/images/MergeSort.gif)
### Merge Sort JS Implementation
Here is the full code for your reference:
```js
function merge(array1, array2) {
  let merged = [];

  while (array1.length || array2.length) {
    let ele1 = array1.length ? array1[0] : Infinity;
    let ele2 = array2.length ? array2[0] : Infinity;

    let next;
    if (ele1 < ele2) {
      next = array1.shift();
    } else {
      next = array2.shift();
    }

    merged.push(next);
  }

  return merged;
}
```

```js
function mergeSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let midIdx = Math.floor(array.length / 2);
  let leftHalf = array.slice(0, midIdx);
  let rightHalf = array.slice(midIdx);

  let sortedLeft = mergeSort(leftHalf);
  let sortedRight = mergeSort(rightHalf);

  return merge(sortedLeft, sortedRight);
}
```
The complexity analysis of this algorithm is easier to explain through visuals, so we highly encourage you to watch the lecture that accompanies this reading. In any case, here is a summary of the complexity:
- `n` is the length of the input array
- the number of times we must halve the array to reach the base case is `O(log(n))`. For example, for an input of length `32`, the halving goes `32 -> 16 -> 8 -> 4 -> 2 -> 1`: we have to split 5 times before reaching the base case, and `log(32) = 5`
- each call runs the `merge` function, which contributes `O(n)` in isolation
- we run `merge` in every recursive `mergeSort` call, so the total complexity is `O(n * log(n))`

Merge Sort is the first non-`O(1)` space sorting algorithm we've seen thus far.
The larger the size of our input array, the greater the number of subarrays we must create in memory. These are not free! They each take up finite space, and we will need a new subarray for each element in the original input. Therefore, Merge Sort has a linear space complexity, O(n).
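If you want to verify the behavior by running the code, here is a compact, self-contained version of the two functions above with a quick check (the sample inputs and expected outputs are my own, not from the reading):

```js
// condensed merge and mergeSort, equivalent to the implementations above
function merge(array1, array2) {
  let merged = [];
  while (array1.length || array2.length) {
    // Infinity stands in for the front of an emptied array
    let ele1 = array1.length ? array1[0] : Infinity;
    let ele2 = array2.length ? array2[0] : Infinity;
    merged.push(ele1 < ele2 ? array1.shift() : array2.shift());
  }
  return merged;
}

function mergeSort(array) {
  if (array.length <= 1) return array;
  let midIdx = Math.floor(array.length / 2);
  // sort each half, then merge the sorted halves
  return merge(mergeSort(array.slice(0, midIdx)), mergeSort(array.slice(midIdx)));
}

mergeSort([7, 3, 8, 9, 2]); // => [ 2, 3, 7, 8, 9 ]
mergeSort([]); // => []
```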
Unless we, the engineers, have access in advance to some unique, exploitable insight about our dataset, it turns out that O(n log n) time is the best we can do when sorting unknown datasets.
That means that Merge Sort is fast! It's way faster than Bubble Sort, Selection Sort, and Insertion Sort. However, due to its linear space complexity, we must always weigh the tradeoff between speed and memory consumption when making the choice to use Merge Sort. Consider the following:
Given `[7, 3, 8, 9, 2]` and a target of `5`, we know `[3, 2]` are the numbers less than `5` and `[7, 8, 9]` are the numbers greater than `5`.

In general, the strategy is to divide the input array into two subarrays: one with the smaller elements and one with the larger elements. Then we recursively operate on the two new subarrays, and continue this process until we reach subarrays of length 1 or smaller. As we have seen with Merge Sort, arrays of such length are automatically sorted (for obvious reasons).
The steps, when discussed on a high level, are simple:

1. choose an element of the array as the pivot
2. partition the remaining elements into a left subarray (elements less than the pivot) and a right subarray (elements greater than or equal to the pivot)
3. recursively sort the left and right subarrays
4. combine the results in the order "left, pivot, right"
Before we move forward, see if you can observe the behavior described above in the following animation:
Let's hone in on the first major point above. Formally, we want to partition elements of an array relative to a pivot value. That is, we want elements less than the pivot to be separated from elements that are greater than or equal to the pivot. Our goal is to create a function with this behavior:
```js
let arr = [7, 3, 8, 9, 2];
partition(arr, 5); // => [[3, 2], [7, 8, 9]]
```
Seems simple enough! Let's implement it in JavaScript:

```js
// nothing fancy
function partition(array, pivot) {
  let left = [];
  let right = [];

  array.forEach((el) => {
    if (el < pivot) {
      left.push(el);
    } else {
      right.push(el);
    }
  });

  return [left, right];
}
```
Here's a fancier implementation using `filter`:

```js
// fancy
function partition(array, pivot) {
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);
  return [left, right];
}
```
Both of the above implementations are correct, but we'll use the second one as it is cleaner. It's worth mentioning that the `partition` function will have a runtime of `O(n)`: `forEach` and `filter` both have linear, `O(n)`, time complexity. Although our fancy `partition` does filter twice, this is a constant we drop, `O(2n) = O(n)`. Linear time is fast, so we are quite happy with `partition`.
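One behavior worth calling out: because the comparison is `el >= pivot`, values equal to the pivot always land in the right partition. A self-contained check (the second sample input is my own):

```js
// the filter-based partition from above
function partition(array, pivot) {
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);
  return [left, right];
}

partition([7, 3, 8, 9, 2], 5); // => [ [ 3, 2 ], [ 7, 8, 9 ] ]
// the 4s in the input equal the pivot, so they go right
partition([4, 9, 1, 7, 4], 4); // => [ [ 1 ], [ 4, 9, 7, 4 ] ]
```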
We won't be using an explicit `partition` helper function in our `quickSort` implementation; however, we will borrow heavily from this pattern. As you design algorithms, it helps to think about key patterns in isolation, although your solution may not feature that exact helper. Some would say we like to divide and conquer :).
Let's begin structuring the recursion. The base case of any recursive problem is where the input is so trivial, we immediately know the answer without calculation. If our problem is to sort an array, what is the trivial array? An array of 1 or 0 elements! Let's establish the code:
```js
function quickSort(array) {
  if (array.length <= 1) {
    return array;
  }
}
```
If our base case pertains to an array of a very small size, then the design of our recursive case should make progress toward hitting this base scenario. In other words, we should recursively call `quickSort` on smaller and smaller arrays. This is very similar to our previous `mergeSort`, except we don't just split the array down the middle. Instead, we arbitrarily choose an element of the array as a pivot and partition the remaining elements relative to this pivot:
```js
function quickSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let pivot = array.shift();
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);

  //...some code here
}
```
Here is what to notice about the partition step above:

1. the pivot is an element of the array; we arbitrarily chose the first element
2. we removed the pivot from the master array before we filter into the left and right partitions
Now that we have the two subarrays of `left` and `right`, we have our subproblems! To solve these subproblems, we must sort the subarrays. I wish we had a function that sorts an array...oh wait, we do, `quickSort`! Recursively:
```js
function quickSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let pivot = array.shift();
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);

  let leftSorted = quickSort(left);
  let rightSorted = quickSort(right);

  //...some code here
}
```
Okay, so we have the two sorted partitions. This means we have the two subsolutions. But how do we put them together? Think about how we partitioned them in the first place: everything in `leftSorted` is guaranteed to be less than everything in `rightSorted`. On top of that, `pivot` should be placed after the last element in `leftSorted`, but before the first element in `rightSorted`. So all we need to do is combine the elements in the order "left, pivot, right"!
```js
function quickSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let pivot = array.shift();
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);

  let leftSorted = quickSort(left);
  let rightSorted = quickSort(right);

  return leftSorted.concat([pivot]).concat(rightSorted);
}
```
That last `concat` line is a bit clunky. Bonus JS lesson: we can use the spread `...` operator to elegantly concat arrays. In general:
```js
let one = ["a", "b"];
let two = ["d", "e", "f"];

let newArr = [...one, "c", ...two];
newArr; // => [ 'a', 'b', 'c', 'd', 'e', 'f' ]
```
Utilizing that spread pattern gives us this final implementation:
```js
function quickSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let pivot = array.shift();
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);

  let leftSorted = quickSort(left);
  let rightSorted = quickSort(right);

  return [...leftSorted, pivot, ...rightSorted];
}
```
I'd hire that programmer.
That code was so clean we should show it again. Here's the complete code for your reference, for when you ctrl+F "quicksort" the night before an interview:
```js
function quickSort(array) {
  if (array.length <= 1) {
    return array;
  }

  let pivot = array.shift();
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);

  let leftSorted = quickSort(left);
  let rightSorted = quickSort(right);

  return [...leftSorted, pivot, ...rightSorted];
}
```
The complexity analysis of this algorithm is easier to explain through visuals, so we highly encourage you to watch the lecture that accompanies this reading. In any case, here is a summary of the complexity. The runtime analysis of `quickSort` is more complex than that of `mergeSort`:

- `n` is the length of the input array
- the partition step alone is `O(n)`
- in the best case, each pivot splits the array into roughly equal halves, so we need `O(log(n))` recursive calls to reach the base case
- in the worst case (for example, when the pivot is always the smallest or largest remaining element), one partition is empty and we need `O(n)` recursive calls to reach the base case

Although we typically take the worst case when describing Big-O for an algorithm, much research on `quickSort` has shown the worst case to be an exceedingly rare occurrence even if we choose the pivot at random. Because of this, we still consider `quickSort` an efficient algorithm. This is a common interview talking point, so you should be familiar with the relationship between the choice of pivot and the efficiency of the algorithm.
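To make the pivot/efficiency relationship concrete, here is a small sketch (my own instrumentation, not from the reading) that reports how deep the recursion goes instead of sorting, using the same first-element pivot strategy:

```js
// returns how many levels deep the quickSort-style recursion goes
function quickSortDepth(array, depth = 0) {
  if (array.length <= 1) return depth;

  let pivot = array.shift();
  let left = array.filter((el) => el < pivot);
  let right = array.filter((el) => el >= pivot);

  // the deeper of the two sides determines the overall depth
  return Math.max(quickSortDepth(left, depth + 1), quickSortDepth(right, depth + 1));
}

quickSortDepth([4, 2, 6, 1, 3, 5, 7]); // => 2  (each pivot splits evenly: ~log(n) levels)
quickSortDepth([1, 2, 3, 4, 5, 6, 7]); // => 6  (already sorted: ~n levels)
```

An already-sorted input with a first-element pivot is exactly the degenerate case described above: every partition puts all of the remaining elements on one side.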
Just in case: a somewhat common question a student may ask when studying `quickSort` is, "If the median is the best pivot, why don't we always just choose the median when we partition?" Don't overthink this. To know the median of an array, it must be sorted in the first place.
Our implementation of `quickSort` uses `O(n)` space because of the partition arrays we create. There is an in-place version of `quickSort` that uses `O(log(n))` space. `O(log(n))` space is not a huge benefit over `O(n)`, and you'll also find our version of `quickSort` easier to remember and easier to implement. Just know that an `O(log(n))` space `quickSort` exists.
We've explored many ways to sort arrays so far, but why did we go through all of that trouble? By sorting elements of an array, we are organizing the data in a way that gives us a quick way to look up elements later on. For simplicity, we have been using arrays of numbers up until this point. However, these sorting concepts can be generalized to other data types. For example, it would be easy to modify our comparison-based sorting algorithms to sort strings: instead of leveraging facts like `0 < 1`, we can say `'A' < 'B'`.
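For instance, the merge step adapts to strings if we drop the `Infinity` sentinel (comparing a string against `Infinity` does not behave the way number comparisons do). This variant is my own sketch, not from the reading:

```js
function mergeAny(array1, array2) {
  let merged = [];

  // compare fronts only while both arrays are nonempty
  while (array1.length && array2.length) {
    if (array1[0] < array2[0]) {
      merged.push(array1.shift());
    } else {
      merged.push(array2.shift());
    }
  }

  // one array has emptied; append whatever remains of the other
  return merged.concat(array1, array2);
}

mergeAny(['ant', 'cat'], ['bat', 'dog']); // => [ 'ant', 'bat', 'cat', 'dog' ]
```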
Think of a dictionary. A dictionary contains alphabetically sorted words and their definitions. A dictionary is pretty much only useful if it is ordered in this way. Let's say you wanted to look up the definition of "stupendous." What steps might you take?
You are essentially using the `binarySearch` algorithm in the real world.
Formally, our `binarySearch` will seek to solve the following problem:
Given a sorted array of numbers and a target num, return a boolean indicating whether or not that target is contained in the array.
Programmatically, we want to satisfy the following behavior:
```js
binarySearch([5, 10, 12, 15, 20, 30, 70], 12); // => true
binarySearch([5, 10, 12, 15, 20, 30, 70], 24); // => false
```
Before we move on, really internalize the fact that `binarySearch` will only work on sorted arrays! Obviously we can search any array, sorted or unsorted, in `O(n)` time. But now our goal is to be able to search the array with a sub-linear time complexity (less than `O(n)`).
We'll implement binary search recursively. As always, we start with a base case that captures the scenario where the input array is so trivial that we know the answer without further calculation. If we are given an empty array and a target, we can be certain that the target is not inside of the array:
```js
function binarySearch(array, target) {
  if (array.length === 0) {
    return false;
  }

  //...some code here
}
```
Now for our recursive case. If we want to get a time complexity less than `O(n)`, we must avoid touching all `n` elements. Adopting our dictionary strategy, let's find the middle element and grab references to the left and right halves of the sorted array:
```js
function binarySearch(array, target) {
  if (array.length === 0) {
    return false;
  }

  let midIdx = Math.floor(array.length / 2);
  let leftHalf = array.slice(0, midIdx);
  let rightHalf = array.slice(midIdx + 1);

  //...some code here
}
```
It's worth pointing out that the left and right halves do not contain the middle element we chose.
Here is where we leverage the sorted property of the array. If the target is less than the middle, then the target must be in the left half of the array. If the target is greater than the middle, then the target must be in the right half of the array. So we can narrow our search to one of these halves and ignore the other. Luckily we have a function that can search a half: it's `binarySearch`:
```js
function binarySearch(array, target) {
  if (array.length === 0) {
    return false;
  }

  let midIdx = Math.floor(array.length / 2);
  let leftHalf = array.slice(0, midIdx);
  let rightHalf = array.slice(midIdx + 1);

  if (target < array[midIdx]) {
    return binarySearch(leftHalf, target);
  } else if (target > array[midIdx]) {
    return binarySearch(rightHalf, target);
  }

  //...some code here
}
```
We know binarySearch will return the correct boolean, so we just pass that result up by returning it ourselves. However, something is lacking in our code. It is only possible to get a false from the literal return false line, but there is no return true. Looking at our conditionals, we handle the cases where the target is less than the middle or greater than the middle, but what if the target is equal to the middle? If the target is equal to the middle, then we found the target and should return true! This is easy to add with an else:
function binarySearch(array, target) {
if (array.length === 0) {
return false;
}
let midIdx = Math.floor(array.length / 2);
let leftHalf = array.slice(0, midIdx);
let rightHalf = array.slice(midIdx + 1);
if (target < array[midIdx]) {
return binarySearch(leftHalf, target);
} else if (target > array[midIdx]) {
return binarySearch(rightHalf, target);
} else {
return true;
}
}
To wrap up, we can be confident that our base case will eventually be hit because we are continually halving the array. We halve the array until its length is 0 or we actually find the target.
Here is the code again for your quick reference:
function binarySearch(array, target) {
if (array.length === 0) {
return false;
}
let midIdx = Math.floor(array.length / 2);
let leftHalf = array.slice(0, midIdx);
let rightHalf = array.slice(midIdx + 1);
if (target < array[midIdx]) {
return binarySearch(leftHalf, target);
} else if (target > array[midIdx]) {
return binarySearch(rightHalf, target);
} else {
return true;
}
}
The complexity analysis of this algorithm is easier to explain through visuals, so we highly encourage you to watch the lecture that accompanies this reading. In any case, here is a summary of the complexity:
Time Complexity: O(log(n)), where n is the length of the input array. Each recursive call halves the array; for example, when n = 8 the array sizes shrink 8 -> 4 -> 2 -> 1, and log(8) = 3 halvings.
Our implementation uses O(n) space due to the half arrays we create using slice. Note that JavaScript's slice creates a new array, so it requires additional memory to be allocated.
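The O(n) space cost comes entirely from slice. As a sketch of a common refinement (not part of the reading above), we can track the search window with two index pointers instead of creating subarrays, keeping the O(log(n)) runtime while using only O(1) space:

```javascript
// Iterative binary search using index pointers instead of slicing.
// Same O(log(n)) time, but O(1) space since no subarrays are created.
function binarySearchIterative(array, target) {
  let lo = 0;
  let hi = array.length - 1;
  while (lo <= hi) {
    const midIdx = Math.floor((lo + hi) / 2);
    if (target < array[midIdx]) {
      hi = midIdx - 1;  // narrow the window to the left half
    } else if (target > array[midIdx]) {
      lo = midIdx + 1;  // narrow the window to the right half
    } else {
      return true;      // found the target
    }
  }
  return false;         // window closed without finding the target
}

binarySearchIterative([5, 10, 12, 15, 20, 30, 70], 12); // => true
binarySearchIterative([5, 10, 12, 15, 20, 30, 70], 24); // => false
```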
Use this algorithm when the input data is sorted!!! This is a heavy requirement, but if you have it, you'll have an insanely fast algorithm.
Dynamic Programming is a design pattern used to solve a large problem by dividing it into smaller subproblems that are more manageable. Dynamic Programming will solve the subproblems efficiently, meaning that we avoid duplicate calculations and only "solve" each subproblem once by storing subsolutions in some additional data structure. We cannot always apply Dynamic Programming to a problem. Problems that can be solved with Dynamic Programming must have a sense of repetitive subproblems that overlap.
Here's an example of a problem that has such a structure:
// Using pennies, nickels, dimes, and quarters,
// what is the smallest combination of coins that
// total 27 cents?
We'll explore this exact problem in depth later on. For now, here is some food for thought. Along the way to calculating the smallest coin combination of 27 cents, we should also calculate the smallest coin combination of say, 25 cents as a component of that problem. This is the essence of an overlapping subproblem structure.
There are two strategies we can use to implement Dynamic Programming: Memoization and Tabulation. Let's explore Memoization first!
Let's first implement Dynamic Programming through memoization. In particular, we'll apply the memoization technique to recursive code. The underlying idea of memoization is this: every time we call a function with a particular argument, we expect to get the same result every time. Memoization allows us to store the result of a function in an object so it can be recalled later on. There are two features that comprise Memoization:
Let's begin by memoizing our previous factorial
recursive function. As it is, our factorial
is pretty fast with a O(n)
runtime. This is because we simply subtract 1
from n
for every recursive call until n
reaches 0
. This is feasibly the fastest we could ever do, but we'll memoize it nonetheless to explore the mechanics of memoization:
function factorial(n) {
if (n === 1) return 1;
return n * factorial(n - 1);
}
factorial(6); // => 720, requires 6 calls
factorial(6); // => 720, requires 6 calls
factorial(5); // => 120, requires 5 calls
factorial(7); // => 5040, requires 7 calls
From our plain factorial
above, it is clear that every time we call factorial(6)
we should get the same result of 720
each time. The code is somewhat inefficient because we must go down the full recursive stack for each top level call to factorial(6)
. It would be great if we could store the result of factorial(6)
the first time we calculate it, then on subsequent calls to factorial(6)
we simply fetch the stored result in constant time. We can accomplish exactly this by memoizing with an object! We'll refactor the code later, but for now:
let memo = {};
function factorial(n) {
// if we have calculated factorial(n) previously, fetch the stored result in memo
if (n in memo) return memo[n];
if (n === 1) return 1;
// otherwise, we have not calculated factorial(n) previously, so calculate it now,
// but store the result in case we need it again in the future
memo[n] = n * factorial(n - 1);
return memo[n];
}
factorial(6); // => 720, requires 6 calls
factorial(6); // => 720, requires 1 call
factorial(5); // => 120, requires 1 call
factorial(7); // => 5040, requires 2 calls
memo; // => { '2': 2, '3': 6, '4': 24, '5': 120, '6': 720, '7': 5040 }
The memo object above will map an argument of factorial to its return value. That is, the keys will be arguments and their values will be the corresponding results returned. By using the memo, we are able to avoid duplicate recursive calls! Here's some food for thought: by the time our first call to factorial(6) returns, we will not have just the arg 6 stored in the memo. Rather, we will have all args 2 to 6 stored in the memo.
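To see those call counts for ourselves, here is a sketch that instruments the memoized factorial with a simple counter (the `calls` variable is our own addition, for illustration only):

```javascript
let memo = {};
let calls = 0; // instrumentation: counts how many times factorial is invoked

function factorial(n) {
  calls++;
  if (n in memo) return memo[n];
  if (n === 1) return 1;
  memo[n] = n * factorial(n - 1);
  return memo[n];
}

factorial(6); // => 720; `calls` is now 6
calls = 0;
factorial(6); // => 720; `calls` is now 1 (memo hit)
calls = 0;
factorial(7); // => 5040; `calls` is now 2 (one call for 7, then a memo hit on 6)
```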
Hopefully you sense the efficiency we can get by memoizing our functions, but maybe you are not convinced by the last example for two reasons:
Both of those points are true, so let's take a look at a more advanced but also practical example that benefits from memoization.
Let's begin with our previous fib
implementation that calculates the n-th number in the fibonacci sequence:
function fib(n) {
if (n === 1 || n === 2) return 1;
return fib(n - 1) + fib(n - 2);
}
fib(6); // => 8
Before we optimize this, let's ask what complexity class it falls into in the first place. Hmm, the time complexity of this function is not super intuitive to describe because the code branches twice recursively. Fret not! You'll find it useful to visualize a tree when reasoning about the time complexity for recursive functions. Every node of the tree represents a call of the recursion:
In general, the height of this tree will be n. We derive this by following the path going straight down the left side of the tree. We can also see that each internal node leads to two more nodes. Overall, this means our tree will have roughly 2^n nodes, which is the same as saying our fib function has an exponential time complexity of O(2^n). That is very slow! See for yourself: try running fib(50) - you'll be waiting for quite a while (it took 3 minutes on our machine).
Okay. So our fib is slow, but is there any way to speed it up? Take a look at the tree above. Can you find any repetitive regions of the tree? We'll highlight a few:
As the n
grows bigger, the number of duplicate subtrees grows exponentially. Luckily we can fix this using memoization. We'll use a similar object strategy as before, but we'll indulge in some JavaScript default arguments to clean things up:
function fastFib(n, memo = {}) {
if (n in memo) return memo[n];
if (n === 1 || n === 2) return 1;
memo[n] = fastFib(n - 1, memo) + fastFib(n - 2, memo);
return memo[n];
}
fastFib(6); // => 8
fastFib(50); // => 12586269025
The code above can calculate the 50th fibonacci number almost instantly! Thanks to our memo, we only need to explore a subtree fully once. Visually, our fastFib
recursion has this structure:
We marked nodes (function calls) that access the memo in green. It's easy to see that we will do far less computations as n
grows larger! In fact, we have brought the time complexity down to linear O(n)
time because the tree only branches on the left side. This is an enormous gain if you recall the complexity class hierarchy.
Now that we have memoization under our belts, when should we apply it? Memoization is useful when attacking recursive problems that have many overlapping subproblems. You'll find it most useful to draw out the visual tree first. If you notice duplicate subtrees, time to memoize. Here are the hard and fast rules you can use to memoize a slow function:
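One way to apply those rules generically is to package the memo into a higher-order `memoize` helper. This is our own sketch (not from the reading above), and it only handles functions of a single primitive argument:

```javascript
// Wraps a one-argument function so that repeated calls with the same
// argument fetch the stored result instead of recomputing it.
function memoize(fn) {
  const memo = {};
  return function memoized(arg) {
    if (arg in memo) return memo[arg];
    memo[arg] = fn(arg);
    return memo[arg];
  };
}

// Note: for the recursive speedup to kick in, the recursion itself must
// call the memoized wrapper (`fib`), not the plain inner function.
const fib = memoize(function (n) {
  if (n === 1 || n === 2) return 1;
  return fib(n - 1) + fib(n - 2);
});

fib(6);  // => 8
fib(50); // => 12586269025, almost instantly
```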
Now that we are familiar with Memoization, let's explore another Dynamic Programming strategy: Tabulation. In case you forgot, Memoization and Tabulation are two distinct Dynamic Programming strategies. That being said, our goal for Tabulation is still to solve large problems efficiently by breaking them down into many subproblems. There are two main features that comprise the Tabulation strategy:
Many problems that can be solved with Memoization can also be solved with Tabulation, so let's begin by attacking a familiar problem with a fresh set of eyes. Don't worry, we'll also work on some brand new problems in the upcoming project.
Tabulation is all about creating a table (array) and filling it out with elements. In general, we will complete the table by filling entries from left to right. This means that the first entry of our table (first element of the array) will correspond to the smallest subproblem. Naturally, the final entry of our table (last element of the array) will correspond to the largest problem, which is also our final answer.
Let's tabulate fib
. As always, we want fib(n)
to return the n-th number of the Fibonacci sequence:
// fib(0); // => 0
// fib(1); // => 1
// fib(2); // => 1
// fib(6); // => 8
// fib(7); // => 13
Let's jump straight into the code:
function tabulatedFib(n) {
// create a blank array with `n + 1` slots (indices 0 through n)
let table = new Array(n + 1);
// seed the first two values
table[0] = 0;
table[1] = 1;
// complete the table by moving from left to right,
// following the fibonacci pattern
for (let i = 2; i <= n; i++) {
table[i] = table[i - 1] + table[i - 2];
}
return table[n];
}
console.log(tabulatedFib(7)); // => 13
Visually, we initialized the table with the following structure:
i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
table[i] | 0 | 1 | | | | | | |
After our loop finishes, the final table will be:
i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|---|
table[i] | 0 | 1 | 1 | 2 | 3 | 5 | 8 | 13 |
Similar to our previous memo
, by the time our function completes, our table
will contain our final solution as well as all subsolutions calculated along the way.
The analysis of our tabulatedFib
is very straightforward, since the code is iterative. The dominant operation in the function is the loop used to fill out the entire table. The length of the table is roughly n
elements long, so our algorithm will have an O(n) runtime. The space taken by our algorithm is also O(n) due to the size of the table. Overall, we should be satisfied with the efficiency of our algorithm.
You may notice that we can cut down on the space used by our function. At any point of our loop we really only need the previous two subproblems, so there is little utility to storing the full array. This refactor is easy to do by using two variables:
function fib(n) {
if (n === 0) return 0;
if (n === 1) return 1;
let secondLast = 0;
let last = 1;
for (let i = 2; i <= n; i++) {
let temp = last;
last = last + secondLast;
secondLast = temp;
}
return last;
}
Bam! We now have O(n) runtime and O(1) space. This is the optimal algorithm for calculating fib. Note that this strategy is not quite Tabulation, since there is no table array being used. However, it still falls under the overarching category of Dynamic Programming since we saved previous subproblem results in order to calculate the final answer. There's no fancy name for this strategy; it's just amazing.
Here are our general guidelines for implementing a Tabulation strategy. Bear in mind that Dynamic Programming (whether it be by Tabulation or Memoization) is only applicable to problems that can be divided into many subproblems of similar structure. This is just a general recipe so adjust for taste depending on your problem:
A Linked List data structure represents a linear sequence of "vertices" (or "nodes"), and tracks three important properties.
Linked List Properties:
Property | Description |
---|---|
head | The first node in the list. |
tail | The last node in the list. |
length | The number of nodes in the list; the list's length. |
The data being tracked by a particular Linked List does not live inside the Linked List instance itself. Instead, each vertex is actually an instance of an even simpler, smaller data structure, often referred to as a "Node".
Depending on the type of Linked List (there are many), Node instances track some very important properties as well.
Linked List Node Properties:
Property | Description |
---|---|
value | The actual value this node represents. |
next | The next node in the list (relative to this node). |
previous | The previous node in the list (relative to this node). |
NOTE: The previous
property is for Doubly Linked Lists only!
Linked Lists contain ordered data, just like arrays. The first node in the list is, indeed, first. From the perspective of the very first node in the list, the next node is the second node. From the perspective of the second node in the list, the previous node is the first node, and the next node is the third node. And so it goes.
Admittedly, this does sound a lot like an Array so far, and that's because Arrays and Linked Lists are both implementations of the List ADT. However, there is an incredibly important distinction to be made between Arrays and Linked Lists, and that is how they physically store their data. (As opposed to how they represent the order of their data.)
Recall that Arrays contain contiguous data. Each element of an array is actually stored next to its neighboring element in the actual hardware of your machine, in a single continuous block in memory.
An Array's contiguous data being stored in a continuous block of addresses in memory.
Unlike Arrays, Linked Lists contain non-contiguous data. Though Linked Lists represent data that is ordered linearly, that mental model is just that - an interpretation of the representation of information, not reality.
In reality, in the actual hardware of your machine, whether it be in disk or in memory, a Linked List's Nodes are not stored in a single continuous block of addresses. Rather, Linked List Nodes live at randomly distributed addresses throughout your machine! The only reason we know which node comes next in the list is because we've assigned its reference to the current node's next
pointer.
A Singly Linked List's non-contiguous data (Nodes) being stored at randomly distributed addresses in memory.
For this reason, Linked List Nodes have no indices, and no random access. Without random access, we do not have the ability to look up an individual Linked List Node in constant time. Instead, to find a particular Node, we have to start at the very first Node and iterate through the Linked List one node at a time, checking each Node's next Node until we find the one we're interested in.
So when implementing a Linked List, we actually must implement both the Linked List class and the Node class. Since the actual data lives in the Nodes, it's simpler to implement the Node class first.
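To make that concrete, here is a minimal sketch of the two classes with just a single addToTail method (the full method list comes in the project that follows):

```javascript
// A Singly Linked List node: just a value and a pointer to the next node.
class Node {
  constructor(value) {
    this.value = value;
    this.next = null;
  }
}

class LinkedList {
  constructor() {
    this.head = null;
    this.tail = null;
    this.length = 0;
  }

  addToTail(value) {
    const newNode = new Node(value);
    if (!this.head) {
      // empty list: the new node is both the head and the tail
      this.head = newNode;
      this.tail = newNode;
    } else {
      // link the old tail to the new node, then move the tail pointer
      this.tail.next = newNode;
      this.tail = newNode;
    }
    this.length++;
    return this;
  }
}

const list = new LinkedList();
list.addToTail('a').addToTail('b').addToTail('c');
list.head.value;      // => 'a'
list.head.next.value; // => 'b'
list.tail.value;      // => 'c'
```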
There are four flavors of Linked List you should be familiar with when walking into your job interviews.
Linked List Types:
List Type | Description | Directionality |
---|---|---|
Singly Linked | Nodes have a single pointer connecting them in a single direction. | Head→Tail |
Doubly Linked | Nodes have two pointers connecting them bi-directionally. | Head⇄Tail |
Multiply Linked | Nodes have two or more pointers, providing a variety of potential node orderings. | Head⇄Tail, A→Z, Jan→Dec, etc. |
Circularly Linked | Final node's next pointer points to the first node, creating a non-linear, circular version of a Linked List. | Head→Tail→Head→Tail |
NOTE: These Linked List types are not always mutually exclusive.
For instance:
You are most likely to encounter Singly and Doubly Linked Lists in your upcoming job search, so we are going to focus exclusively on those two moving forward. However, in more senior level interviews, it is very valuable to have some familiarity with the other types of Linked Lists. Though you may not actually code them out, you will win extra points by illustrating your ability to weigh the tradeoffs of your technical decisions by discussing how your choice of Linked List type may affect the efficiency of the solutions you propose.
Linked Lists are great foundation builders when learning about data structures because they share a number of similar methods (and edge cases) with many other common data structures. You will find that many of the concepts discussed here will repeat themselves as we dive into some of the more complex non-linear data structures later on, like Trees and Graphs.
In the project that follows, we will implement the following Linked List methods:
Type | Name | Description | Returns |
---|---|---|---|
Insertion | addToTail | Adds a new node to the tail of the Linked List. | Updated Linked List |
Insertion | addToHead | Adds a new node to the head of the Linked List. | Updated Linked List |
Insertion | insertAt | Inserts a new node at the "index", or position, specified. | Boolean |
Deletion | removeTail | Removes the node at the tail of the Linked List. | Removed node |
Deletion | removeHead | Removes the node at the head of the Linked List. | Removed node |
Deletion | removeFrom | Removes the node at the "index", or position, specified. | Removed node |
Search | contains | Searches the Linked List for a node with the value specified. | Boolean |
Access | get | Gets the node at the "index", or position, specified. | Node at index |
Access | set | Updates the value of a node at the "index", or position, specified. | Boolean |
Meta | size | Returns the current size of the Linked List. | Integer |
Before we begin our analysis, here is a quick summary of the Time and Space constraints of each Linked List Operation. The complexities below apply to both Singly and Doubly Linked Lists:
Data Structure Operation | Time Complexity (Avg) | Time Complexity (Worst) | Space Complexity (Worst) |
---|---|---|---|
Access | Θ(n) | O(n) | O(n) |
Search | Θ(n) | O(n) | O(n) |
Insertion | Θ(1) | O(1) | O(n) |
Deletion | Θ(1) | O(1) | O(n) |
Before moving forward, see if you can reason to yourself why each operation has the time and space complexity listed above!
Unlike Arrays, Linked Lists Nodes are not stored contiguously in memory, and thereby do not have an indexed set of memory addresses at which we can quickly lookup individual nodes in constant time. Instead, we must begin at the head of the list (or possibly at the tail, if we have a Doubly Linked List), and iterate through the list until we arrive at the node of interest.
In Scenario 1, we'll know we're there because we've iterated 8 times. In Scenario 2, we'll know we're there because, while iterating, we've checked each node's value and found one that matches our target value, "Q".
In the worst case scenario, we may have to traverse the entire Linked List until we arrive at the final node. This makes both Access & Search Linear Time operations.
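The Scenario 2 search described above can be sketched as a simple traversal (plain object literals stand in for Node instances here):

```javascript
// Linear-time search: start at the head and follow `next` pointers
// until we find the target value or run off the end of the list.
function contains(head, target) {
  let current = head;
  while (current !== null) {
    if (current.value === target) return true;
    current = current.next;
  }
  return false;
}

// build a tiny list by hand: 'A' -> 'Q' -> 'Z'
const z = { value: 'Z', next: null };
const q = { value: 'Q', next: z };
const a = { value: 'A', next: q };

contains(a, 'Q'); // => true
contains(a, 'X'); // => false
```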
Since we have our Linked List Nodes stored in a non-contiguous manner that relies on pointers to keep track of where the next and previous nodes live, Linked Lists liberate us from the linear time nature of Array insertions and deletions. We no longer have to adjust the position at which each node/element is stored after making an insertion at a particular position in the list. Instead, if we want to insert a new node at position i, we can simply:

1. Set the new node's next and previous pointers to the nodes that live at positions i and i - 1, respectively.
2. Update the next pointer of the node that lives at position i - 1 to point to the new node.
3. Update the previous pointer of the node that lives at position i to point to the new node.

And we're done, in Constant Time. No iterating across the entire list necessary.
"But hold on one second, " you may be thinking. "In order to insert a new node in the middle of the list, don't we have to lookup its position? Doesn't that take linear time?!"
Yes, it is tempting to call insertion or deletion in the middle of a Linked List a linear time operation since there is lookup involved. However, it's usually the case that you'll already have a reference to the node where your desired insertion or deletion will occur.
For this reason, we separate the Access time complexity from the Insertion/Deletion time complexity, and formally state that Insertion and Deletion in a Linked List are Constant Time across the board.
Without a reference to the node at which an insertion or deletion will occur, due to linear time lookup, an insertion or deletion in the middle of a Linked List will still take Linear Time, sum total.
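Assuming we already hold a reference to the node at position i - 1, the pointer updates described above can be sketched like this (plain objects stand in for Doubly Linked List nodes):

```javascript
// Constant-time insertion into a Doubly Linked List, given a direct
// reference to the node the new node should follow.
function insertAfter(prevNode, value) {
  const newNode = { value, next: prevNode.next, previous: prevNode };
  if (prevNode.next !== null) {
    prevNode.next.previous = newNode; // old neighbor points back at the new node
  }
  prevNode.next = newNode;            // previous node points forward at the new node
  return newNode;
}

// 'A' <-> 'C' becomes 'A' <-> 'B' <-> 'C', with no iteration at all
const a = { value: 'A', next: null, previous: null };
const c = { value: 'C', next: null, previous: a };
a.next = c;
insertAfter(a, 'B');
a.next.value;     // => 'B'
c.previous.value; // => 'B'
```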
It's obvious that Linked Lists have one node for every one item in the list, and for that reason we know that Linked Lists take up Linear Space in memory. However, when asked in an interview setting what the Space Complexity of your solution to a problem is, it's important to recognize the difference between the two scenarios above.
In Scenario 1, we are not creating a new Linked List. We simply need to operate on the one given. Since we are not storing a new node for every node represented in the Linked List we are provided, our solution is not necessarily linear in space.
In Scenario 2, we are creating a new Linked List. If the number of nodes we create is linearly correlated to the size of our input data, we are now operating in Linear Space.
Linked Lists can be traversed both iteratively and recursively. If you choose to traverse a Linked List recursively, there will be a recursive function call added to the call stack for every node in the Linked List. Even if you're provided the Linked List, as in Scenario 1, you will still use Linear Space in the call stack, and that counts.
Stacks and Queues aren't really "data structures" by the strict definition of the term. The more appropriate terminology would be to call them abstract data types (ADTs), meaning that their definitions are more conceptual and related to the rules governing their user-facing behaviors rather than their core implementations.
For the sake of simplicity, we'll refer to them as data structures and ADTs interchangeably throughout the course, but the distinction is an important one to be familiar with as you level up as an engineer.
Now that that's out of the way, Stacks and Queues represent a linear collection of nodes or values. In this way, they are quite similar to the Linked List data structure we discussed in the previous section. In fact, you can even use a modified version of a Linked List to implement each of them. (Hint, hint.)
These two ADTs are similar to each other as well, but each obey their own special rule regarding the order with which Nodes can be added and removed from the structure.
Since we've covered Linked Lists in great length, these two data structures will be quick and easy. Let's break them down individually in the next couple of sections.
Stacks are a Last In First Out (LIFO) data structure. The last Node added to a stack is always the first Node to be removed, and as a result, the first Node added is always the last Node removed.
The name Stack actually comes from this characteristic, as it is helpful to visualize the data structure as a vertical stack of items. Personally, I like to think of a Stack as a stack of plates, or a stack of sheets of paper. This seems to make them more approachable, because the analogy relates to something in our everyday lives.
If you can imagine adding items to, or removing items from, a Stack of...literally anything...you'll realize that every (sane) person naturally obeys the LIFO rule.
We add things to the top of a stack. We remove things from the top of a stack. We never add things to, or remove things from, the bottom of the stack. That's just crazy.
Note: We can use JavaScript Arrays to implement a basic stack. Array#push
adds to the top of the stack and Array#pop
will remove from the top of the stack. In the exercise that follows, we'll build our own Stack class from scratch (without using any arrays). In an interview setting, your evaluator may be okay with you using an array as a stack.
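That array-backed approach looks like this in practice:

```javascript
// Using a plain JavaScript array as a stack: Array#push adds to the
// top (the end of the array) and Array#pop removes from the top.
const stack = [];
stack.push('a');
stack.push('b');
stack.push('c');
stack.pop(); // => 'c' (last in, first out)
stack.pop(); // => 'b'
stack;       // => ['a']
```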
Queues are a First In First Out (FIFO) data structure. The first Node added to the queue is always the first Node to be removed.
The name Queue comes from this characteristic, as it is helpful to visualize this data structure as a horizontal line of items with a beginning and an end. Personally, I like to think of a Queue as the line one waits on for an amusement park, at a grocery store checkout, or to see the teller at a bank.
If you can imagine a queue of humans waiting...again, for literally anything...you'll realize that most people (the civil ones) naturally obey the FIFO rule.
People add themselves to the back of a queue, wait their turn in line, and make their way toward the front. People exit from the front of a queue, but only when they have made their way to being first in line.
We never add ourselves to the front of a queue (unless there is no one else in line), otherwise we would be "cutting" the line, and other humans don't seem to appreciate that.
Note: We can use JavaScript Arrays to implement a basic queue. Array#push
adds to the back (enqueue) and Array#shift
will remove from the front (dequeue). In the exercise that follows, we'll build our own Queue class from scratch (without using any arrays). In an interview setting, your evaluator may be okay with you using an array as a queue.
Stacks and Queues are so similar in composition that we can discuss their properties together. They track the following three properties:
Stack Properties | Queue Properties:
Stack Property | Description | Queue Property | Description |
---|---|---|---|
top | The first node in the Stack. | front | The first node in the Queue. |
bottom | The last node in the Stack. (Optional) | back | The last node in the Queue. |
length | The number of nodes in the Stack; the Stack's length. | length | The number of nodes in the Queue; the Queue's length. |
Notice that rather than having a head and a tail like Linked Lists, Stacks have a top and a bottom, and Queues have a front and a back instead. These properties are essentially the same: pointers to the end points of the respective List ADT where important actions may take place. The differences in naming conventions are strictly for human comprehension.
Similarly to Linked Lists, the values stored inside a Stack or a Queue are actually contained within Stack Node and Queue Node instances. Stack, Queue, and Singly Linked List Nodes are all identical, but just as a reminder and for the sake of completion, these List Nodes track the following two properties:
Stack & Queue Node Properties:
Property | Description |
---|---|
value | The actual value this node represents. |
next | The next node in the Stack (relative to this node). |
In the exercise that follows, we will implement a Stack data structure along with the following Stack methods:
Type | Name | Description | Returns |
---|---|---|---|
Insertion | push | Adds a Node to the top of the Stack. | Integer - New size of stack |
Deletion | pop | Removes a Node from the top of the Stack. | Node removed from top of Stack |
Meta | size | Returns the current size of the Stack. | Integer |
The following code is the preferred implementation of a Stack
ADT:
class Node {
constructor(val) {
this.value = val;
this.next = null;
}
}
class Stack {
constructor() {
this.top = null;
this.bottom = null;
this.length = 0;
}
push(val) {
const newNode = new Node(val);
if (!this.top) {
this.top = newNode;
this.bottom = newNode;
} else {
const temp = this.top;
this.top = newNode;
this.top.next = temp;
}
return ++this.length;
}
pop() {
if (!this.top) {
return null;
}
const temp = this.top;
if (this.top === this.bottom) {
this.bottom = null;
}
this.top = this.top.next;
this.length--;
return temp.value;
}
size() {
return this.length;
}
}
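To see the LIFO behavior of this implementation in action, here is a quick usage demo (the classes are repeated in condensed form so the snippet stands on its own):

```javascript
// (Node and Stack condensed from the implementation above.)
class Node {
  constructor(val) { this.value = val; this.next = null; }
}
class Stack {
  constructor() { this.top = null; this.bottom = null; this.length = 0; }
  push(val) {
    const newNode = new Node(val);
    if (!this.top) { this.top = newNode; this.bottom = newNode; }
    else { newNode.next = this.top; this.top = newNode; }
    return ++this.length;
  }
  pop() {
    if (!this.top) return null;
    const temp = this.top;
    if (this.top === this.bottom) this.bottom = null;
    this.top = this.top.next;
    this.length--;
    return temp.value;
  }
  size() { return this.length; }
}

const stack = new Stack();
stack.push(1); // => 1 (new size)
stack.push(2); // => 2
stack.push(3); // => 3
stack.pop();   // => 3 (LIFO: the last value pushed is the first popped)
stack.pop();   // => 2
stack.size();  // => 1
```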
In the exercise that follows, we will implement a Queue data structure along with the following Queue methods:
Type | Name | Description | Returns |
---|---|---|---|
Insertion | enqueue | Adds a Node to the back of the Queue. | Integer - New size of Queue |
Deletion | dequeue | Removes a Node from the front of the Queue. | Node removed from front of Queue |
Meta | size | Returns the current size of the Queue. | Integer |
The following code is the preferred implementation of a Queue
ADT:
class Node {
constructor(val) {
this.value = val;
this.next = null;
}
}
class Queue {
constructor() {
this.front = null;
this.back = null;
this.length = 0;
}
enqueue(val) {
const newNode = new Node(val);
if (!this.front) {
this.front = newNode;
this.back = newNode;
} else {
this.back.next = newNode;
this.back = newNode;
}
return ++this.length;
}
dequeue() {
if (!this.front) {
return null;
}
const temp = this.front;
if (this.front === this.back) {
this.back = null;
}
this.front = this.front.next;
this.length--;
return temp.value;
}
size() {
return this.length;
}
}
Before we begin our analysis, here is a quick summary of the Time and Space constraints of each Stack Operation.
Data Structure Operation | Time Complexity (Avg) | Time Complexity (Worst) | Space Complexity (Worst) |
---|---|---|---|
Access | Θ(n) | O(n) | O(n) |
Search | Θ(n) | O(n) | O(n) |
Insertion | Θ(1) | O(1) | O(n) |
Deletion | Θ(1) | O(1) | O(n) |
Before moving forward, see if you can reason to yourself why each operation has the time and space complexity listed above!
When the Stack ADT was first conceived, its inventor definitely did not prioritize searching and accessing individual Nodes or values in the list. The same idea applies for the Queue ADT. There are certainly better data structures for speedy search and lookup, and if these operations are a priority for your use case, it would be best to choose something else!
Search and Access are both linear time operations for Stacks and Queues, and that shouldn't be surprising. Both ADTs are nearly identical to Linked Lists in this way. The only way to find a Node somewhere in the middle of a Stack or a Queue is to start at the top (or the back) and traverse downward (or forward) toward the bottom (or front) one node at a time via each Node's next property.
This is a linear time operation, O(n).
For Stacks and Queues, insertion and deletion is what it's all about. If there is one feature a Stack absolutely must have, it's constant time insertion and removal to and from the top of the Stack (LIFO). The same applies for Queues, but with insertion occurring at the back and removal occurring at the front (FIFO).
Think about it. When you add a plate to the top of a stack of plates, do you have to iterate through all of the other plates first to do so? Of course not. You simply add your plate to the top of the stack, and that's that. The concept is the same for removal.
Therefore, Stacks and Queues have constant time Insertion and Deletion via their push
and pop
or enqueue
and dequeue
methods, O(1).
The space complexity of Stacks and Queues is very simple. Whether we are instantiating a new instance of a Stack or Queue to store a set of data, or we are using a Stack or Queue as part of a strategy to solve some problem, Stacks and Queues always store one Node for each value they receive as input.
For this reason, we always consider Stacks and Queues to have a linear space complexity, O(n).
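To make the constant-time push and pop concrete, here is a minimal sketch of an array-backed Stack (the class below is our own illustration, not code from this guide):

```javascript
// A minimal array-backed Stack (illustrative sketch).
class Stack {
  constructor() {
    this.data = [];
  }
  // O(1): add a value to the top of the stack
  push(val) {
    this.data.push(val);
  }
  // O(1): remove and return the value at the top of the stack
  pop() {
    return this.data.pop();
  }
}

let stack = new Stack();
stack.push('a');
stack.push('b');
console.log(stack.pop()); // 'b', because the last value in is the first out
```

A Queue mirrors this shape, with `enqueue` adding at the back and `dequeue` removing from the front; a linked list with head and tail pointers keeps both of those operations O(1).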
At this point, we've done a lot of work understanding the ins and outs of Stacks and Queues, but we still haven't really discussed what we can use them for. The answer is actually...a lot!
For one, Stacks and Queues can be used as intermediate data structures while implementing some of the more complicated data structures and methods we'll see in some of our upcoming sections.
For example, the implementation of the breadth-first Tree traversal algorithm takes advantage of a Queue instance, and the depth-first Graph traversal algorithm exploits the benefits of a Stack instance.
Additionally, Stacks and Queues serve as the essential underlying data structures to a wide variety of applications you use all the time. Just to name a few:
- The Undo feature in your text editor: each edit event is `push`ed onto a Stack. When you hit undo, the most recent event is `pop`ed off the Stack, because the last event that occurred should be the first one to be undone (LIFO). On redo, the event is `push`ed back onto the Stack.

Binary Trees are perhaps the most pervasive data structure in computer science. Let's take a moment to go over the basic characteristics of a Binary Tree before we explore algorithms that utilize this structure.
Before we define what a Tree is, we must first understand the definition of a Graph. A graph is a collection of nodes and any edges between those nodes. You've likely seen depictions of graphs before; they usually exist as circles (nodes) and arrows (edges) between those circles. Below are a few examples of graphs:
For now, you can ignore the blue coloring. Notice how the graphs above vary greatly in their structure. A graph is indeed a very broad, overarching category. In fact, linked lists and trees are both considered subclasses of graphs. We'll cover algorithms that operate on a general graph structure later, but for now we want to focus on what graphs are trees and what graphs are not. It's worth mentioning that a single node with no edges (image 1) is considered a graph. The empty graph (a graph with 0 nodes and 0 edges, not pictured :)) is also still a graph. This line of thinking will help us later when we design graph algorithms.
A Tree is a Graph that does not contain any cycles. A cycle is defined as a path through edges that begins and ends at the same node. This seems straightforward, but the definition becomes a bit muddled as Mathematicians and Computer Scientists use the term "tree" in slightly different ways. Let's break it down:
Well, at least both camps agree that graph 5 is most certainly not a tree! This is because of the obvious cycle that spans all three nodes. However, why is there disagreement over graph 4? The reason is this: In computer science, we use the term "tree" to really refer to a "rooted tree." A "rooted tree" is a "tree" where there exists a special node from which every other node is accessible; we call this special node the "root". Think of the root as the ultimate ancestor, the single node that all other nodes inherit from. Above we have colored all roots in blue. Like you'd probably suspect, in this course we'll subscribe to the Computer Scientist's interpretation. That is, we won't consider graph 4 a tree because there is no such root we can label.
You've probably heard the term "root" throughout your software engineering career: root directory, root user, etc.. All of these concepts branch† from the humble tree data structure!
A Binary Tree is a Tree where nodes have at most 2 children. This means graphs 1, 2, and 3 are all Binary Trees. There exist ternary trees (at most 3 children) and n-ary trees (at most n children), but you'll likely encounter binary trees in your job hunt, so we'll focus on them in this course. Based on our final definition for a binary tree, here is some food for thought:
Take a moment to use the definitions we explored to verify that each of the three statements above is true. We bring up these three scenarios in particular because they are the simplest types of Binary Trees. We want to eventually build elegant algorithms and these simple scenarios will fuel our design.
Let's explore a common way to represent binary trees using some object oriented design. A tree is a collection of nodes, so let's implement a `TreeNode` class. We'll use properties of `left` and `right` to reference the children of a `TreeNode`. That is, `left` and `right` will reference other `TreeNode`s:
class TreeNode {
constructor(val) {
this.val = val;
this.left = null;
this.right = null;
}
}
Constructing a tree is a matter of creating the nodes and setting `left` and `right` however we please. For example:
let a = new TreeNode('a');
let b = new TreeNode('b');
let c = new TreeNode('c');
let d = new TreeNode('d');
let e = new TreeNode('e');
let f = new TreeNode('f');
a.left = b;
a.right = c;
b.left = d;
b.right = e;
c.right = f;
The visual representation of the tree is:
To simplify our diagrams, we'll omit the arrowheads on the edges. Moving forward you can assume that the top node is the root and the direction of edges points downward. In other words, node A is the Root. Node A can access node B through `a.left`, but Node B cannot access Node A.
We now have a data structure we can use to explore Binary Tree algorithms! Creating a tree in this way may be tedious and repetitive; however, it allows us to decide exactly what nodes are connected and in what direction. This will be useful as we account for edge cases in our design.
Now that we have the basic definition of a binary tree, let's begin with three short algorithms that print out the values. The algorithms are structurally the same, however they will differ in what order the values are printed. We'll use the following tree as the input when running these algorithms:
Let's begin with the `inOrderPrint` function. All three of our algorithms will be recursive and have the same base case. As always, our base case should cover the scenario where the input is trivially small enough so that we don't need to perform further calculation. Since our "problem" is to print all values in a tree, what is the simplest tree we can be given? The empty tree! A common mistake when designing recursive tree algorithms is to make the base case about the root being a leaf; instead, we'll want the base case to cover the root being empty:
function inOrderPrint(root) {
if (root === null) return;
//...some code here
}
Note that taking in an entire tree as input is really just a matter of taking in the root node. This is because the root node can access every other node through a path of edges. Our base case says, "if the tree is empty, return since there is nothing to print."
Here is where the meat of the algorithm comes in. Given the root of a tree, the steps for `inOrderPrint` are:
- print all nodes in the left subtree
- print root
- print all nodes in the right subtree
Translating this into code:
function inOrderPrint(root) {
if (!root) return;
inOrderPrint(root.left);
console.log(root.val);
inOrderPrint(root.right);
}
Given our tree, `inOrderPrint` would print the values in the order: `d, b, e, a, c, f`
In-Order has the pattern of left, self, right. This means:
- a node can only be printed once its left subtree has been completely printed.
- a node's right subtree can only be printed once the node itself has been printed.
## Pre-Order
Given the root of a tree, the steps for `preOrderPrint` are:
- print root
- print all nodes in the left subtree
- print all nodes in the right subtree
Translating this into code:
function preOrderPrint(root) {
if (!root) return;
console.log(root.val);
preOrderPrint(root.left);
preOrderPrint(root.right);
}
Given our tree, `preOrderPrint` would print the values in the order: `a, b, d, e, c, f`
Pre-Order has the pattern of self, left, right. This means:
- a node is printed before any node in its subtrees.
- a node's left subtree is printed completely before its right subtree.
## Post-Order
Given the root of a tree, the steps for `postOrderPrint` are:
- print all nodes in the left subtree
- print all nodes in the right subtree
- print root
Translating this into code:
function postOrderPrint(root) {
if (!root) return;
postOrderPrint(root.left);
postOrderPrint(root.right);
console.log(root.val);
}
Given our Tree, `postOrderPrint` would print the values in the order: `d, e, b, f, c, a`
Post-Order has the pattern of left, right, self. This means:
- a node can only be printed once both of its subtrees have been completely printed.
We can also describe a BST using a recursive definition. A Binary Tree is a Binary Search Tree if:
- the values in the left subtree are less than the root's value,
- the values in the right subtree are greater than or equal to the root's value,
- and the left and right subtrees are themselves Binary Search Trees.
It's worth mentioning that the empty tree (a tree with 0 nodes) is indeed a BST (did someone say base case?).
Here are a few examples of BSTs:
Take a moment to verify that the above binary trees are BSTs. Note that image 2 has the same chain structure as a linked list. This will come into play later.
Below is an example of a binary tree that is not a search tree because a left child (35) is greater than its parent (23):
So what's the big deal with BSTs? Well, because of the properties of a BST, we can consider the tree as having an order to the values. That means the values are fully sorted! By looking at the three BST examples above, you are probably not convinced of things being sorted. This is because the ordering is encoded by an in-order traversal. Let's recall our previous `inOrderPrint` function:
function inOrderPrint(root) {
if (!root) return;
inOrderPrint(root.left);
console.log(root.val);
inOrderPrint(root.right);
}
If we run `inOrderPrint` on the three BSTs, we will get the following output:
BST 1: 42
BST 2: 4, 5, 6
BST 3: 1, 5, 7, 10, 16, 16
For each tree, we printed out values in increasing order! A binary search tree contains sorted data; this will come into play when we perform algorithms on this data structure.
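To see the "sorted data" claim concretely, here is a small sketch (with `inOrderArray` as our own, hypothetical helper name) that collects a BST's values with an in-order traversal; for a valid BST the resulting array comes out sorted:

```javascript
class TreeNode {
  constructor(val) {
    this.val = val;
    this.left = null;
    this.right = null;
  }
}

// Collect values with an in-order traversal (left, self, right).
function inOrderArray(root, out = []) {
  if (!root) return out;
  inOrderArray(root.left, out);
  out.push(root.val);
  inOrderArray(root.right, out);
  return out;
}

// a small BST:   10
//               /  \
//              5    16
let root = new TreeNode(10);
root.left = new TreeNode(5);
root.right = new TreeNode(16);
console.log(inOrderArray(root)); // [ 5, 10, 16 ]
```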
Let's implement a `BST` class that will maintain the ordered property through any number of insertions into the tree. We are going to avoid manually creating all nodes and explicitly setting `left`s and `right`s, so we don't have to worry about breaking order. We'll use our classic `TreeNode` as a component of `BST`. In addition, we'll need a proper `BST#insert` method that will conduct legal insertions on the tree. Interpret the code below and scroll further to our annotated version when you need clarification:
class TreeNode {
constructor(val) {
this.val = val;
this.left = null;
this.right = null;
}
}
class BST {
constructor() {
this.root = null;
}
insert(val, root = this.root) {
if (!this.root) {
this.root = new TreeNode(val);
return;
}
if (val < root.val) {
if (!root.left) {
root.left = new TreeNode(val);
} else {
this.insert(val, root.left);
}
} else {
if (!root.right) {
root.right = new TreeNode(val);
} else {
this.insert(val, root.right);
}
}
}
}
// commented naive BST class
class BST {
constructor() {
// initialize the tree to be empty
this.root = null;
}
insert(val, root=this.root) {
// if the tree is currently empty, then create the node as the 'absolute' root
if(!this.root) {
this.root = new TreeNode(val);
return;
}
// otherwise, the tree is not empty, so...
// if our val to insert is less than the root...
if (val < root.val) {
if (!root.left) { // ...and the left child does not exist,
root.left = new TreeNode(val); // then create the node as the left child
} else { // ...and the left child already exists,
this.insert(val, root.left); // then recursively insert on the left subtree
}
// if our val to insert is greater than or equal to the root...
} else {
if (!root.right) { // ...and the right child does not exist,
root.right = new TreeNode(val); // then create the node as the right child
} else { // ...and the right child already exists,
this.insert(val, root.right); // then recursively insert on the right subtree
}
}
}
}
We can call `insert` to build up the `BST` without worrying about breaking the search tree property. Let's build two different trees:
let tree1 = new BST();
tree1.insert(10);
tree1.insert(5);
tree1.insert(16);
tree1.insert(1);
tree1.insert(7);
tree1.insert(16);
let tree2 = new BST();
tree2.insert(1);
tree2.insert(5);
tree2.insert(7);
tree2.insert(10);
tree2.insert(16);
tree2.insert(16);
The insertions above will yield the following trees:
Are you cringing at `tree2`? You should be. Although we have the same values in both trees, they display drastically different structures because of the insertion order we used. This is why we have been referring to our `BST` implementation as naive. Both of these trees are Binary Search Trees; however, not all BSTs are created equal. A worst case BST degenerates into a linked list. The "best" BSTs are height balanced; we'll explore this concept soon.
Our goal is to implement a `#search` method on our previous `BST` class that will solve the problem:
Given a binary search tree and a target value, return a boolean indicating whether or not the target is
contained in the tree.
In other words, our `BST#search` should satisfy the following behavior:
let tree = new BST();
tree.insert(10);
tree.insert(5);
tree.insert(16);
tree.insert(1);
tree.insert(7);
tree.insert(16);
tree.search(7); // => true
tree.search(16); // => true
tree.search(14); // => false
As with many tree problems, this problem lends itself nicely to recursion! Like always, our base case should capture the scenario where the input tree is trivial and we know the answer to the problem without further calculation. If the given tree is empty, then we can be certain that the target is not found in the tree. The logic of our `BST#search` method will be much the same as our `binarySearch` function for sorted arrays. Try to interpret the code below and scroll further to the annotated version when you need clarification.
// assuming our BST class from the previous section
class BST {
//...some code here
search(val, root = this.root) {
if (!root) return false;
if (val < root.val) {
return this.search(val, root.left);
} else if (val > root.val) {
return this.search(val, root.right);
} else {
return true;
}
}
}
// assuming our BST class from the previous section
class BST {
//...some code here
// commented
search(val, root = this.root) {
// if the tree is empty, then the target val is not in the tree, so return false
if (!root) return false;
// otherwise the tree is not empty, so...
if (val < root.val) {
// if the target is less than the root,
// then search the left subtree
return this.search(val, root.left);
} else if (val > root.val) {
// if the target is greater than the root,
// then search the right subtree
return this.search(val, root.right);
} else {
// otherwise, the target must be equal to the root
// so return true since we found it!
return true;
}
}
}
Before we analyze the time complexity of `BST#search`, we'll first need to learn about height balance. Recalling what we touched on briefly in our chat on binary trees, height is defined as the number of edges between the root and the farthest leaf in a tree. Note that height is dictated by the farthest leaf (think worst case):
Following this definition, a tree consisting of a single node has height 0. We then consider an empty tree as having height -1. Height is relevant because not all BSTs are created equal! That is, some BSTs have "good / small" heights, others have "bad / large" heights. Take a look at these two BSTs containing identical values, but very different heights:
`Tree 1` is preferred over `Tree 2`, because `Tree 1` is balanced. Balanced Binary Trees will be the most efficient to perform operations on.
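The height definition above (edges to the farthest leaf, with -1 for the empty tree) translates directly into a short recursive helper. This is a sketch; `treeHeight` is our own name:

```javascript
class TreeNode {
  constructor(val) {
    this.val = val;
    this.left = null;
    this.right = null;
  }
}

// Height = number of edges on the longest root-to-leaf path.
// The empty tree has height -1; a single node has height 0.
function treeHeight(root) {
  if (!root) return -1;
  return 1 + Math.max(treeHeight(root.left), treeHeight(root.right));
}

let a = new TreeNode('a');
console.log(treeHeight(null)); // -1
console.log(treeHeight(a));    // 0
a.left = new TreeNode('b');
a.left.left = new TreeNode('c');
console.log(treeHeight(a));    // 2
```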
For a binary tree to be balanced:
- the heights of its left and right subtrees may differ by at most 1,
- and its left and right subtrees must themselves be balanced.
Notice that balanced has a recursive definition. Like you probably guessed, the empty tree is considered balanced. This will be the base case of our definition.
A balanced binary tree is incredible to have because its height is guaranteed to be O(log2(n)), where n is the number of nodes in the tree. Let's take a look at a few examples:
To make the approximations above, we rounded the result of each log down to the nearest integer. If you are not convinced of how powerful this is, this means that a balanced tree of 1000 nodes will have a height of just 10.
Worst case for the algorithm occurs when the target value is not present in the tree. This means that we must traverse a path from root to a leaf, so we must travel the full height of the tree in the worst case. However, like we discussed, the height of a tree can vary wildly. We can have a tree with minimal height (a balanced tree like `Tree 1`), or we can have a tree with maximal height (a linked list like `Tree 2`). This means `BST#search` runs in O(log(n)) time on a balanced tree, but degrades to O(n) time on a degenerate, linked-list-like tree.
No additional space is needed for the algorithm, so we have constant O(1) space.
To play devil's advocate, what if we count the recursive stack calls as contributing to the space complexity? Some coding challenges in your job hunt may pose this. If that is the case, then our recursive implementation above will use O(h) space for the call stack, where h is the height of the tree: O(log(n)) on a balanced tree, but O(n) on a degenerate, linked-list-like tree.
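If those call-stack frames are a concern, the search can also be written iteratively, bringing the extra space back down to O(1). Below is a sketch as a standalone function (assuming the same `TreeNode` shape as before; `searchIter` is our own name):

```javascript
class TreeNode {
  constructor(val) {
    this.val = val;
    this.left = null;
    this.right = null;
  }
}

// Iterative BST search: walk down from the root, no recursion needed.
function searchIter(root, val) {
  let node = root;
  while (node) {
    if (val < node.val) node = node.left;       // target is smaller: go left
    else if (val > node.val) node = node.right; // target is larger: go right
    else return true;                           // found it
  }
  return false; // fell off the tree: val is not present
}

let root = new TreeNode(10);
root.left = new TreeNode(5);
root.right = new TreeNode(16);
console.log(searchIter(root, 5));  // true
console.log(searchIter(root, 14)); // false
```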
Let's add two more tree traversal algorithms to our arsenal. Depth-First and Breadth-First are two classic traversal strategies that differ in the order nodes are hit. In this reading, our candidate tree will be:
Like we are accustomed to, we can represent the tree programmatically with:
class TreeNode {
constructor(val) {
this.val = val;
this.left = null;
this.right = null;
}
}
let a = new TreeNode('a');
let b = new TreeNode('b');
let c = new TreeNode('c');
let d = new TreeNode('d');
let e = new TreeNode('e');
let f = new TreeNode('f');
a.left = b;
a.right = c;
b.left = d;
b.right = e;
c.right = f;
To help verbalize Depth-First (DF), we'll be using a few familial terms to describe the relative positions of the nodes. Think of the words you would use if viewing a family tree! Here are some examples:
- `B` and `C` are siblings
- `D` and `E` are descendants of `B`
- `B`, `C`, `D`, `E`, `F` are all descendants of `A`
A Depth-First traversal will continually travel deeper into a tree before switching branches. This means that, given a node, we must visit all of its descendants before visiting its sibling.
Performing DF on our tree will hit the nodes in the order: A, B, D, E, C, F
To travel the nodes of a tree according to Depth-First behavior, we'll utilize a stack. Recall from earlier that a stack is LIFO (Last In, First Out). Our strategy is to use an array as a stack. We'll use `push` to add to the top of our stack and `pop` to remove the top. Below is a complete implementation of `depthFirst`. Try to interpret the code below and scroll further to see the annotated version:
function depthFirst(root) {
let stack = [ root ];
while (stack.length) {
let node = stack.pop();
console.log(node.val);
if (node.right) stack.push(node.right);
if (node.left) stack.push(node.left);
}
}
function depthFirst(root) {
// initialize the stack with the root node
let stack = [ root ];
// continue running the algorithm while there are still nodes on the stack
while (stack.length) {
// pop the top node from the stack
let node = stack.pop();
// we consider a node visited once we pop it,
// so we should print the node's value now
console.log(node.val);
// add the node's left and right children, if they exist
if (node.right) stack.push(node.right);
if (node.left) stack.push(node.left);
// IMPORTANT: do not print out the children yet; they must wait their turn to be popped first
}
}
You should watch the video lecture that follows this reading for a visual on how a stack inherently gives us DF order. For now, a key idea to take away is that we only consider a node "visited" once we pop it. We do not consider a node "visited" when we push it.
Because a stack naturally leads to DF order on a tree, we can easily write a recursive version. Why is recursion relevant to DF? Recursion utilizes the call stack:
function depthFirstRecur(root) {
if (!root) return;
console.log(root.val);
depthFirstRecur(root.left);
depthFirstRecur(root.right);
}
Does this code look familiar? It's identical to the `preOrderPrint` function we wrote previously. That's right, pre-order and depth-first are identical tree node orderings.
You should study both the iterative and recursive implementations as they will both prove valuable to solving problems.
This algorithm has nothing to do with bread. The word "breadth" is the same as "width". To help verbalize Breadth-First (BF) we'll need to understand the simple concept of tree levels. With the tree at the top of this reading in mind, we can say the following:
- the first level contains `A`
- the second level contains `B` and `C`
- the third level contains `D`, `E`, and `F`
A Breadth-First traversal will visit all nodes across a level, before moving to the next level. This means we travel laterally as much as we can before going deeper into the tree.
Performing BF on our tree will hit the nodes in the order: A, B, C, D, E, F
While DF uses a stack, BF will use a queue. Recall that a queue is FIFO (First In, First Out). The code is very similar to our iterative DF, except we will use an array as a queue. `shift` will remove the front of the queue and `push` will add to the back of the queue. Interpret the implementation below and scroll further to the annotated version when you need more insight:
function breadthFirst(root) {
let queue = [ root ];
while (queue.length) {
let node = queue.shift();
console.log(node.val);
if (node.left) queue.push(node.left);
if (node.right) queue.push(node.right);
}
}
function breadthFirst(root) {
// initialize the queue with the root node
let queue = [ root ];
// continue running the algorithm while there are still nodes on the queue
while (queue.length) {
// remove the front node from the queue
let node = queue.shift();
// the node we just removed is now "visited", so print it
console.log(node.val);
// add the left and right children to the back of the queue, if they exist
if (node.left) queue.push(node.left);
if (node.right) queue.push(node.right);
// IMPORTANT: do not print out the children yet; they must wait their turn to exit the front of the queue first
}
}
We'll rarely run into a recursive BF implementation (probably never) because recursion uses an underlying call stack, but we really want the opposite of a stack (a queue).
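One practical aside (our observation, not from the reading above): `Array#shift` re-indexes every remaining element, so each dequeue on a plain array costs O(n). A common alternative is to leave the array alone and advance a head index instead. Here is a sketch that returns the visit order; `breadthFirstOrder` is our own name:

```javascript
class TreeNode {
  constructor(val) {
    this.val = val;
    this.left = null;
    this.right = null;
  }
}

// Breadth-First traversal using a moving head index instead of shift.
function breadthFirstOrder(root) {
  let order = [];
  if (!root) return order;
  let queue = [root];
  let head = 0; // index of the current front of the queue
  while (head < queue.length) {
    let node = queue[head++]; // "dequeue" in O(1) by advancing the index
    order.push(node.val);
    if (node.left) queue.push(node.left);
    if (node.right) queue.push(node.right);
  }
  return order;
}

let a = new TreeNode('a');
a.left = new TreeNode('b');
a.right = new TreeNode('c');
a.left.left = new TreeNode('d');
console.log(breadthFirstOrder(a)); // [ 'a', 'b', 'c', 'd' ]
```

The trade-off is that the array keeps every node it has ever enqueued until the traversal finishes, but the asymptotic space is O(n) either way.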
A graph is any collection of nodes and edges. In contrast to our previous trees, a graph is much more relaxed in its structure. A graph may:
- lack a root node
- contain cycles
- have nodes with any number of edges
In this section, we will draw heavily from our tree algorithms. The adjustments we will make to those algorithms will be motivated by these core differences.
Below are a few examples of graphs that don't agree with our CompSci definition of a binary tree:
Here are some highlights:
- `Graph 1` lacks a root. This means there is no single node that can access all other nodes in a path through edges. This is important because we previously referenced "entire" trees by referring to the ultimate root. We can no longer do that in a graph. If we provide just `T`, you can't access `U`. If we provide just `U`, you can't access `T`. If we provide just `V`, you can't access `T` or `U`.
- `Graph 2` has a cycle. This means there is no longer a parent-child relationship. Choose any node in `Graph 2`: its grandchild will also be its parent. Wait - what? From now on we'll have to use less specific language such as "`X` is a neighbor of `Y`." Perhaps even more deadly, imagine we ran a "simple" Depth-First traversal on this graph. We could get trapped in an infinite loop if we are not careful.
- `Graph 3` features nodes that have more than 2 edges. Anarchy!

There are many ways to represent a graph programmatically. Let's take a moment to explore each and describe the tradeoffs we make when choosing among them. We will use `Graph 3` from above as our candidate. Bear in mind that our graph is directed. For example, this means that `C` can access `D`, but `D` cannot access `C`.
This implementation is most similar to how we implemented binary trees. That is, we create a node class that maintains a value and an array of references to neighboring nodes. This easily solves the problem that a node can have any number of neighbors, no longer just a left and right.
class GraphNode {
constructor(val) {
this.val = val;
this.neighbors = [];
}
}
let a = new GraphNode("a");
let b = new GraphNode("b");
let c = new GraphNode("c");
let d = new GraphNode("d");
let e = new GraphNode("e");
let f = new GraphNode("f");
a.neighbors = [b, c, e];
c.neighbors = [b, d];
e.neighbors = [a];
f.neighbors = [e];
This implementation is great because it feels familiar to how we implemented trees. However, this implementation is clunky in that we have no easy way to refer to the entire graph. How can we pass this graph to a function? Recall that there is no root to act as the definite starting point.
This is often the mathematician's preferred way of representing a graph. We use a 2D array to represent edges. We'll first map each node's value to an index. This means `A -> 0`, `B -> 1`, `C -> 2`, etc. Below is the mapping for `Graph 3`:
From here, the row index will correspond to the source of an edge and the column index will correspond to its destination. A value of `true` means there exists an edge from source to destination.
let matrix = [
/* A B C D E F */
/*A*/ [true, true, true, false, true, false],
/*B*/ [false, true, false, false, false, false],
/*C*/ [false, true, true, true, false, false],
/*D*/ [false, false, false, true, false, false],
/*E*/ [true, false, false, false, true, false],
/*F*/ [false, false, false, false, true, true]
];
A few things to note about using an adjacency matrix:

- since our graph is directed, `matrix[i][j]` may not be the same as `matrix[j][i]`
- in this representation, `matrix[x][x] === true` for any `x` (note the diagonal of `true`s above)

An advantage of the matrix implementation is that it allows us to refer to the entire graph by simply referring to the 2D array. A huge disadvantage of using a matrix is the space required. To represent a graph of n nodes, we must allocate n² space for the 2D array. This is even more upsetting when there are few edges in the graph. We will have to use n² space, even though the array would be sparse with only a few `true` elements.
An adjacency list seeks to solve the shortcomings of the matrix implementation. We use an object where keys represent the node labels. The values associated with the keys will be an array containing all adjacent nodes:
let graph = {
'a': ['b', 'c', 'e'],
'b': [],
'c': ['b', 'd'],
'd': [],
'e': ['a'],
'f': ['e']
};
An adjacency list is easy to implement and allows us to refer to the entire graph by simply referencing the object. The space required for an adjacency list is proportional to the number of nodes plus edges in the graph. Since there will be at most n² edges in a graph of n nodes, the adjacency list will use at most the same amount of space as the matrix. You'll find adjacency lists useful when attacking problems that are not explicitly about graphs. We'll elaborate more on this soon.
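To illustrate how little ceremony an adjacency list needs, here is a sketch of a helper (our own, hypothetical `buildGraph`) that builds one from an array of directed edges:

```javascript
// Build an adjacency list from [source, destination] pairs.
function buildGraph(edges) {
  let graph = {};
  for (let [src, dst] of edges) {
    if (!(src in graph)) graph[src] = [];
    if (!(dst in graph)) graph[dst] = [];
    graph[src].push(dst);
  }
  return graph;
}

let edges = [
  ['a', 'b'], ['a', 'c'], ['a', 'e'],
  ['c', 'b'], ['c', 'd'],
  ['e', 'a'], ['f', 'e']
];
let graph = buildGraph(edges);
console.log(graph.a); // [ 'b', 'c', 'e' ]
console.log(graph.b); // []
```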
Let's explore our classic Depth-First, but for graphs this time! We'll be utilizing the `GraphNode` and Adjacency List implementations of the following graph:
Since we already discussed the differences between Depth-First and Breadth-First, we'll focus just on Depth-First here. We'll leave the Breadth-First exploration in the upcoming project.
Let's begin by assuming we have our candidate graph implemented using our `GraphNode` class:
class GraphNode {
constructor(val) {
this.val = val;
this.neighbors = [];
}
}
let a = new GraphNode("a");
let b = new GraphNode("b");
let c = new GraphNode("c");
let d = new GraphNode("d");
let e = new GraphNode("e");
let f = new GraphNode("f");
a.neighbors = [e, c, b];
c.neighbors = [b, d];
e.neighbors = [a];
f.neighbors = [e];
One thing we'll have to decide on is what node to begin our traversal. Depending on the structure of the graph, there may not be a suitable starting point. Remember that a graph may not have a "root". However, in our candidate, `F` is like a root. It is the only valid choice because it is the only node that may access all other nodes through some path of edges. We admit, the choice of `F` is somewhat contrived, and in a practical setting you may not have a nice starting point like this. We'll cover how to overcome this obstacle soon. For now we'll take `F`.
We want to build a recursive `depthFirstRecur` function that accepts a node and performs a Depth-First traversal through the graph. Let's begin with a baseline solution, although it is not yet complete to handle all graphs:
// broken
function depthFirstRecur(node) {
console.log(node.val);
node.neighbors.forEach((neighbor) => {
depthFirstRecur(neighbor);
});
}
depthFirstRecur(f);
Can you see where this code goes wrong? It will get caught in an infinite cycle `f, e, a, e, a, e, a, e, ...` ! To fix this, simply store which nodes we have visited already. Whenever we hit a node that has previously been visited, then return early. We'll use JavaScript [Sets](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set) to store `visited` because they allow for constant time lookup.
// using GraphNode representation
function depthFirstRecur(node, visited=new Set()) {
// if this node has already been visited, then return early
if (visited.has(node.val)) return;
// otherwise it hasn't yet been visited,
// so print its val and mark it as visited.
console.log(node.val);
visited.add(node.val);
// then explore each of its neighbors
node.neighbors.forEach(neighbor => {
depthFirstRecur(neighbor, visited);
});
}
depthFirstRecur(f);
This code works well and will print the values in the order `f, e, a, c, b, d`. Note that this strategy only works if the values are guaranteed to be unique.
If you are averse to recursion (don't be), we can write an iterative version using the same principles:
function depthFirstIter(node) {
let visited = new Set();
let stack = [node];
while (stack.length) {
let node = stack.pop();
// if this node has already been visited, then skip this node
if (visited.has(node.val)) continue;
// otherwise it hasn't yet been visited,
// so print its val and mark it as visited.
console.log(node.val);
visited.add(node.val);
// then add its neighbors to the stack to be explored
stack.push(...node.neighbors);
}
}
depthFirstIter(f);
Let's now assume our candidate graph in the form of an Adjacency List:
let graph = {
'a': ['b', 'c', 'e'],
'b': [],
'c': ['b', 'd'],
'd': [],
'e': ['a'],
'f': ['e']
};
Bear in mind that the nodes are just strings now, not `GraphNode`s. Other than that, the code shares many details from our previous implementations:
// using Adjacency List representation
function depthFirstRecur(node, graph, visited = new Set()) {
if (visited.has(node)) return;
console.log(node);
visited.add(node);
graph[node].forEach((neighbor) => {
depthFirstRecur(neighbor, graph, visited);
});
}
depthFirstRecur('f', graph);
Cool! We print values in the order `f, e, a, b, c, d`. We'll leave the iterative version to you as an exercise for later.
Instead, let's draw our attention to a point from before: having to choose `f` as the starting point isn't dynamic enough to be impressive. Also, if we choose a poor initial node, some nodes may be unreachable. For example, choosing `a` as the starting point with a call to `depthFirstRecur('a', graph)` will only print `a, b, c, d, e`. We missed out on `f`. Bummer.
We can fix this. A big advantage of using an Adjacency List is that it contains the full graph! We can use a surrounding loop to allow our traversal to jump between disconnected regions of the graph. Refactoring our code:
function depthFirst(graph) {
let visited = new Set();
for (let node in graph) {
_depthFirstRecur(node, graph, visited);
}
}
function _depthFirstRecur(node, graph, visited) {
if (visited.has(node)) return;
console.log(node);
visited.add(node);
graph[node].forEach(neighbor => {
_depthFirstRecur(neighbor, graph, visited);
});
}
depthFirst(graph);
Notice that our main function `depthFirst` is iterative and accepts the entire Adjacency List as an arg. Our helper `_depthFirstRecur` is recursive. `_depthFirstRecur` serves the same job as before: it will explore a full connected region in a graph. The main `depthFirst` function will allow us to "bridge" the gap between connected regions.
Still fuzzy? Imagine we had the following graph. Before you ask, these are not two separate graphs. This is a single graph that contains two connected components. Another term for a graph of this structure is a "Forest" because it contains multiple "Trees", ha:
It is easy to represent this graph using an Adjacency List. We can then pass the graph into our `depthFirst` from above:
```js
let graph = {
  'h': ['i', 'j'],
  'i': [],
  'j': ['k'],
  'k': [],
  'l': ['m'],
  'm': []
};

depthFirst(graph);
// prints h, i, j, k, l, m
```
Here's how `depthFirst` operates above. We enter `depthFirst` and the for loop begins on `h`. This means we enter `_depthFirstRecur`, which will continue to explore the "local" region as far as possible. When this recursion ends, we will have explored the entire connected region of `h, i, j, k` (adding these nodes to `visited` along the way). The recursive call then returns to the main `depthFirst` function, where we continue the for loop. We iterate until we hit an unvisited node (`l`), then explore its local region as far as possible using `_depthFirstRecur`, hitting the last node, `m`.
The objective of this lesson is for you to understand the different parts of the Internet Protocols. This lesson is relevant because knowledge of internet protocols is expected for all Software Engineers that write code that connects to a network. Moreover, internet protocols are popular interviewing topics.
When you complete this content, you should be able to do the following.
The objective of this lesson is for you to have a basic understanding of commonly-used network analysis utilities. This lesson is relevant because interacting with networks via different tools is essential for every software developer on the web. When you are done, you should be able to use traceroute to show routes between your computer and other computers. You should also be able to use Wireshark to inspect network traffic.
One challenge with mental models is that everyone thinks about things differently! While we've discussed the TCP/IP reference model at length, we haven't introduced any others. Let's take a look at one other very well-known reference model for networks: the OSI Model.
We'll cover:
Around the same time computer scientists in the United States were hammering out the layers of the TCP/IP reference model, a similar discussion was happening a world away in the United Kingdom. Researchers in the UK decided that a clear reference model needed to be available to others worldwide, so they began working with the International Standards Organization (ISO). The ISO published a document called The Basic Reference Model for Open Systems Interconnection (or ISO 7498), including a seven layer reference model for networking, in the early 1980s.
The Open Systems Interconnection (OSI) reference model differs from the TCP/IP model by its focus on standardization. The TCP/IP model is mostly focused on practical networking concepts and isn't tightly tied to particular protocols (other than those for which it's named). The OSI model, however, has both conceptual layers and suggested protocols for each. This idea was well-intentioned: make these protocols the standard so that computer scientists have less to think about! This standardization could help prevent vendor lock-in as well, since all major vendors would (hopefully) follow the standards.
Here's an overview of the seven layers of the OSI model, along with some well-known protocols for each layer:
Let's dig into each layer, starting from the top.
The OSI Application Layer includes information used by client-side software. Data transmitted on this layer will interact directly with applications, as the name suggests, and can be displayed to the user with limited translation. HTTP is an example of a common Application Layer protocol.
The OSI Presentation Layer is where data gets translated into a presentable format. This is often called the syntax layer since data is converted between machine-readable & human-readable syntax here as well. As a result, the Presentation Layer may include data compression, encryption, and character encoding. Many image formats, including JPEG and GIF, use well-known Presentation Layer protocols.
The OSI Session Layer includes protocols responsible for authentication and data continuity. Session Layer protocols may authorize a client with the server or re-establish a dropped connection. An example protocol you may find on this layer is RPC (Remote Procedure Call), a mechanism for one device to initiate a command on another.
Now we're on familiar turf! The OSI Transport Layer, like the layer of the same name in the TCP/IP reference model, utilizes transport protocols. Processes here are focused on data integrity and connectivity. Our old friends TCP and UDP are the two most-used transport protocols.
The OSI Network Layer mirrors TCP/IP's Internet Layer. This layer manages connections between remote networks, transferring packets across intermediary devices. The best-known protocol at the Network Layer is IP.
Protocols at the OSI Data Link Layer deal with connections directly from one machine's network interface to another. Frames targeting different MAC addresses are transferred here, and the Data Link Layer is primarily used by machines in local networks. The most recognizable protocol on this layer is Ethernet.
OSI's Physical Layer goes a little deeper than the TCP/IP reference model. Physical Layer protocols have to do with translating from raw electrical signals to bits & bytes of data. You may recognize Wi-Fi (technically known as 802.11) and DSL as common Physical Layer protocols.
That's a lot of layers! It can be a little overwhelming to think of networks from the two complementary but differing perspectives of the TCP/IP and OSI reference models. Let's discuss when we might want to use each.
The OSI model is conceptual, meaning its practical uses are limited. We can see this when we look at protocols that cross layers. For example, HTTP primarily works on the OSI Application Layer, but includes the ability to manage character encoding, a Presentation Layer concern. Uh-oh! This makes OSI good for understanding concepts, but too restrictive for building new protocols.
The TCP/IP reference model, on the other hand, is almost purely practical. It was extracted from real, functional networks used by DARPA in the 1970s. Instead of concerns with minutiae like signal-to-data conversions, TCP/IP focuses on the core of networking: getting data from one place to another. For this reason, it's most often used when building new systems or analyzing real networks.
Of course, two popular models means most engineers will flip-flop between them! You'll often hear both models used in the same conversation. We'll discuss some techniques to differentiate between these two models in an upcoming lesson.
We've examined two ways of thinking about network design & functionality: first, the TCP/IP reference model, and now the OSI model. Next up, we'll compare these two models in greater detail.
After this lesson, you should feel comfortable:
We've investigated TCP/IP in great detail, and we've seen how broad a scope it covers. Now let's step back and think about the whole networking process. We're breaking the TCP/IP stack down and categorizing our protocols - are you ready?
We'll cover:
Remember that when TCP/IP was first being crafted, researchers felt it was too large and separated the Transmission Control Protocol from the Internet Protocol. This separation was a boon; it made the protocols easier to implement individually and led to the Internet we know and love today!
We sometimes refer to this approach as separation of concerns. We divide up complex processes so that many connected concepts can work independently. This makes it easier to consider each concept in detail on its own and means each concept can grow at its own unique pace.
The developers of TCP/IP took this separation even farther in 1989 when they published RFC 1122. This spec, also titled "Requirements for Internet Hosts -- Communication Layers", provided a new way of thinking about the whole TCP/IP process. According to the RFC, we can separate the connection out into four distinct layers, or separate areas of interest. These are:
We refer to this as a reference model: a high-level overview of a complex topic provided by an organization that manages it. The four-layer model presented in RFC 1122 is often called the TCP/IP reference model, simply TCP/IP model, or even the Department of Defense (DoD) model, referring to the original research being done at DARPA.
Here's a visual summary of the four layers of the TCP/IP reference model, along with some well-known protocols for each layer:
Let's look at what each layer of the reference model includes.
The Application Layer includes protocols related to user-facing data. Some of the protocols we've discussed, like HTTP and FTP, operate in this layer. The TCP/IP model doesn't care what type of application data is used; whatever is transmitted from the Transport Layer is considered Application Layer data.
The Transport Layer includes (you guessed it) transport protocols! We've already discussed the two best-known: TCP and UDP. This layer focuses on connectivity between clients and servers, and relies on the lower layers to establish network connectivity.
The Internet Layer is where IP lives. Data is processed in packets on this layer, and routing is primarily handled with IP addresses. The Internet layer focuses mostly on connecting separate networks together.
The Link Layer includes our lower-level communication standards. Link Layer protocols aren't concerned with the type of data being transported, but instead focus on getting data from one local network resource to another. We jump up to the Internet layer when dealing with resources on other networks.
Fifth layer?
Despite the RFC specifying four layers for the TCP/IP model, you may encounter resources detailing a five layer model for TCP/IP instead! The "fifth layer" is usually the Physical layer. This helps us separate electrical concepts like transmission across wires from data-oriented concepts like MAC addresses, but isn't an official reference model from the IETF. TCP/IP doesn't explicitly include any physical mediums, so thinking of this as a fifth layer can be helpful.
Layers provide a mental model we can use to think about how interactions across networks occur. It's important to remember that these are "best fit" models, though: they translate loosely to our actual data. Some protocols may cross layers, and some companies will adjust these models to fit their own internal implementations. Ultimately, these layers provide a form of shared communication between professionals. You can count on another engineer understanding what an "Application Layer issue" means, even if your own definitions differ slightly!
We often refer to encapsulation when describing how layers map to our data. This means higher layers are encapsulated, or wrapped, in lower layers. For example, a Transport Layer segment includes Application Layer data in its payload, and a Link Layer frame includes the whole stack! Here's an example:
As we'll see, alternative networking reference models may define layers as beginning/ending at different points in our data. However, the general idea is shared across models: lower layer data units include data for higher layers in their payloads.
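As a rough illustration of encapsulation (the field names below are simplified stand-ins for real protocol headers, not actual field names), we can model each lower layer wrapping the higher layer's data in its payload:

```js
// Each lower layer wraps the higher layer's data in its payload.
const httpData = 'GET / HTTP/1.1';                 // Application Layer data

const segment = {                                   // Transport Layer (TCP segment)
  srcPort: 50000,
  dstPort: 80,
  payload: httpData,
};

const packet = {                                    // Internet Layer (IP packet)
  srcIP: '192.168.1.2',
  dstIP: '93.184.216.34',
  payload: segment,
};

const frame = {                                     // Link Layer (Ethernet frame)
  srcMAC: 'aa:bb:cc:dd:ee:ff',
  dstMAC: '11:22:33:44:55:66',
  payload: packet,
};

// The Link Layer frame includes the whole stack:
frame.payload.payload.payload; // 'GET / HTTP/1.1'
```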
When it comes to technical concepts, it's "reference models all the way down"! These new ways of thinking about topics we've already explored will help you communicate the concepts more clearly, and help you navigate problems that may be deeper than your own code.
You should feel confident:
As we study networking we are going to be investigating a very low level of computing compared to the JavaScript programming we've been doing so far. We are moving closer to hardware, and so we will be exposed to some new numbers, formatted as binary or hexadecimal.
At the lowest level, a computer just speaks two values, 1 and 0. These are usually represented by two different voltages inside an integrated circuit known as a CPU (Central Processing Unit). Everything that you do on a computer, whether writing and running your JavaScript code, watching videos online, chatting or posting on social media, or playing a video game, is at a fundamental level just 1's and 0's being processed by an integrated circuit.
So, what is binary? To understand this, we need to talk about bases.
What's a base? It turns out there's not just one number system. The number system human beings have used for thousands of years is base 10. This means we count using the Arabic digits 0 through 9. The reason we use base 10? It should be pretty obvious. The majority of human beings have ten fingers. So naturally we invented a counting system that used 10 digits. If we ever meet aliens from another planet that only have say, six digits, they might indeed use base 6.
As it turns out, the base that computers speak is base 2. This derives directly from the fact that transistors (which is what all integrated circuits are made of) have two voltage states. You can use base 2 to perform mathematical calculations and because of this all computing at a fundamental level is base 2.
So what does base 2 look like? Well if base 10 contains the digits 0 through 9, then base 2 contains the digits 0 through 1.
If you remember from your early math education, decimal (another word for base 10) numbers can be divided into places.
For example, given the number 42, the number breaks down into the following places.
Place | 1000 | 100 | 10 | 1 |
---|---|---|---|---|
Digit | 0 | 0 | 4 | 2 |
(4 * 10) + (2 * 1) = 42
For binary, we instead have the following places, and the number 42 breaks down this way:
Place | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
---|---|---|---|---|---|---|---|---|
Digit | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
(128 * 0) + (64 * 0) + (32 * 1) + (16 * 0) + (8 * 1) + (4 * 0) + (2 * 1) + (1 * 0) = 42
Or since we can simplify the zeros to 0 and the ones to 1, we can calculate this much more simply by just adding up the places that contain a 1.
32 + 8 + 2 = 42
This is a good shorthand way of calculating a binary number in your head as long as you memorize the bases.
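We can check the shorthand directly in JavaScript, which supports binary literals via the `0b` prefix:

```js
// The 0b prefix lets us write binary literals directly.
0b101010 === 42;       // true
32 + 8 + 2;            // 42, the shorthand sum of the places containing a 1
parseInt('101010', 2); // 42
```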
So inside computers we often call a single digit a bit. A bit can be either on (1) or off (0).
A sequence of 8 bits is known as a byte.
So our 42 example is a single byte since it contains 8 bits:
00101010
There are also some multiples of bytes computer science borrowed from the metric system, although with confusing results since the metric system is Base 10, while computing is Base 2.
Unit | Value |
---|---|
Kilobyte | 1000 bytes |
Megabyte | 1000² bytes |
Gigabyte | 1000³ bytes |
Terabyte | 1000⁴ bytes |
Petabyte | 1000⁵ bytes |
Exabyte | 1000⁶ bytes |
Zettabyte | 1000⁷ bytes |
Yottabyte | 1000⁸ bytes |
This system worked fine until computer storage got very large. Manufacturers of hard drives would use Base 10, while Operating Systems would often use Base 2. The discrepancy between something like the gigabyte in base 2 vs base 10 was very large.
1 Gigabyte in base 10 = 1,000,000,000 bytes
1 Gigabyte in base 2 = 1,073,741,824 bytes
That's 73.7 Megabytes of difference! With a Terabyte it got even worse.
1 Terabyte in base 10 = 1,000,000,000,000 bytes
1 Terabyte in base 2 = 1,099,511,627,776 bytes
For a whopping 99.5 Gigabytes of difference.
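These discrepancies are easy to verify with a couple of quick calculations:

```js
const GB  = 1000 ** 3; // base 10 gigabyte
const GiB = 1024 ** 3; // base 2 "gigabyte" (gibibyte)
GiB - GB;              // 73741824 bytes, about 73.7 megabytes

const TB  = 1000 ** 4; // base 10 terabyte
const TiB = 1024 ** 4; // base 2 "terabyte" (tebibyte)
TiB - TB;              // 99511627776 bytes, about 99.5 gigabytes
```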
Something had to be done. So now we have two different sets of terminology: one for Base 10 and one for Base 2.
Base 10 | Abbr | Value | Base 2 | Abbr | Value |
---|---|---|---|---|---|
Kilobyte | kB | 1000 bytes | Kibibyte | KiB | 1024 bytes |
Megabyte | MB | 1000² bytes | Mebibyte | MiB | 1024² bytes |
Gigabyte | GB | 1000³ bytes | Gibibyte | GiB | 1024³ bytes |
Terabyte | TB | 1000⁴ bytes | Tebibyte | TiB | 1024⁴ bytes |
Petabyte | PB | 1000⁵ bytes | Pebibyte | PiB | 1024⁵ bytes |
Exabyte | EB | 1000⁶ bytes | Exbibyte | EiB | 1024⁶ bytes |
Zettabyte | ZB | 1000⁷ bytes | Zebibyte | ZiB | 1024⁷ bytes |
Yottabyte | YB | 1000⁸ bytes | Yobibyte | YiB | 1024⁸ bytes |
Still to this day, you will hear people refer to the base 2 versions as Kilobyte or Megabyte. Often it's hard to determine what unit is being used when manufacturers advertise the size of hard drives or memory. Worse, Operating Systems often display inconsistent numbers throughout their many displays of how big disks or files are.
Another useful base in computing is Base 16 also known as hexadecimal.
Why is this useful? It's useful because hexadecimal can provide a shorter, more human-readable version of binary.
So if base 10 goes from the digits 0 through 9, what are we going to do? There aren't 16 digits...
The letters A through F are here to rescue us from this. The available digits for hexadecimal are 0 through F, where `A` is `10` decimal and `F` is `15` decimal.
hexadecimal: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
decimal: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
It's hard to think of letters as being numbers, but in hexadecimal it's perfectly normal.
So how does this help us write binary numbers in a shorter form? It's because each hexadecimal digit represents exactly four bits, so one 8-bit byte can be expressed as two hexadecimal digits.
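A quick sketch of pulling the two hex digits (often called "nibbles") out of a byte with bit operations:

```js
const byte = 42;        // 00101010 in binary
const high = byte >> 4; // top 4 bits: 0010 -> 2
const low  = byte & 0xF; // bottom 4 bits: 1010 -> 10, which is hex A

high.toString(16) + low.toString(16); // '2a'
```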
Let's look at the places for hexadecimal for the decimal number 42. There's a sixteens place and a ones place in this example.
Place | 16 | 1 |
---|---|---|
Digit | 2 | A |
This might seem confusing at first, but just like our other base examples, this is:
(16 * 2) + (A * 1)
However, since `A` is really `10` in decimal, this resolves to the following:
(16 * 2) + (10 * 1) = 42
Using this you can see that the maximum value for a two-digit hexadecimal number is `FF`, which is `255` in decimal and `11111111` in binary.
This happens to be the maximum value of one byte.
FF = 255 = 11111111 = 1 byte
So instead of typing out an entire string of 1s and 0s we can write bytes as a sequence of 2 digit hexadecimal numbers. We usually separate these numbers by spaces or some other delimiter to make it clear they are a sequence of numbers:
For instance, this sequence of numbers in decimal:
4 8 15 16 23 42
is the following in binary:
00000100 00001000 00001111 00010000 00010111 00101010
But in Hexadecimal it's only this:
04 08 0F 10 17 2A
I hope you can see that this is a really compact and convenient way to represent binary numbers. The numbers are always the same length, which is good for data storage, and they can be easily translated back to binary. It might also help rescue you from Mars.
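We can reproduce that hex sequence ourselves with `toString(16)` and `padStart`:

```js
// Convert each decimal number to a zero-padded, two-digit hex string.
[4, 8, 15, 16, 23, 42]
  .map(n => n.toString(16).toUpperCase().padStart(2, '0'))
  .join(' ');
// '04 08 0F 10 17 2A'
```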
0x Notation
Sometimes you will see hexadecimal numbers represented by prepending a `0x` to the front of them. So our 42 would be expressed like this:
0x2A
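JavaScript understands `0x` notation directly as a number literal:

```js
0x2A;        // 42
0x2A === 42; // true
0xFF;        // 255, the maximum value of one byte
```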
In JavaScript, we can use `toString` on `Number` objects to convert decimal numbers to strings in other bases. We supply the base as an argument to `toString`.
```js
Number(42).toString(16); // '2a'
Number(42).toString(2);  // '101010'
```
We can also use our old friend `parseInt` with an optional second argument (the radix) to convert a binary or hexadecimal string to a decimal number.
```js
parseInt('101010', 2); // 42
parseInt('2A', 16);    // 42
```
Binary and hexadecimal are often used in computing because computers are fundamentally base 2. As you move closer to the hardware, you encounter these more and more often. In networking, you will see them used for IP Addresses and MAC Addresses. We'll learn more about that when we look at how IP Networking works in the next section.
As we dive deeper into understanding how computers communicate, a common question keeps coming up: what is "the Internet", exactly? To answer this, let's discuss the Internet Protocol, also known as IP.
We'll cover:
To understand where we are today, we need to look back to where we came from. Picture yourself in the United States in the late 1960s. The country is nearing the end of the "space race" and technology is booming. There are numerous technical teams and physical networks created as a result of recent research, but communication between them is limited. There's also a rise of different proprietary standards which are hampering growth: not every team can afford a separate terminal of each available type! How can we facilitate better collaboration with less investment required?
By 1974, two researchers working for DARPA think they have the answer. They propose something called the Transmission Control Program. It's a complex process that defines exactly how multiple networks can communicate with each other. This protocol stands out because it is:
These highlights are critical because DARPA is a military organization. They're looking for technology that could theoretically withstand a nuclear attack - and the Transmission Control Program fits the bill!
It quickly became apparent that the Transmission Control Program was too dense. The process was complicated and involved many moving parts, and other engineers raised concerns that it should be extracted into separate parts. Soon, the Transmission Control Program was divided into two separate sections: Transmission Control Protocol (TCP), which was responsible for the fault-tolerance of joined networks, and Internet Protocol (IP), which was responsible for the end-to-end nature of joined networks.
The protocols we use today have been improved over time, but still carry those names and general purposes. It's amazing to think that modern social media, video gaming, and streaming content is all dependent on 50+ year old technology!
The Internet can be loosely defined as "a series of internetworked systems". Here's a practical example of what we mean by that:
Imagine every building in your town has a parking garage. Each garage is owned by a different company and uses custom entry sensors. These sensors prefer certain vehicle types: one garage for luxury sports cars only, another that will only allow motorcycles. You may need to park in the garage for the vehicle you own, then walk a lot. Otherwise, you'd have to own a different vehicle for each garage! Oh no! We'd consider this an "isolated" model: each garage works fine by itself, but patrons of one garage may not park elsewhere, and none of the garages are really meeting their full potential.
Now imagine the garages remove their sensors. Suddenly you can park anywhere you'd like. Your motorcycle and your neighbor's school bus can exist harmoniously in any garage, and you can travel from garage to garage in any vehicle you'd like. We'd consider this an "internetworked" model: each garage's patrons may travel to any other garage and while the particular entrance/exit policies may differ, drivers can rest assured knowing they'll fit anywhere.
The Internet Protocol opened the door for this internetworked model in computing. Now a network in New York City using one vendor's computers could seamlessly communicate with a network in London from a different vendor! This connectivity led to the birth of the Internet, which is itself a series of interconnected networks sharing data.
IP data is transmitted in a format known as a packet. A packet uses a data format we've seen before: metadata in headers, and a body with content. The headers are used to get the packet to its destination, while the body contains the information we'd like to transfer.
We refer to IP's communication style as packet-switching. This is when a message is split up into separate "packets", delivered to a destination, and reassembled as appropriate. Remember that IP's primary responsibility as part of the Internet's "double threat", TCP/IP, is maintaining an end-to-end state. For this reason, IP isn't concerned about whether packets are received by the client in sequential order, and may sometimes even lose packets altogether while in transit!
A crash course in bits & bytes
Most of the protocols we'll cover measure their data in bits. We represent a bit as a single binary digit, either 1 or 0. Since data gets long, we have some larger units available as well! We'll sometimes describe data sizes in bytes. A byte is eight bits.
For example, "01001" is five bits, and a piece of data that's 32 bits long could also be described as "4 bytes".
The Internet Protocol has evolved over time, but two versions stand out as the most used & important to us: version 4 and version 6. We often refer to these as IPvX, where X is the version number.
The best known version of the Internet Protocol is IPv4. This version was used when TCP/IP was finalized by DARPA in 1983, and it's still the most-used protocol version online.
An IPv4 packet's header consists of at least 13 fields, or sequences of binary bits. These fields start with a version identifier (`0100`, or "4" in binary), continue with 10 sequences that define things like the length of the header and the protocol type contained in the body, and wrap up with source and destination addresses for the packet. IPv4 also includes an allowance for a 14th optional header field called "Options" that can contain extra metadata about the packet's content, but it's not often used. An IPv4 header without options will be 20 bytes (160 bits).
It's hard to visualize a header since it's essentially just a long string of 1s and 0s. Instead of trying to cram it all into one line to study, we can split it into specified widths and stack them. Here's a stacked diagram of the IPv4 header:
IPv4 addresses are composed of 4 octets, or 8-bit binary numbers. We usually represent them like this:
192.18.1.1
This is the same as `11000000.00010010.00000001.00000001` in binary notation, but that's a lot harder to read! IPv4 supports around 4 billion unique addresses.
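As a small sketch (the helper name is ours, just for illustration), we can convert a dotted-decimal IPv4 address to its binary form with `toString(2)`:

```js
// Convert each decimal octet to an 8-bit binary string.
function ipv4ToBinary(addr) {
  return addr
    .split('.')
    .map(octet => Number(octet).toString(2).padStart(8, '0'))
    .join('.');
}

ipv4ToBinary('192.18.1.1');
// '11000000.00010010.00000001.00000001'
```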
The "4 billion unique addresses" limit seemed almost infinite in the earliest days of the Internet, but today it's easy to see how we might exhaust that many addresses! Seeing this address exhaustion on the horizon, Internet researchers began concocting a new protocol version, one that would allow more addresses, in the mid-1990s. By 2017, the new protocol was an official standard: IPv6.
IPv6 uses a totally different packet header format than IPv4, though they share a few fields. It only uses 8 header fields, and supports optional "extension headers" that come after these 8 fields, as opposed to IPv4's large "Options" block.
The 8 fields IPv6 uses, in order, are:
- Version (`0110`, or "6" in binary notation)
- Traffic Class
- Flow Label
- Payload Length
- Next Header
- Hop Limit
- Source Address
- Destination Address

These headers have a fixed length of 40 bytes (320 bits).
Notice that IPv6 packets have fewer headers, but are double the length of IPv4 headers! This is primarily due to IPv6 addressing, which allows dramatically more address space than IPv4. How many more addresses?
IPv4 supports ~4 billion (4×10⁹) addresses.
IPv6 supports ~350 undecillion (3.5×10³⁸) addresses.
That's a billion times a billion more addresses! It's even more addresses than grains of sand in all the world's beaches & deserts (7.5×10¹⁸, according to the University of Hawaii).
It handles this by quadrupling the number of bits in an address. IPv4 uses 32 bits, while IPv6 uses 128 bits. Remember that these bits are binary, so adding additional bits exponentially increases the number of permutations.
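We can compute both address spaces in JavaScript with BigInt (regular numbers lose integer precision above 2⁵³, so plain `2 ** 128` wouldn't be exact):

```js
const ipv4Addresses = 2n ** 32n;  // 4294967296n, about 4 billion
const ipv6Addresses = 2n ** 128n; // roughly 3.4 x 10^38

// IPv6 has 2^96 times as many addresses as IPv4:
ipv6Addresses / ipv4Addresses; // 79228162514264337593543950336n
```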
This new address space also required a new notation. Instead of the "four dotted decimal" notation of IPv4, IPv6 uses "eight colon-ed hexadecimal". Here's an example IPv6 address:
2600:6c5e:157f:d48c:138f:e0ba:6fa7:d859
The same address in binary is:
0010011000000000:0110110001011110:0001010101111111:1101010010001100:0001001110001111:1110000010111010:0110111110100111:1101100001011001
It's easy to see how we added so many extra addresses! That said, IPv6 is much more difficult to read by humans. You can read some neat rules for making IPv6 addresses easier to read on Wikipedia.
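Each colon-separated group in an IPv6 address is a 16-bit hexadecimal number, so we can expand the whole address ourselves with `parseInt` and `toString(2)` (the helper name here is ours, just for illustration):

```js
// Expand one 4-digit hex group into a 16-bit binary string.
function ipv6GroupToBinary(group) {
  return parseInt(group, 16).toString(2).padStart(16, '0');
}

'2600:6c5e:157f:d48c:138f:e0ba:6fa7:d859'
  .split(':')
  .map(ipv6GroupToBinary)
  .join(':');
// first group '2600' expands to '0010011000000000'
```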
Both popular versions of the Internet Protocol include space for some special addresses that you should quickly recognize. The main one you'll encounter is called the loopback address. This is the identifier for your current machine. You'll see it repeatedly while developing because you'll navigate your browser there to access your own servers! You may also hear this referred to as localhost.
In IPv4 the loopback address is `127.0.0.1`. In IPv6, the loopback is `::1`.
There's also an "all interfaces" address. This address is used to catch any incoming requests regardless of intended destination. It's only used by the local machine: you would never send a packet to the "all interfaces" address, but a server that is listening on that address would see all incoming packets.
For IPv4, the "all interfaces" address is `0.0.0.0`. For IPv6, it's simply `::`.
Remember that the loopback and "all interfaces" addresses are not interchangeable! This is a common misconception you may encounter in tutorials online, and might be a trick question during a technical interview. If you're ever asked to connect to `localhost`, make sure you use the loopback.
Whew! The Internet Protocol is a dense topic, but a little familiarity will go a long way when you're debugging server connections. After reading this lesson, you should be able to:
Between HTTP and IP, we find an extra layer of information. We often refer to this as the transport layer of communication, and protocols used in it are referred to as transport protocols.
There are myriad transport protocols available, but we're going to cover the two biggest: TCP and UDP. We'll dive into:
We've already briefly mentioned the Transmission Control Protocol (TCP) when discussing the history of the Internet Protocol (IP). Both TCP & IP made up the original Transmission Control Program developed at DARPA in the 1970s. We've dug deep into IP now, and we have some understanding of HTTP. Why do we need more protocols?
Let's provide a practical example. Think about the process of delivering a package. Floor pickers take your package from a warehouse onto the back of a truck, and a dispatcher sends that truck to your house. There's a place on your porch just waiting for that package. How, then, does your package make it across the very last leg of its journey? Whoops - we forgot the delivery person!
Transport protocols act as our "delivery person". IP is concerned with machine-to-machine communication, and HTTP is designed for application-to-application communication. Transport protocols bridge the gap and help our data cover the last mile between the network and software.
Like every other part of the internetworking process, transport protocols use their own unique form of addressing. We call these ports. Ports are virtual interfaces that allow a single device to host lots of different applications & services. By lots, we mean a whole bunch - there are 65536 separate ports available to each transport protocol!
Ports are represented by numbers: port `80`, port `51234`, etc. If we know both the IP address and port we'd like to connect to, we can use a special notation where both are joined by a colon:
192.168.1.1:8080
This would point to port `8080` on a network interface with an IP address of `192.168.1.1`. We refer to an IP address & port written together in this way as a socket.
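A minimal sketch of splitting socket notation into its parts (the helper is illustrative and handles IPv4-style sockets only; IPv6 sockets are usually written with brackets, like `[::1]:8080`, and would need different handling):

```js
// Split "address:port" at the last colon, so the address may
// itself contain dots.
function parseSocket(socket) {
  const idx = socket.lastIndexOf(':');
  return {
    address: socket.slice(0, idx),
    port: Number(socket.slice(idx + 1)),
  };
}

parseSocket('192.168.1.1:8080');
// { address: '192.168.1.1', port: 8080 }
```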
The most common transport protocol used is TCP. TCP is a connection-oriented protocol, meaning it establishes a connection between two sockets. This connection acts as a safeguard from other error-prone protocols underneath it, including IP and Ethernet. Pieces of data sent via TCP (referred to as segments) respect a strict order and verify when they have been received. This means that data can't be "lost" across a TCP connection: if a segment is received out of order, the receiver will ask the transmitter to re-send the missing segment. This behavior makes TCP a reliable protocol.
We'll dive deeper into exactly how TCP verifies data & forms connections in a future lesson. For now, remember that "TCP == reliability". Any time it's critical that data arrives ordered and in full, TCP's the way to go! You'll see TCP used as the underlying connection for HTTP, file transfers, and media streaming. In all of these cases, missing data would result in corrupt files and unreadable data.
Because of everything TCP offers us, it's a relatively "heavy" protocol to use. Messages may take a bit longer to transfer than they would via other protocols, but you can be confident that your message has been received the way you intended it. This inherent slowness means applications using TCP may buffer data, or wait until a certain amount has been received before passing it to the user. You've probably seen this happen on your favorite video sharing sites!
The User Datagram Protocol (UDP) arrived on the scene a few years after TCP. Scientists working with TCP found that they sometimes didn't need all the order and reliability that TCP provided, and they were willing to trade that for raw speed. UDP is connection-less and provides no verification for whether data is received. Because of this, we refer to it as an unreliable protocol.
Hold on, though! By "unreliable", we certainly don't mean "useless". UDP is used in lots of familiar places: real-time video sharing, voice-over-IP phone calls, and DNS all rely on UDP as their transport protocol of choice. These services prioritize speed over reliability, so it makes sense that they would forego TCP's additional lag. If some data is lost along the way, that's okay - for example, you might just see lower-quality video for a moment.
Unreliable systems are valuable outside the world of transport protocols as well! Consider the postal service: most letters are sent without any sort of delivery confirmation or guarantee of arrival. It's up to the sender and/or recipient to manage expectations of when a letter ought to have arrived. This is similar to UDP. Data will be transmitted, and most will arrive, but if either side needs more reliability than that they will have to implement it themselves.
Transport protocols fill a gap in our current understanding of networks. They help us get data up from the network to our applications, and they give us a few options for fault-tolerance versus performance.
After reading this lesson, you should feel comfortable:
We've covered how connected devices communicate with each other, but we're missing a key piece: where humans fit into the equation! After all, the Internet would be a much more boring place if we had to remember the IP address of every website we chose to visit.
Let's look into the Domain Name System, a method of translating long numeric identifiers into friendly, human-readable addresses. We'll cover:
The Domain Name System (often just referred to as DNS) is a distributed approach to providing easily-understood names for internetworked devices. Practically, it's similar to a phone book: DNS allows us to look up a specific IP address by its domain.
In the early days of computer networking, connecting to another computer was a manual, complex process. A user would need very specific addresses to find the networked resource they were looking for, and those addresses were difficult to read/remember! A scientist named Elizabeth Feinler, working with ARPA in the early 1970's, saw a way to help. She started out with a simple text file listing computer names by their numeric addresses. This file grew in size as more systems joined ARPANET, and Elizabeth expanded her operation from a text file to a whole organization dedicated to keeping an up-to-date list of hostnames & IP addresses.
By the 1980's, it was clear that one organization wasn't enough to manage the growing Internet. The Domain Name System was invented as a way to distribute the work to numerous organizations, lightening the load and allowing much more rapid growth. The first DNS name server was written in 1984, and the rest is history!
DNS is one of the most important parts of allowing the Internet to grow so rapidly. It's perfect for quick scaling because it is both simple (relying on specifically-formatted text files) and distributed (redundant across numerous servers).
We've mentioned domains before today, but without much detail. Let's dive a little deeper into that term. A website's domain refers to the "friendly" name for the website's host, or the server providing the site's content. A domain differs from a URL in that the domain is only the server's identifier, not other application or protocol-related data in the URL.
Here's a breakdown of an average URL. We've highlighted the domain in green and labelled each part of the URL underneath:
A domain name can be split into a few parts:
The top-level domain (TLD) is the last part of the domain, appearing just before the URL begins pointing at application routes (usually indicated with `/`'s) or query parameters (indicated with a `?` and `&`'s). The best-known TLDs are `.com`, `.net`, and `.org`. TLDs are managed by special organizations that have demonstrated the ability to handle the immense workload involved, often known as domain registries. These registries may be government entities (for example, `.gov` is managed by the General Services Administration) or private companies that were awarded the privilege by ICANN.
To the left of the TLD, separated by a dot, is the second-level domain. You'll often hear the TLD & second-level domain lumped together as "the domain". This is the name most people associate with the website. Through domain registrars, consumers can purchase second-level domains. The registrar maintains a listing of each purchase.
Some websites will have additional domains to the left of the second-level domain. These can be referred to by their formal names (third-level domain, fourth-level domain, etc.) but are often informally referred to as subdomains. The best-known subdomain is `www`, though this is less-used on newer sites. Subdomains can usually be freely created by the consumer.
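Since the parts of a domain are just dot-separated labels read right to left, we can sketch the breakdown with a hypothetical helper (`splitDomain` is made up for illustration):

```javascript
// Hypothetical helper: label the parts of a domain, reading right to left.
function splitDomain(domain) {
  const labels = domain.split(".");
  const tld = labels[labels.length - 1];          // rightmost label
  const secondLevel = labels[labels.length - 2];  // the "name" most people know
  const subdomains = labels.slice(0, -2);         // anything further left
  return { tld, secondLevel, subdomains };
}

console.log(splitDomain("www.appacademy.io"));
// { tld: "io", secondLevel: "appacademy", subdomains: ["www"] }
```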
DNS does one thing really well: identifying connected devices by friendly names. How exactly does this work?
It all goes back to the magic word: domain. Each individual domain is represented by a set of name servers, which store information about the domain's registered subdomains. Name servers will direct a client where they need to go - even if that's another name server! We refer to this process of working out which name server we need as resolution. Eventually, we'll reach a name server that can tell us the specific IP address for the full domain. We refer to this as the authoritative name server for our domain. It has the final say!
When trying to resolve a domain name, we start from the rightmost part (the TLD) and work our way to the left. We'll stop once we've reached an authoritative server that can give a direct address for the domain we're seeking. Intermediate servers should be able to point us to the most-relevant name server to continue our search (usually, the next domain to the left). We can think of this as a conversation between the client and the available name servers for our domain, each one moving us closer to our goal.
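The right-to-left walk can be sketched as a toy lookup against an in-memory map. Everything here is made up for illustration: real resolution involves network queries to actual name servers, and the server names and address below are invented:

```javascript
// Toy sketch of iterative DNS resolution. All names/addresses are invented.
const nameServers = {
  ".": { "io.": "tld-server" },                       // root points at TLD servers
  "tld-server": { "appacademy.io.": "auth-server" },  // TLD points at the domain's servers
  "auth-server": { A: "104.28.31.159" }               // authoritative server has the address
};

function resolve(domain) {
  let server = ".";
  // Keep following referrals until a server answers with an A record.
  while (!nameServers[server].A) {
    const zone = Object.keys(nameServers[server]).find((z) =>
      (domain + ".").endsWith(z)
    );
    server = nameServers[server][zone];
  }
  return nameServers[server].A;
}

console.log(resolve("appacademy.io")); // "104.28.31.159"
```

Each hop in the loop corresponds to one "question" in the client/name-server conversation described above.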
Here's a practical example of how DNS is used to discover the authoritative name server for the fictional URL `https://students.appacademy.io`:
Looking for something a little more whimsical? DNSimple has a fantastic webcomic detailing the journey of a domain resolver. Check it out!
We can see how DNS works, but what does it actually look like? It's not much different than it was at the very beginning! Each name server maintains a zone file: a text file containing host names, IP addresses, and resource types. Here's an example of a simple zone file:
$TTL 299
my-site.com. IN SOA ns1.cloudflare.com. dns.cloudflare.com. 2032032092 10000 2400 604800 3600
my-site.com. IN NS ns1.my-site.com.
my-site.com. IN NS ns2.my-site.com.
my-site.com. IN A 104.28.31.159
my-site.com. IN A 104.28.30.159
my-site.com. IN AAAA 2606:4700:30::681c:1f9f
my-site.com. IN AAAA 2606:4700:30::681c:1e9f
www IN CNAME my-site.com.
ns1 IN A 104.28.31.150
ns2 IN A 104.28.30.150
my-site.com. IN MX 10 mail.google.com.
Each line in a zone file includes the affected domain, type of record on that line, and the data for that record. Let's discuss some of the most common DNS record types in the order we see them above:
SOA
The SOA record represents the Start Of Authority. This record lets us know which name server is the master, or primary authority, for the domain we're querying. The SOA record is the minimum requirement in a zone file - every name server will return this record, if nothing else!
NS
NS records point to name servers for the zone. Most zones will have at least two NS records for redundancy. Remember that one of DNS's strengths is that it's distributed. If one name server loses power or becomes disconnected, we don't lose access to the zone.
A / AAAA
A records are the most important DNS records present. They map a resource directly to an IP address. This is the core of what DNS is for: connecting the domain directly to a machine. `A` records are used for IPv4 addresses, while `AAAA` records perform the same function for IPv6.
CNAME
The CNAME record acts as an alias, linking one domain to another. In our example above, we're saying that `www.my-site.com` should point at the same resource as `my-site.com`. Notice that the `www` doesn't have a `.` after it. This means it's a relative reference, and the additional parts of the domain for this zone (`my-site.com.`) are implied. When a domain in a zone file ends in a `.`, we can treat it as an absolute reference with no unwritten subdomains.
MX
DNS: It's not just for websites! MX records, short for Mail Exchanger, are used by e-mail clients to direct messages to the appropriate mail servers. These records let you send messages to "friend@gmail.com" instead of having to remember "friend@123.45.67.89"!
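Because zone files are just specifically-formatted text, a few lines of code can pull records out of them. This is only a sketch for the simple `name IN TYPE data` shape used above; real zone files allow per-record TTLs, comments, and multi-line records that this doesn't handle:

```javascript
// Minimal parser for simple "name IN TYPE data" zone file lines.
function parseRecord(line) {
  const [name, , type, ...data] = line.trim().split(/\s+/); // skip the "IN" class field
  return { name, type, data: data.join(" ") };
}

console.log(parseRecord("my-site.com. IN A 104.28.31.159"));
// { name: "my-site.com.", type: "A", data: "104.28.31.159" }
```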
There's one piece we've overlooked in our example zone file above: the first line. `$TTL 299` refers to the Time to Live (TTL) for our records. This is a measure of how long a record should be cached by a DNS name server.
We cache DNS queries because reading from a file can be slow! When a query comes in for a particular domain, the name server will cache the result in memory so that subsequent requests are much faster. However, this in-memory copy won't be updated if the zone file changes - yikes! The TTL lets us set how often a cached record should be discarded and read from the zone file again. This is especially important if we are pointing at a service where the IP might change frequently, like a local development environment or shared hosting service.
In our example, we've set the TTL for all records in the zone file to 299 seconds. This means that if your current web host goes offline and you have to point your domain at a new server, the downtime won't last more than approximately five minutes. This also means that you'll be re-checking your zone file for that domain at least once every five minutes. If you're confident in your hosting and aren't making infrastructure changes, longer TTLs can result in slightly increased performance.
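The cache-then-expire behavior can be sketched in a few lines. `lookupFromZoneFile` below is a stand-in for the slow zone-file read, and `makeCachingResolver` is a hypothetical name, not a real API:

```javascript
// Sketch of TTL-based caching around a slow zone-file lookup.
function makeCachingResolver(lookupFromZoneFile, ttlSeconds) {
  const cache = new Map();
  return function lookup(domain, now = Date.now()) {
    const hit = cache.get(domain);
    if (hit && now - hit.cachedAt < ttlSeconds * 1000) return hit.address; // fresh: reuse
    const address = lookupFromZoneFile(domain); // stale or missing: slow re-read
    cache.set(domain, { address, cachedAt: now });
    return address;
  };
}

let reads = 0;
const lookup = makeCachingResolver(() => { reads++; return "104.28.31.159"; }, 299);
lookup("my-site.com", 0);      // miss: reads the zone file
lookup("my-site.com", 1000);   // hit: served from memory
lookup("my-site.com", 300000); // TTL (299s) has expired: reads again
console.log(reads); // 2
```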
The Domain Name System is a great example of a simple process (linking names to locations) evolving over time to support greater and greater needs. It's frightening to think of how difficult navigating around the Internet would be if we didn't have DNS to make websites easily accessible!
Before we move on, here's a quick tip: DNS questions are popular fodder for technical interviews. You may be asked to define a particular record type or to walk through a rough outline of what happens when you type a URL into your browser and click "Go". Try thinking through this process with your new knowledge!
After reading this lesson, you should have a better understanding of:
We've discussed a lot of data- and internal-communication protocols, but what supports these? Let's examine some of the most important hardware you'll see while examining computer networks!
We'll cover:
Network protocols mean very little if we don't have a physical way of connecting computers together! Whether it's via copper cables, fiber optics, or wireless networks, we need ways of managing communications to put those protocols into action. A quick search for "networking hardware" will yield a slew of results, but don't get overwhelmed! We can boil many of these devices down to three types: hubs, switches, and routers.
A hub is the simplest networking device you're likely to find in service. It performs no network management and might be better known as a "signal splitter". When a hub receives data, it duplicates that data and broadcasts it to all connected devices. That's it!
Hubs tend to be cheap and are often found in older networks. They are usually small metal boxes with a handful of physical connectors. You can get hubs with lots of connectors, but they're usually a little smaller - think 5 or 10 instead of 30 or 40. This is due to the natural limitations a hub possesses.
Heads up! We'll refer to the physical sockets that cables plug into as connectors, but you'll often hear them called ports instead. This can get very confusing, so be sure you're clear about the difference between virtual ports used by transport protocols and physical ports used by hardware. When in doubt, use a clearer term, like "connector" or "jack".
For one, a hub can't do any sort of filtering. This means every single data packet is sent to every single device, all the time. This creates a lot of unnecessary load on the network. Imagine if every time you called a friend, all of your other friends' phones rang too. Yikes! Additionally, hubs share bandwidth across devices, so heavy traffic can result in lower speeds. We'll sometimes see this problem on overloaded networks with other devices, but on a hub it's guaranteed.
Hubs were a helpful and necessary piece of hardware for a long time, but today there's little reason to use them. They may still be slightly cheaper than a switch, but the limitations outweigh most cost concerns. The best use for a hub now is as a temporary replacement while replacing a broken device.
A step up from the hub, we find a network switch. Switches are "intelligent" hubs: they track devices connected to them, help manage network load, and can manage separate internal networks with ease! The biggest thing that separates a switch from a hub is the MAC address table.
A network switch maintains an internal address book containing the MAC addresses of the devices connected to it. Remember that data frames contain both a source and destination MAC address? This is how the switch stays up to date! It uses this data to perform one of three actions with each piece of data it receives:
flood: When a destination address is unknown, the switch will flood received data out to all connected devices except where the data came from. When the intended recipient responds, the switch will update its MAC address table accordingly. This is how switches learn about connected devices, and it's significantly more efficient than a hub's behavior of flooding all the time!
forward: When a switch already has the destination MAC address in its internal table, it can send data directly to that device. This is called forwarding the data. No other devices connected to the switch are made aware of this data.
filter: Sometimes a switch will receive data on the same connector the data is destined for. In these cases, the switch will filter, or drop, the data entirely. This can be a little confusing to think about! If data arrives on the same connector it would later be sent out of, then we can assume the data was handled by some other part of our network, and the receiving switch can't do anything to help. Remember that this is very specifically related to the physical connector the data is received on: a switch will never act on data that comes in and goes out the same connector.
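The three actions boil down to one lookup in the MAC address table plus a connector comparison. Here's a sketch of that decision, with made-up MAC addresses and connector numbers:

```javascript
// Sketch of a switch's flood/forward/filter decision.
// macTable maps MAC addresses to the physical connector they were learned on.
function switchAction(macTable, frame) {
  const knownConnector = macTable.get(frame.destination);
  if (knownConnector === undefined) return "flood";        // unknown: send to everyone
  if (knownConnector === frame.arrivedOn) return "filter"; // same connector: drop it
  return "forward";                                        // known: send directly
}

const macTable = new Map([["aa:aa", 1], ["bb:bb", 2]]);
console.log(switchAction(macTable, { destination: "cc:cc", arrivedOn: 1 })); // "flood"
console.log(switchAction(macTable, { destination: "bb:bb", arrivedOn: 1 })); // "forward"
console.log(switchAction(macTable, { destination: "bb:bb", arrivedOn: 2 })); // "filter"
```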
Switches often look just like hubs, but come in a much larger range of sizes. They can be chained together to cover large networks, or a 5-connector switch might be used for a small home network. Switches have improved on hubs' limitations: they don't share bandwidth, so you'll see less impact on speed through a switch than you might through a hub.
If you're building a home or single-location office network today, switches are your friend. They provide lots of simple management functionality for not much cash, and they're easy to keep around.
Here's a thought experiment: let's think about what a switch for a national network might look like. Since switches can be chained together, we wouldn't need thousands of connectors - but we would need lots of memory! The MAC address table would need to hold entries for every computer in the country. No way!
Instead of trying to solve this problem with switches, we have a higher-level device: the router. Routers connect separate networks with each other. Instead of identifying devices via MAC address, they use IP addresses to make decisions about data.
A router, like a switch, maintains an internal table of addresses. This routing table is used to pass received data on through the network. Data may move on to another router, or the router may recognize the data and pass it to an internal switch.
Routers also participate in an important process called NAT, short for Network Address Translation. NAT helps minimize IP address overload by giving the router a single IP address to use for all external communication. The router then uses IP ports to map incoming data to internal device IP addresses in its routing table. Imagine living in an apartment complex with a mail office. The postal service could bring packages with your apartment number to the front desk, and the mail officer would dole those packages out, but the sender would never actually have to know your name or exactly where your apartment is located! NAT provides a tiny bit of security and allows significantly more computers to coexist on the Internet simultaneously.
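NAT's port mapping can be sketched as a small table that rewrites outbound traffic to the router's one public address and routes replies back inside. All addresses here are examples, and real NAT tracks much more state (protocols, timeouts, existing mappings):

```javascript
// Sketch of NAT port mapping. Addresses are made-up examples.
const PUBLIC_IP = "203.0.113.7"; // the router's single public address
let nextPort = 40000;
const natTable = new Map();      // public port -> internal socket

function translateOutbound(internalIp, internalPort) {
  const publicPort = nextPort++;
  natTable.set(publicPort, { internalIp, internalPort });
  return { ip: PUBLIC_IP, port: publicPort }; // what the outside world sees
}

function translateInbound(publicPort) {
  return natTable.get(publicPort); // undefined = no mapping: drop the packet
}

const mapped = translateOutbound("192.168.1.50", 51234);
console.log(mapped);                        // { ip: "203.0.113.7", port: 40000 }
console.log(translateInbound(mapped.port)); // { internalIp: "192.168.1.50", internalPort: 51234 }
```

Like the apartment mail office, the sender only ever sees the public socket; the table is what lets the router "dole out" replies internally.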
Because of the extra processing power required to handle large routing tables & filtering, routers tend to cost substantially more than switches. However, most networks only need one router! We sometimes describe routers as being our gateway to the Internet.
Physically, routers come in all shapes and sizes. They only need two connectors (one incoming, one outgoing), but they often come as part of an integrated device with multiple connectors and functions.
That's enough theory! Let's think through a practical example of each of the pieces of network hardware we've discussed.
Imagine you need to get a message to a friend across the room. There are lots of people between you and your friend: how might you communicate?
One way might be to shout. If you shout your message to the room, the rest of the people in the room could repeat it to make sure it's heard! This is going to result in some ringing ears, but your friend will definitely hear the message. This is how hubs work: broadcast to everyone, no matter who's listening.
An alternative might be to whisper. You could pass a message to the person right next to you. Of course, they won't know everyone in the room, so they'll have to ask the people close to them to pass it along. It may take a bit to get to your friend the first time, but any responses will be lightning-fast since everyone now knows who to talk to. This is a switched model: flood the message once to learn how to get to the destination, then use what we've learned to forward & filter appropriately.
Finally, imagine your friend has left the room. Uh oh! We could still use the shouting or whispering approach, but none of that matters since we now need to find the correct room your friend is in. To do this, we'd need people familiar with each room. We could pass our message to those gatekeepers, who could pass the message for us from room to room until it reaches the correct one. At that point, the process will reverse in the new room: the gatekeeper will whisper or shout, the room will respond accordingly, and responses will come back via the gatekeepers. The gatekeepers are acting as our routers here: they pass our messages between rooms, but don't necessarily care how the room communicates internally.
Notice in this example that each type of communication has a purpose, and all of them work together. We're simplifying things, but it's easy to concoct an example where we need all three types, or where we only need one. This is true of hardware as well. You should choose the correct devices for the network you're building. Not thinking through your needs may result in a steep cost, both in terms of performance and in terms of money!
It's easiest to think of these three classes of hardware as separate, distinct devices. However, this won't always be the case. Let's discuss some situations where these devices exist in unfamiliar packages.
When you set up Internet at a new home, you usually get a modem from your ISP. Years ago, this modem was dead simple: one inbound connection for your phone line or cable, and one outbound connection for your home PC. Today, however, our homes have multiple devices! Consumers became increasingly frustrated with getting a modem from one company but still needing a router and/or switch to connect all their computers.
In response, ISPs upped their game by integrating extra devices in their "modems". The average home gateway today includes:
Whoa! That's a lot of hardware in one device. This behavior has blurred the lines between types of hardware and made communication about networked devices more difficult. Your ISP might call that single device a "modem", a "router", a "gateway", or an "access point". All of these are true!
Before making decisions about a network you're investigating, make sure you understand what each device is doing. Does that router include a switch? Is the modem a simple modem or does it include a hub as well? As always, make sure you're using the right tool for the job.
There are lots of different types of network hardware out there, but most of them fall into three separate categories. Hopefully, the next time you see inside a messy server closet, you'll be curious about which parts you can identify!
After this reading, you should be comfortable with:
The Transmission Control Protocol (TCP) is the backbone of the Internet. The majority of services we use rely on it, and a deeper understanding of it will help you navigate those services with ease.
Let's deep-dive into TCP! We'll cover:
Just like the Internet Protocol (IP) uses packets, TCP uses data units called segments. Segments are formed from application data: TCP receives this data, breaks it down into transmittable units, and attaches a header to each unit. This header contains everything we need to ensure a reliable connection is established.
As with IP, TCP header fields are critical to understand. They help satisfy two needs of the protocol: reliable data transfer and consistent connections.
Here's an overview of each field in the order they appear:
Source / Destination Port: The first two fields indicate which port the request originated on and which port it's directed to. This will be used, along with the IP address in the IP wrapper containing the segment, to determine which sockets to use for the TCP connection.
Sequence Number: This number is used to establish the correct ordering of data. At the start of a connection, TCP sets an Initial Sequence Number (ISN) that's sufficiently large to avoid conflicts. Each byte of data transferred is then counted, beginning at the ISN, and used to calculate the sequence number for the segment. Sequence numbers go hand-in-hand with the ...
Acknowledgement Number: This number lets the sender know which sequence is expected next. Acknowledgement numbers are cumulative, which is how TCP maintains proper ordering of segments. The receiver will respond with an acknowledgement number equal to the last sequence number plus the length of the data received - the position of the next byte it expects. For example: if the last sequence number received was `10`, and the accompanying data was 4 bytes long, the response will include an acknowledgement number of `14` - meaning "I'm ready for sequence number 14!".
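Consistent with the worked data-transfer example later in this lesson (sequence number 4, data of length 3, acknowledged with 7), the acknowledgement arithmetic is just the sequence number plus the data length - the next byte the receiver expects:

```javascript
// The acknowledgement number is the position right after the received data:
// the next byte the receiver expects.
function nextAck(sequenceNumber, dataLength) {
  return sequenceNumber + dataLength;
}

// 4 bytes starting at sequence number 10 cover bytes 10-13, so the
// receiver answers: "I'm ready for sequence number 14!"
console.log(nextAck(10, 4)); // 14
```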
Data Offset: Defines how long the segment header is. This lets us know if there are options later on in the header or not.
Reserved: A short range of three bits, held over for later use. These will always be `0`.
Control Flags: These nine bits drive the TCP connection process. We'll break them down in more detail soon.
Window Size: This field is used to let the sender know how much data the receiver can accept. This helps maintain the reliability of a connection: if a receiver is getting overloaded, they can lower the window size as a way of saying "slow down!". If a slow connection can move faster, a larger window size is a way of saying "bring it on!".
Checksum: The checksum is an error-checking mechanism. Checksums are used to check the validity of a particular segment, not the whole series of segments (as with sequence/acknowledgement). The TCP client can use the checksum to ensure the segment has been received correctly. If it doesn't match expectations, the segment will be discarded & ignored.
Urgent Pointer: TCP allows for data to be marked as urgent. This means it should be processed right away, regardless of sequence, interrupting any other transfer in process. This is useful when trying to terminate a long transfer, as we'd like to kill the connection without waiting for it to complete. This field indicates how much urgent data to expect, if there is any.
Options: Like most other protocols, the TCP segment header includes a range for options at its end. There may be no options, or there may be a handful! We can verify whether there are options by comparing the length of the header so far to the data offset field. TCP header options are mostly used for flow control and may even include padding to fill out the expected length of the header with empty data.
Immediately after any options (or after the urgent pointer, if no options are given), the encapsulated data from our application begins.
Remember that TCP is a connection-oriented protocol. This means that it establishes a long-running line of communication between two points, instead of just shouting into the Internet void like UDP. Establishing this connection involves a series of predictable steps, each with specific names. Most of these steps are driven by the control flags in the segment headers.
The segment header has 12 bits reserved for control flags. The first three of these are currently unused and will always be zero, and the next three are all used by congestion-management extensions to the protocol. The control flags that most concern us are the final six bits, each known by a short three-letter name. Let's review them in order:
URG: The "urgent" bit. Lets us know if this segment contains urgent data.
ACK: The "acknowledgement" bit. Setting this bit means a message has been received successfully.
PSH: The "push" bit. This bit is used to indicate that buffered data should be passed on to the connected application.
RST: The "reset" bit. This bit means we should reset the connection. Receivers will send an RST segment when they receive unexpected data, either to a port that's not listening or dramatically out-of-sequence.
SYN: The "synchronize" bit. This flag is set on the first segment from each side of the connection, and should include an ISN for the socket to begin sequencing from.
FIN: The "finished" bit. This bit lets each side know that transmission is done and the connection may be closed.
Note that we mention all twelve bits here as "control bits" even though the first three are unassigned. While they are currently unused, protocols evolve frequently! The TCP specification lists "six reserved bits, six control bits", but newer specs have claimed some of those. For this reason, you should think of the remaining three reserved bits as "control flags still under development".
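Since each flag is a single bit in the header, a set of flags is just a bitmask. The sketch below is purely illustrative (real implementations read these bits straight out of the header bytes); the bit positions follow the order the flags are listed above, URG down to FIN:

```javascript
// Encode/decode the six named control flags as a bitmask.
// Bit positions follow header order: URG is the highest of the six, FIN the lowest.
const FLAGS = ["URG", "ACK", "PSH", "RST", "SYN", "FIN"];

function encodeFlags(names) {
  return names.reduce((bits, name) => bits | (1 << (5 - FLAGS.indexOf(name))), 0);
}

function decodeFlags(bits) {
  return FLAGS.filter((name, i) => bits & (1 << (5 - i)));
}

const synAck = encodeFlags(["SYN", "ACK"]);
console.log(decodeFlags(synAck)); // ["ACK", "SYN"]
```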
TCP connections begin with a process called a three-way-handshake, also sometimes referred to as SYN-SYN-ACK. This name comes from the three interactions before the connection is officially "open":
1. The client sends a `SYN` segment.
2. The server responds with a segment carrying an `ACK` and its own `SYN`.
3. The client `ACK`nowledges with another segment to the server. Now both sides are ready to go!

Notice that we can send both a `SYN` and `ACK` on a single segment. This is called piggybacking and saves us a ton of requests! The three-way handshake would become a four-way handshake if we had to send the `SYN` and `ACK` separately.
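The handshake above can be walked through in code, tracking only the flags and the sequence/acknowledgement numbers. The ISNs here are arbitrary examples (real ISNs are large and hard to predict):

```javascript
// Toy walk through the three-way handshake (SYN, SYN-ACK, ACK).
function handshake(clientIsn, serverIsn) {
  const steps = [];
  // 1. Client opens with a SYN carrying its ISN.
  steps.push({ from: "client", flags: ["SYN"], seq: clientIsn });
  // 2. Server piggybacks its own SYN with an ACK of the client's SYN.
  steps.push({ from: "server", flags: ["SYN", "ACK"], seq: serverIsn, ack: clientIsn + 1 });
  // 3. Client ACKs the server's SYN; the connection is open.
  steps.push({ from: "client", flags: ["ACK"], seq: clientIsn + 1, ack: serverIsn + 1 });
  return steps;
}

console.log(handshake(100, 300));
```

Since these segments carry no data, each acknowledgement only increments by one - matching the ladder diagram discussed next.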
TCP connections are often visualized using ladder diagrams, also sometimes called timing diagrams. Let's take a look at one for the three-way handshake:
This diagram should be read top-to-bottom. Each arrow represents a single segment being passed between hosts. You can see how the client initiates the request, but the server mirrors the process. This ensures both sides are ready to work: if the client's `SYN` request fell on deaf ears, we would expect an `RST` segment back.
You can also see how the sequence number and acknowledgement number are incremented for each segment. Since these initial segments contain no data, each only increments by one. During data transfer, the numbers will increment based on the length of data received so far.
TCP maintains timers for most behaviors to ensure that connections don't hang empty forever. This is one reason time-based diagrams are so helpful: by adjusting the angle of the arrows between the client & server, we can indicate a faster or slower connection. This can be helpful for visualizing connections on a granular level. For example, here's a simplified diagram of the same three-way handshake with a slow server response:
Once the connection has been established, we're off to the races! The client will send data segments over and the server will respond with `ACK` segments. Remember that the sequence number in each data segment indicates where our data starts, and the corresponding acknowledgement number should be the next position after our data ends. For example, a data segment with a sequence number of `4` and data of length `3` will be `ACK`'ed with an acknowledgement number of `7`: our data included `4`, `5`, and `6` in the sequence, and the server is letting us know that it's ready for data beginning at sequence number `7`.
Here's a fun way of visualizing this concept via text messages between the client & server:
Note that there won't be any more `SYN` segments unless the connection terminates unexpectedly: we only send segments with the `SYN` flag enabled when initializing a connection.
The acknowledgement number is important for keeping the TCP connection reliable. It will only increment when a segment is successfully received, so an `ACK` response to the client with a lower acknowledgement number than the client's current sequence number means a segment was missed and must be retransmitted.
A diagram is worth a thousand words:
This is a major part of TCP, and one of the reasons it's both reliable and slow: data may need to be retransmitted frequently, but the end result is always a full & correct payload on the server.
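The sender's side of this check can be sketched as a single comparison; this is a simplification of real retransmission logic, which also involves timers and duplicate-ACK counting:

```javascript
// Sketch of the sender's retransmission check: an acknowledgement lower than
// our next sequence number means the receiver is missing data from that point.
function nextSendPosition(currentSeq, ackFromReceiver) {
  return ackFromReceiver < currentSeq
    ? ackFromReceiver // segment lost: back up and resend from the missing byte
    : currentSeq;     // all caught up: keep going
}

console.log(nextSendPosition(20, 14)); // 14 (a segment was missed)
console.log(nextSendPosition(20, 20)); // 20 (everything received)
```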
Once we've sent all our data across the wire, it's time to say au revoir. By default, TCP closes open connections similarly to the way they're opened: lots of handshakery! This time, though, it's a four-way handshake.
Let's take a look at a diagram of a TCP connection closing:
This diagram is similar to the three-way handshake we looked at for connection establishment, so it may raise a question for you: why the extra handshake? The reason, as usual, is reliability!
Remember how TCP maintains timers between segments? This is because no matter how much reliability we've worked into transport protocols, they're still built on top of unreliable protocols and infrastructure. The same is true of a closing connection: we don't want to act too quickly or we may miss a piece of extra important data.
When closing a connection, both sides wait a beat before actually closing. This allows any delayed segments to slip in at the last minute! This also means it is impractical for the server to send a piggybacked `FIN` & `ACK` in the same way it sends a `SYN` & `ACK` to open the connection. If the server waited before sending an `ACK`, the client may think something went wrong and begin retransmitting. To prevent this, the server responses are separated:
ACK
to the client's first FIN
right away,FIN
segment to let the client know it's
shutting down.The client will ACK
nowledge immediately, but may also wait a bit before truly
closing, just in case. With all the handshaking and waiting around, you might
call TCP a "considerate" protocol as well!
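The four-way close can be summarized as a simple sequence of segments. This is just illustrative data (the names are our own), not real socket code:

```python
# The four-way close, modeled as (sender, flag) pairs in the order
# the segments travel across the wire.
FOUR_WAY_CLOSE = [
    ("client", "FIN"),  # client is finished sending data
    ("server", "ACK"),  # server acknowledges immediately...
    ("server", "FIN"),  # ...then sends its own FIN once it's ready
    ("client", "ACK"),  # client acknowledges, waits a bit, then closes
]

# Unlike the combined SYN & ACK of the opening handshake, the
# server's ACK and FIN here travel as two separate segments:
server_segments = [flag for sender, flag in FOUR_WAY_CLOSE if sender == "server"]
print(server_segments)  # ['ACK', 'FIN']
```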
Remember that a TCP connection is between two sockets, or joint IP address/port pairs. As the connection progresses, these sockets change state. For example, during the process of data transfer, both sockets are considered to be in the `ESTABLISHED` state, while before a connection is established a server's open socket would be in the `LISTEN` state.
It's important to remember that this flow isn't identical for the client and server: after the client sends its `SYN`, it enters a `SYN SENT` state, while the server enters a `SYN RCVD` (or "`SYN` Received") state of its own.
Here's a diagram of a simple data transfer from start to finish. Notice that we've added the socket states for each side of the connection outside the diagram. These may be noted by network tools and can be helpful to reference if you find yourself debugging a TCP connection problem:
The original TCP specification includes an alternative text-based chart for the lifecycle of these socket states, and you can read more about this process on the Wikipedia page for TCP.
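As a rough sketch, the client-side states mentioned in this lesson can be modeled as a tiny transition table. The event names below are our own shorthand, and this deliberately omits many states and edge cases from the full specification:

```python
# A toy state machine for one common client-side path through a TCP
# connection's lifecycle: open, transfer, close. Not a full
# implementation of the TCP state diagram.
CLIENT_TRANSITIONS = {
    ("CLOSED",      "send SYN"):     "SYN_SENT",
    ("SYN_SENT",    "recv SYN/ACK"): "ESTABLISHED",
    ("ESTABLISHED", "send FIN"):     "FIN_WAIT_1",
    ("FIN_WAIT_1",  "recv ACK"):     "FIN_WAIT_2",
    ("FIN_WAIT_2",  "recv FIN"):     "TIME_WAIT",
    ("TIME_WAIT",   "timeout"):      "CLOSED",
}

def run(events, state="CLOSED"):
    """Apply each event in order and return the resulting state."""
    for event in events:
        state = CLIENT_TRANSITIONS[(state, event)]
    return state

print(run(["send SYN", "recv SYN/ACK"]))  # ESTABLISHED
```

Tracing a full list of events through `run` is a quick way to check your understanding of which state each side should be in at any point in the diagram.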
Wow! It's incredible to visualize everything happening across the wire when you browse the Web. TCP is such a foundational piece of your daily interactions with the Internet and will only become more so as you begin contributing to the Web yourself!
After reading this lesson, you should have a clear understanding of:
traceroute
Remember that the Internet is a "network of networks"1. This can make it tough to debug problems between networks: how do we identify the culprit? Enter traceroute! Let's explore this utility and learn how to find problems with inter-network connections.
We'll cover: what `traceroute` is and when to use it, how to read `traceroute` output, and how to use `traceroute` to solve problems.

When we create connections between networks, they're rarely direct. We've already discussed how the IP protocol works to connect devices across multiple intermediaries. While the theory and addressing make sense, it raises a new challenge: how do we solve problems when we can't get access to the other networks involved?
Thankfully, there's a utility that lets us peek at devices along the way. It's called `traceroute`. The `traceroute` utility runs on the command line and uses UDP packets to monitor each device that data passes through as it moves from the source location to the target location. Using this utility, we can determine where a network failure or slowdown might be occurring.
`traceroute` vs. `tracert`

If you do some research of your own, you may find `traceroute` referred to as `tracert`. These utilities work slightly differently. `tracert` is included with the Windows operating system, and uses ICMP (an alternative protocol also used by the `ping` utility) to trace data. `traceroute` is on Unix-based operating systems, including macOS, and uses UDP. We'll prefer `traceroute` here, but the output of both utilities is almost identical! If you find yourself investigating network problems on a Windows computer, the skills covered here will translate seamlessly.
You'll sometimes hear using `traceroute` referred to as "running a trace".
Running a trace of your own is easy - enter the following on your command line:
> traceroute appacademy.io
Boom! Your first trace! So what does all that gibberish mean? Let's break it down.
Here's a screenshot of `traceroute appacademy.io` from my terminal:
First, let's look at the overall breakdown. At the top of the trace, you'll see some general info. There's a warning about "multiple addresses", meaning that `appacademy.io` resolves to more than one IP address. This is common for popular websites that use multiple servers to reduce traffic and keep speeds high. Next, you'll see the true beginning of the trace, including the domain we're tracing to, the IP address it resolved to, a maximum number of hops ("64"), and a packet size ("52 byte").
We mentioned hops when discussing networking protocols, but here's a quick review: a "hop" is one connection to another server. Think of the houses on your street or apartments on your hall. A single hop would be walking to your next door neighbor's home. Three hops would be walking three doors down. When we run `traceroute`, it tracks the location of each hop, but it limits the number of hops to make sure we're not searching for an unavailable address forever! Our `traceroute` won't go more than 64 hops, though this default may differ across systems.
Below the general info, we see a numbered list of addresses. Each line represents one hop, and includes some important info about that destination. Let's check it out!
Here's our first hop:
We know it's the first because of the `1` on the left side. Each hop is preceded by a number indicating how many hops it took to get there.
Next, we see two IP addresses. These identify the network location our trace has reached. In this case, the location has no resolvable DNS name so we just see the IP address. If you look ahead a bit, you can see that addresses with a resolvable name will show that name first instead.
Finally, we see three numbers. These are time intervals (indicated by the "ms", short for "milliseconds") that let us know how long it took us to reach this location from our system. When we run a trace, `traceroute` attempts each hop three times. This is because UDP (and the Internet beneath it!) is inherently unreliable. Testing the hop three times ensures we get truly representative data and not a false reading due to a dropped packet or network congestion. Each of the three numbers we see is the response time from one of those attempts. They'll always vary slightly, but we can get a good idea of the average response time. In this case, the numbers are very small: two tenths of a millisecond! Wow!
This first hop is my home internet router! We can tell this not only because it's the first device reached beyond my own computer, but also because the IP address is within one of IPv4's reserved ranges, meant for private networks inside homes and businesses. It's unlikely you'll see reserved addresses outside of your current network.
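If you're curious whether an address you spot in a trace falls in one of IPv4's reserved private ranges, Python's standard-library `ipaddress` module can tell you (the sample addresses below are just common examples):

```python
# Check whether an address belongs to a reserved private range
# (e.g. 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16).
import ipaddress

for addr in ["192.168.1.1", "10.0.0.5", "8.8.8.8"]:
    ip = ipaddress.ip_address(addr)
    print(addr, ip.is_private)
# 192.168.1.1 True
# 10.0.0.5 True
# 8.8.8.8 False
```

A `True` here on the first hop is a strong hint you're looking at your own router rather than a device out on the public Internet.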
Hops proceed from our own device to the gateway for the device we're trying to reach. We can analyze each hop using the same breakdown.
The first hop in our trace is straightforward, but that's not always the case! Let's take a look at a few not-so-standard situations you may see come up.
Notice that our second hop doesn't have any of the info we expect from `traceroute`. Instead, it has three asterisks:

In `traceroute`-speak, an asterisk (`*`) represents a hop with no response. This doesn't necessarily mean the hop failed, just that our system didn't get a response back! This could be due to server configuration or a slow connection. By default, a hop will return a `*` if there's no response for five seconds. There are three asterisks in this case to indicate that all three attempts went unanswered.
Let's also take a look at our eighth hop:
This one entry has multiple addresses! What's going on? This is an example of load balancing in action. In this case, the router at our seventh hop is directing traffic to multiple locations. This balances the load and ensures that one single router isn't handling too much.
In this case, one of our hop attempts went to the `atln.ga` (Atlanta, Georgia) router, while the other two went to the `snjs.ca` (San Jose, California) router. We still see response times for all three hop attempts.
Because of load balancing, your connection to `appacademy.io` isn't guaranteed to be the same each time. Try running `traceroute appacademy.io` again - do you see the exact same addresses as before?
Tracing is most useful for diagnosing network connectivity issues. Imagine your internet at home suddenly goes down! You can `traceroute` to a familiar domain to see whether the connection fails before it gets to your own router (in which case, it's likely a problem on your device) or at a network outside yours.
You can also diagnose slow connections this way! If a hop has a very long response time (> 50 ms), then it's possible that a previous device in line is experiencing downtime or network congestion.
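That rule of thumb is easy to turn into a tiny helper for reading a trace. The function name `summarize_hop` and the 50 ms default are our own choices, purely for illustration; it just averages the three per-hop readings and flags slow hops:

```python
# Average the three round-trip times reported for one hop and flag
# it if the average exceeds the "slow" threshold (50 ms by default).
def summarize_hop(rtts_ms, threshold_ms=50):
    avg = sum(rtts_ms) / len(rtts_ms)
    return avg, avg > threshold_ms

avg, slow = summarize_hop([0.2, 0.2, 0.2])   # our speedy first hop
print(round(avg, 1), slow)  # 0.2 False

avg, slow = summarize_hop([62.0, 71.5, 80.3])  # a congested hop
print(slow)  # True
```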
`traceroute` is simple to use on its own, but it does include some command line arguments that can enhance its abilities. Check out `man traceroute` to learn more about what it can do!
When in doubt, trace the route! Tracing network traffic using the `traceroute` utility is a great way to identify what's happening outside your own network.
After this lesson, you should be comfortable: using `traceroute` to diagnose connection problems, reading the output of the `traceroute` command, and recognizing when `traceroute` is the correct tool for the job.

1: https://www.encyclopedia.com/computing/news-wires-white-papers-and-books/network-networks
Wireshark, a network analysis tool formerly known as Ethereal, captures packets in real time and displays them in a human-readable format. Wireshark includes filters, color coding, and other features that let you dig deep into network traffic and inspect individual packets.
This tutorial will get you up to speed with the basics of capturing packets, filtering them, and inspecting them. You can use Wireshark to inspect a suspicious program's network traffic, analyze the traffic flow on your network, or troubleshoot network problems.
You can download Wireshark for Windows or macOS from its official website. If you're using Linux or another UNIX-like system, you'll probably find Wireshark in its package repositories. For example, if you're using Ubuntu, you'll find Wireshark in the Ubuntu Software Center.
Just a quick warning: Many organizations don't allow Wireshark and similar tools on their networks. Don't use this tool at work unless you have permission.
After downloading and installing Wireshark, you can launch it and double-click the name of a network interface under Capture to start capturing packets on that interface. For example, if you want to capture traffic on your wireless network, click your wireless interface. You can configure advanced features by clicking Capture > Options, but this isn't necessary for now.
As soon as you click the interface's name, you'll see the packets start to appear in real time. Wireshark captures each packet sent to or from your system.
If you have promiscuous mode enabled—it's enabled by default—you'll also see all the other packets on the network instead of only packets addressed to your network adapter. To check if promiscuous mode is enabled, click Capture > Options and verify the "Enable promiscuous mode on all interfaces" checkbox is activated at the bottom of this window.
Click the red "Stop" button near the top left corner of the window when you want to stop capturing traffic.
The packet list pane, located at the top of the window, shows all packets found in the active capture file. Each packet has its own row and corresponding number assigned to it, along with each of these data points:
When a packet is selected in the top pane, you may notice one or more symbols appear in the No. column. Open or closed brackets and a straight horizontal line indicate whether a packet or group of packets are part of the same back-and-forth conversation on the network. A broken horizontal line signifies that a packet is not part of the conversation.
You'll probably see packets highlighted in a variety of different colors. Wireshark uses colors to help you identify the types of traffic at a glance. By default, light purple is TCP traffic, light blue is UDP traffic, and black identifies packets with errors—for example, they could have been delivered out of order.
To view exactly what the color codes mean, click View > Coloring Rules. You can also customize and modify the coloring rules from here, if you like.
If there's nothing interesting on your own network to inspect, Wireshark's wiki has you covered. The wiki contains a page of sample capture files that you can load and inspect. Click File > Open in Wireshark and browse for your downloaded file to open one.
Download, open, and inspect each of these capture files so you can see what it looks like for that communication to have occurred over a network.
You can also save your own captures in Wireshark and open them later. Click File > Save to save your captured packets.
If you're trying to inspect something specific, such as the traffic a program sends when phoning home, it helps to close down all other applications using the network so you can narrow down the traffic. Still, you'll likely have a large number of packets to sift through. That's where Wireshark's filters come in.
The most basic way to apply a filter is by typing it into the filter box at the top of the window and clicking Apply (or pressing Enter). For example, type "dns" and you'll see only DNS packets. When you start typing, Wireshark will help you autocomplete your filter.
You can also click Analyze > Display Filters to choose a filter from among the default filters included in Wireshark. From here, you can add your own custom filters and save them to easily access them in the future.
Another interesting thing you can do is right-click a packet and select Follow > TCP Stream.
You'll see the full TCP conversation between the client and the server. You can also click other protocols in the Follow menu to see the full conversations for other protocols, if applicable.
Click a packet to select it and you can dig down to view its details.
You can also create filters from here — just right-click one of the details and use the Apply as Filter submenu to create a filter based on it.
Wireshark is an extremely powerful tool, and this tutorial is just scratching the surface of what you can do with it. Professionals use it to debug network protocol implementations, examine security problems and inspect network protocol internals.
Now, start Wireshark, start capturing network traffic, and perform various tasks on your computer using different applications. You will be surprised by the number of network requests made by your computer. See if you can identify the application that makes each request and the significance of each request.
There is a lot of terminology and knowledge in today's lessons. These aren't necessarily skills you'll apply directly. However, this body of knowledge is essential to understanding how networking works in both ideal and practical senses.

Today, it's up to you to make your own flashcards for studying. You can make them with your pair and share the Anki files. Use them to study, because this material will likely appear on your assessment and in some interviews.