Computer Networks and the Internet

An Overview of Computer Networks and the Internet

This first chapter presents a broad overview of computer networking and the Internet.

After introducing some basic terminology and concepts, we’ll first examine the basic hardware and software components that make up a network. We’ll begin at the network’s edge and look at the end systems and network applications running in the network. We’ll then explore the core of a computer network, examining the links and the switches that transport data, as well as the access networks and physical media that connect end systems to the network core. We’ll learn that the Internet is a network of networks, and we’ll learn how these networks connect with each other.

After having completed this overview of the edge and core of a computer network, we’ll take the broader and more abstract view in the second half of this chapter. We’ll examine delay, loss, and throughput of data in a computer network and provide simple quantitative models for end-to-end throughput and delay: models that take into account transmission, propagation, and queuing delays. We’ll then introduce some of the key architectural principles in computer networking, namely, protocol layering and service models. We’ll also learn that computer networks are vulnerable to many different types of attacks; we’ll survey some of these attacks and consider how computer networks can be made more secure. Finally, we’ll close this chapter with a brief history of computer networking.

What Is the Internet?

In this book, we’ll use the public Internet, a specific computer network, as our principal vehicle for discussing computer networks and their protocols. But what is the Internet?

A Description Based on Hardware and Software

In Internet jargon, all of these devices are called hosts or end systems. End systems are connected together by a network of communication links and packet switches.

When one end system has data to send to another end system, the sending end system segments the data and adds header bytes to each segment. The resulting packages of information, known as packets in the jargon of computer networks, are then sent through the network to the destination end system, where they are reassembled into the original data.
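The segment-and-reassemble idea above can be sketched in a few lines. This is only an illustration: the 4-byte header (sequence number plus payload length) and the names `packetize` and `reassemble` are assumptions made for the example, not the header format of any real Internet protocol.

```python
import struct

# Hypothetical header: 16-bit sequence number, 16-bit payload length.
HEADER = struct.Struct("!HH")


def packetize(data: bytes, max_payload: int) -> list:
    """Split data into packets, each made of a small header plus a payload chunk."""
    packets = []
    for seq, start in enumerate(range(0, len(data), max_payload)):
        chunk = data[start:start + max_payload]
        packets.append(HEADER.pack(seq, len(chunk)) + chunk)
    return packets


def reassemble(packets: list) -> bytes:
    """Rebuild the original data, even if the packets arrive out of order."""
    parsed = []
    for packet in packets:
        seq, length = HEADER.unpack(packet[:HEADER.size])
        parsed.append((seq, packet[HEADER.size:HEADER.size + length]))
    # Sorting by sequence number restores the sender's order.
    return b"".join(payload for _, payload in sorted(parsed))
```

Calling `reassemble(packetize(data, 5))` should return `data` unchanged, whatever order the packets are passed in.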

Packet switches come in many shapes and flavors, but the two most prominent types in today’s Internet are routers and link-layer switches.

Link-layer switches are typically used in access networks, while routers are typically used in the network core.

The sequence of communication links and packet switches traversed by a packet from the sending end system to the receiving end system is known as a route or path through the network.

Packet-switched networks (which transport packets) are in many ways similar to transportation networks of highways, roads, and intersections (which transport vehicles).

In many ways, packets are analogous to trucks, communication links are analogous to highways and roads, packet switches are analogous to intersections, and end systems are analogous to buildings. Just as a truck takes a path through the transportation network, a packet takes a path through a computer network.

End systems access the Internet through Internet Service Providers (ISPs). Each ISP is in itself a network of packet switches and communication links. ISPs provide a variety of types of network access to the end systems. The Internet is all about connecting end systems to each other, so the ISPs that provide access to end systems must also be interconnected.

End systems, packet switches, and other pieces of the Internet run protocols that control the sending and receiving of information within the Internet. The Transmission Control Protocol (TCP) and the Internet Protocol (IP) are two of the most important protocols in the Internet. The IP protocol specifies the format of the packets that are sent and received among routers and end systems. The Internet’s principal protocols are collectively known as TCP/IP.

The IETF standards documents are called requests for comments (RFCs).

A Description Based on Services

We can also describe the Internet from an entirely different angle—namely, as an infrastructure that provides services to applications. These applications include electronic mail, Web surfing, social networks, instant messaging, Voice-over-IP (VoIP), video streaming, distributed games, peer-to-peer (P2P) file sharing, television over the Internet, remote login, and much, much more.

The applications are said to be distributed applications, since they involve multiple end systems that exchange data with each other. Importantly, Internet applications run on end systems—they do not run in the packet switches in the network core. Although packet switches facilitate the exchange of data among end systems, they are not concerned with the application that is the source or sink of data.

Because applications run on end systems, you are going to need to write programs that run on the end systems. How does one program running on one end system instruct the Internet to deliver data to another program running on another end system? End systems attached to the Internet provide an Application Programming Interface (API) that specifies how a program running on one end system asks the Internet infrastructure to deliver data to a specific destination program running on another end system.

This Internet API is a set of rules that the sending program must follow so that the Internet can deliver the data to the destination program.
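On most systems this API is exposed as the socket interface. The sketch below uses Python's standard `socket` module to send one UDP datagram between two sockets on the same machine. The loopback address, the OS-chosen port, and the helper name `send_and_receive` are choices made for this example only.

```python
import socket


def send_and_receive(payload: bytes) -> bytes:
    """Send payload from one UDP socket to another on the loopback interface."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the operating system to pick any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        destination = receiver.getsockname()  # (IP address, port) of the receiving program

        # The sending program names the destination and hands the data to the API.
        sender.sendto(payload, destination)

        data, _source = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()
```

The sending side follows the "rules" by supplying a destination address and port. Everything between the two sockets is handled by the operating system and the network.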

The Internet provides multiple services to its applications. When you develop an Internet application, you too must choose one of the Internet’s services for your application.

The Heart of the Internet: What Is a Protocol?

We have just given two descriptions of the Internet; one in terms of its hardware and software components, the other in terms of an infrastructure for providing services to distributed applications.

What are packet switching and TCP/IP? What are routers? What kinds of communication links are present in the Internet? What is a distributed application? How can a toaster or a weather sensor be attached to the Internet? If you feel a bit overwhelmed by all of this now, don’t worry—the purpose of this book is to introduce you to both the nuts and bolts of the Internet and the principles that govern how and why it works.

Now that we’ve got a bit of a feel for what the Internet is, let’s consider another important buzzword in computer networking: protocol. What is a protocol? What does a protocol do?

It is probably easiest to understand the notion of a computer network protocol by first considering some human analogies, since we humans execute protocols all of the time.

A network protocol is similar to a human protocol, except that the entities exchanging messages and taking actions are hardware or software components of some device. Protocols are running everywhere in the Internet, and consequently much of this book is about computer network protocols.

A protocol defines the format and the order of messages exchanged between two or more communicating entities, as well as the actions taken on the transmission and/or receipt of a message or other event.

Mastering the field of computer networking is equivalent to understanding the what, why, and how of networking protocols.

The Network Edge

In the previous section we presented a high-level overview of the Internet and networking protocols. We are now going to delve a bit more deeply into the components of a computer network (and the Internet, in particular). We begin in this section at the edge of a network and look at the components with which we are most familiar—namely, the computers, smartphones and other devices that we use on a daily basis.

The computers and other devices connected to the Internet are often referred to as end systems. They are referred to as end systems because they sit at the edge of the Internet. End systems are also referred to as hosts because they host (that is, run) application programs such as a Web browser program, a Web server program, an email client program, or an email server program. Hosts are sometimes further divided into two categories: clients and servers.

Access Networks

Access network: the network that physically connects an end system to the first router.

A residence typically obtains DSL Internet access from the same local telephone company (telco) that provides its wired local phone access. Thus, when DSL is used, a customer’s telco is also its ISP. The residential telephone line carries both data and traditional telephone signals simultaneously, which are encoded at different frequencies.

While DSL makes use of the telco’s existing local telephone infrastructure, cable Internet access makes use of the cable television company’s existing cable television infrastructure. One important characteristic of cable Internet access is that it is a shared broadcast medium.

Wide-area wireless access: 3G and LTE, for smartphones.

Physical Media

In the previous subsection, we gave an overview of some of the most important network access technologies in the Internet. As we described these technologies, we also indicated the physical media used. For example, we said that HFC uses a combination of fiber cable and coaxial cable. We said that DSL and Ethernet use copper wire. And we said that mobile access networks use the radio spectrum.

In order to define what is meant by a physical medium, let us reflect on the brief life of a bit. For each transmitter-receiver pair, the bit is sent by propagating electromagnetic waves or optical pulses across a physical medium. Physical media fall into two categories: guided media and unguided media.

Twisted-Pair Copper Wire
Coaxial Cable
Fiber Optics
Terrestrial Radio Channels
Satellite Radio Channels

For the classification of Internet access technologies and physical media, a rough understanding is enough.

The Network Core

Network core—the mesh of packet switches and links that interconnects the Internet’s end systems.

Packet Switching

In a network application, end systems exchange messages with each other. To send a message from a source end system to a destination end system, the source breaks long messages into smaller chunks of data known as packets. Between source and destination, each packet travels through communication links and packet switches.

Most packet switches use store-and-forward transmission at the inputs to the links. Only after the router has received all of the packet’s bits can it begin to transmit (i.e., “forward”) the packet onto the outbound link. Consider a packet of L bits sent over two links of rate R bits/sec, ignoring propagation delay. At time L/R seconds, since the router has just received the entire packet, it can begin to transmit the packet onto the outbound link towards the destination; at time 2L/R, the router has transmitted the entire packet, and the entire packet has been received by the destination. Thus, the total delay is 2L/R. As we will discuss in Section 1.4, routers need to receive, store, and process the entire packet before forwarding.

If an arriving packet needs to be transmitted onto a link but finds the link busy with the transmission of another packet, the arriving packet must wait in the output buffer. Thus, in addition to the store-and-forward delays, packets suffer output buffer queuing delays. Since the amount of buffer space is finite, an arriving packet may find that the buffer is completely full with other packets waiting for transmission. In this case, packet loss will occur: either the arriving packet or one of the already-queued packets will be dropped.

But how does the router determine which link it should forward the packet onto? Packet forwarding is actually done in different ways in different types of computer networks. Here, we briefly describe how it is done in the Internet.
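The 2L/R figure generalizes: with N links in series, each of rate R, a single packet is fully transmitted N times, so the store-and-forward delay is N·L/R. A minimal sketch, assuming equal-rate links and ignoring propagation, processing, and queuing delays (the function name is illustrative):

```python
def store_and_forward_delay(packet_bits: float, rate_bps: float, num_links: int) -> float:
    """End-to-end delay in seconds for one packet crossing num_links equal-rate links.

    Each hop must receive the whole packet before forwarding it, so the
    packet is transmitted once per link: num_links * L / R.
    """
    return num_links * packet_bits / rate_bps
```

For example, with L = 8,000 bits and R = 1 Mbps, `store_and_forward_delay(8000, 1e6, 2)` gives the two-link case 2L/R described above.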

In the Internet, every end system has an address called an IP address. When a source end system wants to send a packet to a destination end system, the source includes the destination’s IP address in the packet’s header. As with postal addresses, this address has a hierarchical structure. When a packet arrives at a router in the network, the router examines a portion of the packet’s destination address and forwards the packet to an adjacent router. More specifically, each router has a forwarding table that maps destination addresses (or portions of the destination addresses) to that router’s outbound links. When a packet arrives at a router, the router examines the address and searches its forwarding table, using this destination address, to find the appropriate outbound link. The router then directs the packet to this outbound link.
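A forwarding-table lookup can be sketched with Python's standard `ipaddress` module. The table entries and link names below are made up for illustration, and the sketch assumes the router picks the most specific (longest) matching prefix, a rule the passage above does not spell out.

```python
import ipaddress

# Hypothetical forwarding table: address prefix -> outbound link name.
FORWARDING_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "link-0",    # default route, matches everything
    ipaddress.ip_network("10.0.0.0/8"): "link-1",
    ipaddress.ip_network("10.1.0.0/16"): "link-2",
}


def lookup(destination: str) -> str:
    """Return the outbound link for the most specific prefix containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [net for net in FORWARDING_TABLE if address in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return FORWARDING_TABLE[best]
```

Here `lookup("10.1.2.3")` matches all three prefixes and selects the /16 entry, which mirrors how a router examines only a portion of the destination address.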

The end-to-end routing process is analogous to a car driver who does not use maps but instead prefers to ask for directions.

We just learned that a router uses a packet’s destination address to index a forwarding table and determine the appropriate outbound link. But this statement begs yet another question: How do forwarding tables get set?

This issue will be studied in depth in Chapter 4. But to whet your appetite here, we’ll note now that the Internet has a number of special routing protocols that are used to automatically set the forwarding tables.

Get your hands dirty by interacting with the Traceroute program.

Circuit Switching

There are two fundamental approaches to moving data through a network of links and switches: circuit switching and packet switching. Having covered packet-switched networks in the previous subsection, we now turn our attention to circuit-switched networks.

In circuit-switched networks, the resources needed along a path (buffers, link transmission rate) to provide for communication between the end systems are reserved for the duration of the communication session between the end systems.

For the restaurant that does not require reservations, we don’t need to bother to reserve a table. But when we arrive at the restaurant, we may have to wait for a table before we can be seated.

Traditional telephone networks are examples of circuit-switched networks.

This is a bona fide connection for which the switches on the path between the sender and receiver maintain connection state for that connection. In the jargon of telephony, this connection is called a circuit.

Since a given transmission rate has been reserved for this sender-to-receiver connection, the sender can transfer the data to the receiver at the guaranteed constant rate.

When two hosts want to communicate, the network establishes a dedicated end-to-end connection between the two hosts.

A circuit in a link is implemented with either frequency-division multiplexing (FDM) or time-division multiplexing (TDM). Proponents of packet switching have always argued that circuit switching is wasteful because the dedicated circuits are idle during silent periods.
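To make the TDM case concrete, the sketch below computes how long a file transfer takes over one circuit. It assumes the link's rate is divided evenly among the time slots and that the circuit must be set up before any data flows. The function name and the numbers in the usage note are illustrative.

```python
def tdm_transfer_time(file_bits: float, link_rate_bps: float,
                      num_slots: int, setup_seconds: float) -> float:
    """Seconds needed to send file_bits over one TDM circuit.

    With num_slots slots per frame, one circuit gets link_rate_bps / num_slots,
    no matter how many of the other slots are idle.
    """
    circuit_rate = link_rate_bps / num_slots
    return setup_seconds + file_bits / circuit_rate
```

For instance, a 640,000-bit file on a 1.536 Mbps link with 24 slots and a 0.5-second setup gets a 64 kbps circuit, so `tdm_transfer_time(640_000, 1_536_000, 24, 0.5)` comes to 10.5 seconds. The transfer is no faster when the other 23 slots are unused, which is the waste that packet-switching proponents point to.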

Packet Switching Versus Circuit Switching

Critics of packet switching have often argued that packet switching is not suitable for real-time services.

Proponents of packet switching argue that (1) it offers better sharing of transmission capacity than circuit switching and (2) it is simpler, more efficient, and less costly to implement than circuit switching.
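The sharing argument can be checked with a small probability calculation. The sketch assumes users are independent and each is active with the same probability, and it asks how likely it is that more users are active at once than the link can serve. The scenario numbers in the usage note are assumptions for illustration.

```python
from math import comb


def prob_more_than(k: int, n: int, p: float) -> float:
    """Probability that more than k of n independent users are active at once.

    Each user is active with probability p, so the number of active users
    follows a binomial distribution.
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1, n + 1))
```

Suppose a link can carry 10 simultaneously active users, and each user is active 10 percent of the time. Circuit switching could admit only 10 users. With packet switching, `prob_more_than(10, 35, 0.1)` gives the chance that 35 users overload the link, which turns out to be well under one percent.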

Although packet switching and circuit switching are both prevalent in today’s telecommunication networks, the trend has certainly been in the direction of packet switching. Even many of today’s circuit-switched telephone networks are slowly migrating toward packet switching.

A Network of Networks

We saw earlier that end systems (PCs, smartphones, Web servers, mail servers, and so on) connect into the Internet via an access ISP. The access ISP can provide either wired or wireless connectivity, using an array of access technologies including DSL, cable, FTTH, Wi-Fi, and cellular. Note that the access ISP does not have to be a telco or a cable company; instead it can be, for example, a university. And, the access ISPs themselves must be interconnected.

This is done by creating a network of networks—understanding this phrase is the key to understanding the Internet.

Over the years, the network of networks that forms the Internet has evolved into a very complex structure. Much of this evolution is driven by economics and national policy, rather than by performance considerations.

Our first network structure, Network Structure 1, interconnects all of the access ISPs with a single global transit ISP.

Now if some company builds and operates a global transit ISP that is profitable, then it is natural for other companies to build their own global transit ISPs and compete with the original global transit ISP.

For example, in China, there are access ISPs in each city, which connect to provincial ISPs, which in turn connect to national ISPs, which finally connect to tier-1 ISPs.

In recent years, major content providers have also created their own networks and connect directly into lower-tier ISPs where possible.

Delay, Loss, and Throughput in Packet-Switched Networks

Instead, computer networks necessarily constrain throughput (the amount of data per second that can be transferred) between end systems, introduce delays between end systems, and can actually lose packets.

The most important of these delays are the nodal processing delay, queuing delay, transmission delay, and propagation delay; together, these delays accumulate to give a total nodal delay.

The time required to examine the packet’s header and determine where to direct the packet is part of the processing delay.

At the queue, the packet experiences a queuing delay as it waits to be transmitted onto the link.

The transmission delay is L/R. This is the amount of time required to push (that is, transmit) all of the packet’s bits into the link. Transmission delays are typically on the order of microseconds to milliseconds in practice.

The time required to propagate from the beginning of the link to router B is the propagation delay.

The propagation speed depends on the physical medium of the link (that is, fiber optics, twisted-pair copper wire, and so on) and is in the range of 2 × 10^8 meters/sec to 3 × 10^8 meters/sec, which is equal to, or a little less than, the speed of light.

Comparing Transmission and Propagation Delay

Newcomers to the field of computer networking sometimes have difficulty understanding the difference between transmission delay and propagation delay. The difference is subtle but important. The transmission delay is the amount of time required for the router to push out the packet; it is a function of the packet’s length and the transmission rate of the link, but has nothing to do with the distance between the two routers. The propagation delay, on the other hand, is the time it takes a bit to propagate from one router to the next; it is a function of the distance between the two routers, but has nothing to do with the packet’s length or the transmission rate of the link.
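The distinction is easy to see in code: transmission delay takes the packet length and link rate as inputs, while propagation delay takes the distance and propagation speed. The sketch below also sums the four components into a total nodal delay. The function names and the default propagation speed of 2.5 × 10^8 meters/sec are assumptions for illustration.

```python
def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Time to push all of the packet's bits onto the link: L / R."""
    return packet_bits / rate_bps


def propagation_delay(distance_m: float, speed_mps: float = 2.5e8) -> float:
    """Time for one bit to travel the length of the link: d / s."""
    return distance_m / speed_mps


def nodal_delay(d_proc: float, d_queue: float, d_trans: float, d_prop: float) -> float:
    """Total nodal delay as the sum of its four components."""
    return d_proc + d_queue + d_trans + d_prop
```

Doubling the packet length doubles `transmission_delay` but leaves `propagation_delay` untouched; doubling the distance does the opposite.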

Unlike the other three delays (namely, d_proc, d_trans, and d_prop), the queuing delay can vary from packet to packet.

Therefore, one of the golden rules in traffic engineering is: Design your system so that the traffic intensity is no greater than 1.
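Traffic intensity is usually defined as La/R, where L is the packet length in bits, a is the average packet arrival rate in packets/sec, and R is the link's transmission rate in bits/sec. A minimal sketch of that ratio, with illustrative names:

```python
def traffic_intensity(packet_bits: float, arrivals_per_sec: float, rate_bps: float) -> float:
    """Ratio of the average bit arrival rate (L * a) to the transmission rate R."""
    return packet_bits * arrivals_per_sec / rate_bps


def is_sustainable(packet_bits: float, arrivals_per_sec: float, rate_bps: float) -> bool:
    """True when bits do not arrive faster, on average, than the link can send them."""
    return traffic_intensity(packet_bits, arrivals_per_sec, rate_bps) <= 1.0
```

When the ratio exceeds 1, bits arrive faster than they can be transmitted, and the queue grows without bound.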

Instead, a packet can arrive to find a full queue. With no place to store such a packet, a router will drop that packet; that is, the packet will be lost.

Therefore, performance at a node is often measured not only in terms of delay, but also in terms of the probability of packet loss.
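Loss at a finite buffer can be illustrated with a tiny deterministic simulation. The model is deliberately simplified and is an assumption of this sketch: time advances in ticks, the link transmits one packet per tick, and an arriving packet that finds the buffer full is dropped.

```python
def simulate_drop_tail(arrivals_per_tick: list, buffer_size: int) -> tuple:
    """Return (delivered, dropped) for a queue that transmits one packet per tick."""
    queued = delivered = dropped = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            if queued < buffer_size:
                queued += 1
            else:
                dropped += 1  # buffer full: the arriving packet is lost
        if queued > 0:
            queued -= 1       # the link sends one queued packet this tick
            delivered += 1
    delivered += queued       # drain whatever is still queued once arrivals stop
    return delivered, dropped
```

Dividing the dropped count by the total number of arrivals gives an empirical loss fraction for the chosen arrival pattern and buffer size.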

A lost packet may be retransmitted on an end-to-end basis in order to ensure that all data are eventually transferred from source to destination.

In addition to delay and packet loss, another critical performance measure in computer networks is end-to-end throughput.

Thus, for this simple two-link network, the throughput is min{R_c, R_s}, that is, it is the transmission rate of the bottleneck link.

More generally the throughput depends not only on the transmission rates of the links along the path, but also on the intervening traffic. In particular, a link with a high transmission rate may nonetheless be the bottleneck link for a file transfer if many other data flows are also passing through that link.
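Both points can be sketched together: the throughput of a path is the minimum of its link rates, and a fast link shared by many flows may offer each flow less than a slower dedicated link. The equal-split sharing model and the function names are assumptions of this sketch.

```python
def path_throughput(link_rates_bps: list) -> float:
    """Throughput of a path with no competing traffic: the bottleneck link's rate."""
    return min(link_rates_bps)


def transfer_time(file_bits: float, link_rates_bps: list) -> float:
    """Approximate seconds to move file_bits across the path: F / min(rates)."""
    return file_bits / path_throughput(link_rates_bps)


def shared_link_rate(link_rate_bps: float, num_flows: int) -> float:
    """Per-flow rate on a link, assuming its capacity is split equally among flows."""
    return link_rate_bps / num_flows
```

For example, a 32-million-bit file over links of 2 Mbps and 1 Mbps takes about `transfer_time(32e6, [2e6, 1e6])` = 32 seconds. If a 5 Mbps core link is shared by 10 flows, each flow sees `shared_link_rate(5e6, 10)` = 500 kbps, so the core link becomes the bottleneck even though the access links are slower than 5 Mbps.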

Protocol Layers and Their Service Models

From our discussion thus far, it is apparent that the Internet is an extremely complicated system. We have seen that there are many pieces to the Internet: numerous applications and protocols, various types of end systems, packet switches, and various types of link-level media. Given this enormous complexity, is there any hope of organizing a network architecture, or at least our discussion of network architecture? Fortunately, the answer to both questions is yes.

Layered Architecture

We can see some analogies here with computer networking: You are being shipped from source to destination by the airline; a packet is shipped from source host to destination host in the Internet. Each layer provides its service by (1) performing certain actions within that layer (for example, at the gate layer, loading and unloading people from an airplane) and by (2) using the services of the layer directly below it (for example, in the gate layer, using the runway-to-runway passenger transfer service of the takeoff/landing layer).

For large and complex systems that are constantly being updated, the ability to change the implementation of a service without affecting other components of the system is another important advantage of layering.

To provide structure to the design of network protocols, network designers organize protocols—and the network hardware and software that implement the protocols— in layers. Each protocol belongs to one of the layers, just as each function in the airline architecture in Figure 1.22 belonged to a layer.

We are again interested in the services that a layer offers to the layer above—the so-called service model of a layer.

A protocol layer can be implemented in software, in hardware, or in a combination of the two.

Application-layer protocols—such as HTTP and SMTP—are almost always implemented in software in the end systems; so are transport-layer protocols.

Because the physical layer and data link layers are responsible for handling communication over a specific link, they are typically implemented in a network interface card (for example, Ethernet or WiFi interface cards) associated with a given link.

The network layer is often a mixed implementation of hardware and software.

Protocol layering has conceptual and structural advantages.

One potential drawback of layering is that one layer may duplicate lower-layer functionality.

A second potential drawback is that functionality at one layer may need information (for example, a time-stamp value) that is present only in another layer; this violates the goal of separation of layers.

When taken together, the protocols of the various layers are called the protocol stack.

The application layer is where network applications and their application-layer protocols reside. An application-layer protocol is distributed over multiple end systems, with the application in one end system using the protocol to exchange packets of information with the application in another end system. We’ll refer to this packet of information at the application layer as a message.

The Internet’s transport layer transports application-layer messages between application endpoints. In this book, we’ll refer to a transport-layer packet as a segment.

The Internet’s network layer is responsible for moving network-layer packets known as datagrams from one host to another. The network layer then provides the service of delivering the segment to the transport layer in the destination host. Although the network layer contains both the IP protocol and numerous routing protocols, it is often simply referred to as the IP layer, reflecting the fact that IP is the glue that binds the Internet together.

To move a packet from one node (host or router) to the next node in the route, the network layer relies on the services of the link layer. The network layer will receive a different service from each of the different link-layer protocols. In this book, we’ll refer to the link-layer packets as frames.

While the job of the link layer is to move entire frames from one network element to an adjacent network element, the job of the physical layer is to move the individual bits within the frame from one node to the next. The protocols in this layer are again link dependent and further depend on the actual transmission medium of the link. In each case, a bit is moved across the link in a different way.

Let’s consider the two additional layers present in the OSI reference model: the presentation layer and the session layer. These layers are absent from the Internet protocol stack. Are the services they provide unimportant? What if an application needs one of these services? The Internet’s answer to both of these questions is the same: it’s up to the application developer.

Figure 1.24 shows the physical path that data takes down a sending end system’s protocol stack, up and down the protocol stacks of an intervening link-layer switch and router, and then up the protocol stack at the receiving end system.

Routers and link-layer switches do not implement all of the layers in the protocol stack; they typically implement only the bottom layers.

Figure 1.24 also illustrates the important concept of encapsulation. Thus, we see that at each layer, a packet has two types of fields: header fields and a payload field. The payload is typically a packet from the layer above.

The process of encapsulation can be more complex than that described above. For example, a large message may be divided into multiple transport-layer segments (which might themselves each be divided into multiple network-layer datagrams). At the receiving end, such a segment must then be reconstructed from its constituent datagrams.
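Encapsulation can be sketched as each layer prepending its own header to the packet handed down from the layer above, with the receiver stripping the headers in reverse order. The bracketed byte strings below are placeholders and not real protocol headers.

```python
# Placeholder headers, listed in the order they are added on the way down the stack.
LAYER_HEADERS = [b"[transport]", b"[network]", b"[link]"]


def send_down_stack(message: bytes) -> bytes:
    """Wrap an application message into a segment, then a datagram, then a frame."""
    packet = message
    for header in LAYER_HEADERS:
        packet = header + packet  # the packet from above becomes this layer's payload
    return packet


def receive_up_stack(frame: bytes) -> bytes:
    """Strip the link, network, and transport headers to recover the message."""
    packet = frame
    for header in reversed(LAYER_HEADERS):
        if not packet.startswith(header):
            raise ValueError("unexpected header")
        packet = packet[len(header):]
    return packet
```

The outermost header belongs to the lowest layer, which is why the frame produced by `send_down_stack(b"hi")` begins with the link header.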

Networks Under Attack

The field of network security is about how the bad guys can attack computer networks and about how we, soon-to-be experts in computer networking, can defend networks against those attacks, or better yet, design new architectures that are immune to such attacks in the first place.

So we begin here by simply asking, what can go wrong? How are computer networks vulnerable? What are some of the more prevalent types of attacks today?

Much of the malware out there today is self-replicating: once it infects one host, from that host it seeks entry into other hosts over the Internet, and from the newly infected hosts, it seeks entry into yet more hosts.

Viruses are malware that require some form of user interaction to infect the user’s device.

Worms are malware that can enter a device without any explicit user interaction.

Today, malware is pervasive and costly to defend against.

Another broad class of security threats is known as denial-of-service (DoS) attacks.

The bad guys can sniff packets

A passive receiver that records a copy of every packet that flies by is called a packet sniffer.

Because packet sniffers are passive—that is, they do not inject packets into the channel—they are difficult to detect.

So, when we send packets into a wireless channel, we must accept the possibility that some bad guy may be recording copies of our packets.

The ability to inject packets into the Internet with a false source address is known as IP spoofing, and is but one of many ways in which one user can masquerade as another user.

To solve this problem, we will need end-point authentication, that is, a mechanism that will allow us to determine with certainty if a message originates from where we think it does.

The Internet was originally designed to be that way, based on the model of “a group of mutually trusting users attached to a transparent network” [Blumenthal 2001], a model in which (by definition) there is no need for security.

We now have many security-related challenges before us as we progress through this book: We should seek defenses against sniffing, end-point masquerading, man-in-the-middle attacks, DDoS attacks, malware, and more. We should keep in mind that communication among mutually trusted users is the exception rather than the rule. Welcome to the world of modern computer networking!

History of Computer Networking and the Internet

The Development of Packet Switching: 1961–1972

With an end-to-end protocol available, applications could now be written. Ray Tomlinson wrote the first email program in 1972.

Proprietary Networks and Internetworking: 1972–1980

The three key Internet protocols that we see today—TCP, UDP, and IP—were conceptually in place by the end of the 1970s. Well before the PC revolution and the explosion of networks, Metcalfe and Boggs were laying the foundation for today’s PC LANs.

A Proliferation of Networks: 1980–1990

The Internet Explosion: The 1990s

The 1990s were ushered in with a number of events that symbolized the continued evolution and the soon-to-arrive commercialization of the Internet.

By the end of the millennium the Internet was supporting hundreds of popular applications, including four killer applications:

E-mail, including attachments and Web-accessible e-mail
The Web, including Web browsing and Internet commerce
Instant messaging, with contact lists
Peer-to-peer file sharing of MP3s, pioneered by Napster

The New Millennium

Online social networks
Own extensive private networks
Cloud

Summary

In this chapter we’ve covered a tremendous amount of material! We’ve looked at the various pieces of hardware and software that make up the Internet in particular and computer networks in general. We started at the edge of the network, looking at end systems and applications, and at the transport service provided to the applications running on the end systems. We also looked at the link-layer technologies and physical media typically found in the access network. We then dove deeper inside the network, into the network core, identifying packet switching and circuit switching as the two basic approaches for transporting data through a telecommunication network, and we examined the strengths and weaknesses of each approach. We also examined the structure of the global Internet, learning that the Internet is a network of networks. We saw that the Internet’s hierarchical structure, consisting of higher- and lower-tier ISPs, has allowed it to scale to include thousands of networks.

In the second part of this introductory chapter, we examined several topics central to the field of computer networking. We first examined the causes of delay, throughput, and packet loss in a packet-switched network. We developed simple quantitative models for transmission, propagation, and queuing delays as well as for throughput; we’ll make extensive use of these delay models in the homework problems throughout this book. Next we examined protocol layering and service models, key architectural principles in networking that we will also refer back to throughout this book. We also surveyed some of the more prevalent security attacks in the Internet today. We finished our introduction to networking with a brief history of computer networking.

The first chapter in itself constitutes a mini-course in computer networking.

Before starting any trip, you should always glance at a road map in order to become familiar with the major roads and junctures that lie ahead.
