Each device has a globally distinct IP address, which is a 32-bit number. Usually an IP address is represented as a sequence of four decimal numbers, each in the range 0 to 255. For example, when I checked the IP address for my laptop just now, it began with 141.211. Any IP address beginning with 141.211 is for a device at the University of Michigan. When I take my laptop home and connect to a network there, my laptop gets a different IP address that it uses there.

Data is chopped up into reasonably sized packets (up to 65,535 bytes, but usually much smaller). Each data packet has a header that includes the destination IP address. Each packet is routed independently, passed from one computing device to another until it reaches its destination. The computing devices that do that packet forwarding are called routers. Each router keeps an address table that says, for each destination address, which of its neighbors it should pass a packet on to. The routers are constantly talking to each other, passing information about how they should update their routing tables.

The system was designed to be resistant to any local damage. If some of the routers stop working, the rest of the routers talk to each other and start routing packets in a different way, so that packets still reach their intended destination if there is some path to get there. It is this technical capability that has spawned metaphoric quotes like this one from John Gilmore: “The Net interprets censorship as damage and routes around it.”

At the destination, the packets are reassembled into the original data message.
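To make the "32-bit number written as four decimal numbers" idea concrete, here is a minimal sketch that converts the dotted form to the underlying integer. The helper name dotted_to_int and the example address are assumptions chosen for illustration; the standard library's ipaddress module is used only as a cross-check.

```python
import ipaddress

def dotted_to_int(address):
    """Convert a dotted-decimal IPv4 string to its 32-bit integer value."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four numbers in the range 0 to 255")
    value = 0
    for p in parts:
        value = value * 256 + p  # shift left by 8 bits, then add the next byte
    return value

example = "141.211.0.1"  # hypothetical address, used only for illustration
as_int = dotted_to_int(example)
print(as_int)
# Cross-check against the standard library's own conversion
print(as_int == int(ipaddress.IPv4Address(example)))
```

Each of the four numbers fills exactly one byte, which is why each is limited to the range 0 to 255.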

24.3. Anatomy of URLs

A URL is used by a browser or other program to specify what server to connect to and what page to ask for. Like other things that will be interpreted by computer programs, URLs have a very specific formal structure. If you put a colon in the wrong place, the URL won’t work correctly. The overall structure of a URL is:

<scheme>://<host>:<port>/<path>

Usually, the scheme will be http or https. The s in https stands for “secure”. When you use https, all of the communication between the two devices is encrypted. Any device that intercepts some of the packets along the way will be unable to decrypt the contents and figure out what the data was. Other schemes that you will sometimes see include ftp (for file transfer) and mailto (for email addresses).

The host will usually be a domain name, like si.umich.edu or github.com or google.com. When the URL specifies a domain name, the first thing the computer program does is look up the domain name to find the 32-bit IP address. The IP address that github.com maps to right now could change if, for example, github moved its servers to a different location or contracted with a different Internet provider. Lookups use something called the Domain Name System, or DNS for short. Changes to the mapping from domain names to IP addresses can take a little while to propagate: if github.com announces a new IP address associated with its domain, it might take up to 24 hours for some computers to start translating github.com to the new IP address.

Alternatively, the host can be an IP address directly. This is less common, because IP addresses are harder to remember and because a URL containing a domain name will continue to work even if the remote server keeps its domain name but moves to a different IP address.

The :port is optional. If it is omitted, the default port number depends on the scheme: 80 for http and 443 for https. The port number is used on the receiving end to decide which computer program should get the data that has been received. We probably will not encounter any URLs that include the : and a port number in this course.

The /path is also optional. It specifies something about which page, or more generally which contents, are being requested.
For example, consider the URL https://github.com/presnick/runestone:

https:// says to use the secure http protocol.

github.com says to connect to the server at github.com, at whatever IP address that domain name currently maps to. The connection will be made on the default port, which is 443 for https.

/presnick/runestone says to ask the remote server for the page presnick/runestone. It is up to the remote server to decide how to map that to the contents of a file it has access to, or to some content that it generates on the fly.

The URL http://blueserver.com/path?k=val is another example that we can consider. The path here is a bit different from https://github.com/presnick/runestone because it includes what are called “query parameters”, the information after the ?.
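The two example URLs can be taken apart programmatically. The following is a small sketch using the standard library's urllib.parse; the helper name describe_url and the DEFAULT_PORTS table are assumptions introduced here for illustration, not something the course provides.

```python
from urllib.parse import urlsplit, parse_qs

# Default ports for the two most common schemes
DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url):
    """Break a URL into scheme, host, port, path, and query parameters."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # parts.port is None when the URL has no explicit :port
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path,
        # parse_qs maps each query key to a list of its values
        "query": parse_qs(parts.query),
    }

print(describe_url("https://github.com/presnick/runestone"))
print(describe_url("http://blueserver.com/path?k=val"))
```

Note that the query parameters come back as a dictionary whose values are lists, because a key may legally appear more than once after the ?.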
import requests

d = {'q': '"violins and guitars"', 'tbm': 'isch'}
results = requests.get("https://google.com/search", params=d)
print(results.url)
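To see how a dictionary of query parameters turns into the text after the ?, here is a sketch that builds the URL with the standard library only, without sending any request. The helper name build_url is an assumption made for this illustration; the requests module does its own encoding, which may differ in small details.

```python
from urllib.parse import urlencode

def build_url(baseurl, params):
    """Append URL-encoded query parameters to a base URL without sending a request."""
    if not params:
        return baseurl
    # urlencode joins key=value pairs with & and escapes special characters
    return baseurl + "?" + urlencode(params)

print(build_url("https://google.com/search", {'q': '"violins and guitars"', 'tbm': 'isch'}))
print(build_url("https://api.datamuse.com/words", {'rel_rhy': 'funny', 'max': '3'}))
```

Characters such as spaces and quotation marks are not allowed to appear literally in a query string, which is why passing a dictionary and letting a library do the encoding is less error-prone than typing the URL by hand.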


import requests

# The query parameters can be written directly into the URL...
page = requests.get("https://api.datamuse.com/words?rel_rhy=funny")

# ...or passed as a dictionary, which requests encodes into the URL for you.
kval_pairs = {'rel_rhy': 'funny'}
page = requests.get("https://api.datamuse.com/words", params=kval_pairs)
print(page.text[:150]) # print the first 150 characters
print(page.url) # print the url that was fetched


# import statements for necessary Python modules
import requests

def get_rhymes(word):
    baseurl = "https://api.datamuse.com/words"
    params_diction = {} # Set up an empty dictionary for query parameters
    params_diction["rel_rhy"] = word
    params_diction["max"] = "3" # get at most 3 results
    resp = requests.get(baseurl, params=params_diction)
    # resp.json() is a python object (a list of dictionaries in this case)
    word_ds = resp.json()
    # return the top three words
    return [d['word'] for d in word_ds]

print(get_rhymes("funny"))


['money', 'honey', 'sunny']

import requests

def requestURL(baseurl, params = {}):
    # This function accepts a URL path and a params diction as inputs.
    # It prepares a GET request from those inputs without sending it,
    # and returns the full URL of the data you want to get.
    req = requests.Request(method = 'GET', url = baseurl, params = params)
    prepped = req.prepare()
    return prepped.url

print(requestURL(some_base_url, some_params_dictionary))

Fortunately, the response object returned by requests.get() has the .url attribute, which will help you with debugging. It’s a good practice during program development to have your program print it out. This is easier than calling requestURL() but is only available to you if requests.get() succeeds in returning a Response object.

import requests

dest_url = ...
d = ...
resp = requests.get(dest_url, params = d)
print(resp.url)
print(resp.text[:200])

import requests_with_caching
# requests.get has to send a new request every time it is called, so this module
# stores responses in a cache and reuses them (useful while you are first writing your code)

# it's not found in the permanent cache
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=happy", permanent_cache_file="datamuse_cache.txt")
print(res.text[:100])

# this time it will be found in the temporary cache
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=happy", permanent_cache_file="datamuse_cache.txt")

# This one is in the permanent cache.
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=funny", permanent_cache_file="datamuse_cache.txt")

new; adding to cache
[{"word":"nappy","score":703,"numSyllables":2},{"word":"snappy","score":698,"numSyllables":2},{"word
found in page-specific cache
found in permanent_cache

import requests
import json

PERMANENT_CACHE_FNAME = "permanent_cache.txt"
TEMP_CACHE_FNAME = "this_page_cache.txt"

def _write_to_file(cache, fname):
    with open(fname, 'w') as outfile:
        outfile.write(json.dumps(cache, indent=2))

def _read_from_file(fname):
    try:
        with open(fname, 'r') as infile:
            res = infile.read()
            return json.loads(res)
    except (OSError, ValueError):
        # missing, unreadable, or malformed cache file: start with an empty cache
        return {}

def add_to_cache(cache_file, cache_key, cache_value):
    temp_cache = _read_from_file(cache_file)
    temp_cache[cache_key] = cache_value
    _write_to_file(temp_cache, cache_file)

def clear_cache(cache_file=TEMP_CACHE_FNAME):
    _write_to_file({}, cache_file)

def make_cache_key(baseurl, params_d, private_keys=["api_key"]):
    """Makes a long string representing the query.
    Alphabetize the keys from the params dictionary so we get the same order each time.
    Omit keys with private info."""
    alphabetized_keys = sorted(params_d.keys())
    res = []
    for k in alphabetized_keys:
        if k not in private_keys:
            res.append("{}-{}".format(k, params_d[k]))
    return baseurl + "".join(res)

def _full_url(baseurl, params):
    # requests.requestURL is not part of the standard requests library, so this
    # helper prepares a Request (without sending it) to get the URL that would be fetched
    return requests.Request(method='GET', url=baseurl, params=params).prepare().url

def _cached_response(text, url):
    # Assumption: the standard requests.Response takes no constructor arguments,
    # so this sketch fills one in by hand (it sets the private _content attribute)
    resp = requests.Response()
    resp._content = text.encode('utf-8')
    resp.encoding = 'utf-8'
    resp.status_code = 200
    resp.url = url
    return resp

def get(baseurl, params={}, private_keys_to_ignore=["api_key"], permanent_cache_file=PERMANENT_CACHE_FNAME, temp_cache_file=TEMP_CACHE_FNAME):
    full_url = _full_url(baseurl, params)
    cache_key = make_cache_key(baseurl, params, private_keys_to_ignore)
    # Load the permanent and page-specific caches from files
    permanent_cache = _read_from_file(permanent_cache_file)
    temp_cache = _read_from_file(temp_cache_file)
    if cache_key in temp_cache:
        print("found in temp_cache")
        # make a Response object containing text from the cache, and the full_url that would have been fetched
        return _cached_response(temp_cache[cache_key], full_url)
    elif cache_key in permanent_cache:
        print("found in permanent_cache")
        # make a Response object containing text from the cache, and the full_url that would have been fetched
        return _cached_response(permanent_cache[cache_key], full_url)
    else:
        print("new; adding to cache")
        # actually request it
        resp = requests.get(baseurl, params=params)
        # save it
        add_to_cache(temp_cache_file, cache_key, resp.text)
        return resp
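The check-the-cache-then-fetch logic in get() can be tried out without any network access or cache files. The sketch below keeps the cache in an ordinary dictionary and takes the fetching function as an argument; cached_get and fake_fetch are hypothetical names introduced only for this illustration, and make_cache_key is a condensed version of the function above.

```python
def make_cache_key(baseurl, params_d, private_keys=("api_key",)):
    """Build a repeatable key from the base URL and the non-private parameters."""
    parts = ["{}-{}".format(k, params_d[k]) for k in sorted(params_d) if k not in private_keys]
    return baseurl + "".join(parts)

def cached_get(baseurl, params, cache, fetch):
    """Return the cached text for this query, calling fetch() only on a cache miss."""
    key = make_cache_key(baseurl, params)
    if key not in cache:
        cache[key] = fetch(baseurl, params)
    return cache[key]

calls = []  # records every time the stand-in fetcher is actually invoked

def fake_fetch(baseurl, params):
    # Hypothetical stand-in for a network request, so the sketch runs offline
    calls.append((baseurl, dict(params)))
    return "response #{}".format(len(calls))

cache = {}
first = cached_get("https://api.datamuse.com/words", {"rel_rhy": "happy", "api_key": "secret"}, cache, fake_fetch)
# Same query with the keys in a different order and a different private key
second = cached_get("https://api.datamuse.com/words", {"api_key": "other", "rel_rhy": "happy"}, cache, fake_fetch)
print(first, second, len(calls))
print(list(cache))
```

Because the keys are sorted and api_key is left out, both calls produce the same cache key, so the second call is answered from the cache and the fetcher runs only once.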
import requests_with_caching
import json

parameters = {"term": "Ann Arbor", "entity": "podcast"}
iTunes_response = requests_with_caching.get("https://itunes.apple.com/search", params = parameters, permanent_cache_file="itunes_cache.txt")

py_data = json.loads(iTunes_response.text)

import requests_with_caching
import json

parameters = {"term": "Ann Arbor", "entity": "podcast"}
iTunes_response = requests_with_caching.get("https://itunes.apple.com/search", params = parameters, permanent_cache_file="itunes_cache.txt")

py_data = json.loads(iTunes_response.text)
for r in py_data['results']:
    print(r['trackName'])

found in permanent_cache Ann Arbor Stories | Ann Arbor District Library Vineyard Church of Ann Arbor Sermon Podcast Harvest Mission Community Church (Ann Arbor) Sermons Grace Bible Church Ann Arbor Grace Ann Arbor Church Sermons from First Pres Antioch Ann Arbor Blue Ocean Faith Ann Arbor Sunday Sermons It’s Hot In Here Radiant Church - Ann Arbor: Sermons Calvary Sunday Messages Fellow Youths | Ann Arbor District Library Behind The Marquee | Ann Arbor District Library Ann Arbor SPARK CEO Podcast Bethel AME - Ann Arbor Sermons – NewLifeA2.org Ann Arbor West Side UMC Sermons Martin Bandyke Under Covers | Ann Arbor District Library Grace Ann Arbor Podcast Mosaic Church of Ann Arbor A2 City News Presenting Alfred Hitchcock Presents | Ann Arbor District Library Redeemer Ann Arbor Zion Lutheran Ann Arbor Living Writers 2|42 Community Church - Ann Arbor
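The loop above relies on the shape of the decoded data: a dictionary with a 'results' key holding a list of dictionaries, each with a 'trackName' key. Here is an offline sketch of that same traversal; the sample text is hand-written for illustration (a real response carries many more fields per result), and the variable names are assumptions.

```python
import json

# Hand-written sample shaped like the response above; not real API output
sample_text = '{"results": [{"trackName": "A2 City News"}, {"trackName": "Living Writers"}]}'

py_data = json.loads(sample_text)  # str -> nested Python dictionaries and lists
track_names = [r["trackName"] for r in py_data["results"]]
for name in track_names:
    print(name)
```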


# import statements
import requests_with_caching
import json
import webbrowser

# apply for a flickr authentication key at http://www.flickr.com/services/apps/create/apply/?
# paste the key (not the secret) as the value of the variable flickr_key
flickr_key = 'yourkeyhere'

def get_flickr_data(tags_string):
    baseurl = "https://api.flickr.com/services/rest/"
    params_diction = {}
    params_diction["api_key"] = flickr_key # from the above global variable
    params_diction["tags"] = tags_string # must be a comma separated string to work correctly
    params_diction["tag_mode"] = "all"
    params_diction["method"] = "flickr.photos.search"
    params_diction["per_page"] = 5
    params_diction["media"] = "photos"
    params_diction["format"] = "json"
    params_diction["nojsoncallback"] = 1
    flickr_resp = requests_with_caching.get(baseurl, params = params_diction, permanent_cache_file="flickr_cache.txt")
    # Useful for debugging: print the url!
    print(flickr_resp.url) # Paste the result into the browser to check it out...
    return flickr_resp.json()

result_river_mts = get_flickr_data("river,mountains")

# Some code to open up a few photos that are tagged with the mountains and river tags...
photos = result_river_mts['photos']['photo']
for photo in photos:
    owner = photo['owner']
    photo_id = photo['id']
    url = 'https://www.flickr.com/photos/{}/{}'.format(owner, photo_id)
    print(url)

found in permanent_cache
https://api.flickr.com/services/rest/?api_key=yourkeyhere&tags=river%2Cmountains&tag_mode=all&method=flickr.photos.search&per_page=5&media=photos&format=json&nojsoncallback=1
https://www.flickr.com/photos/45934971@N07/44858440865
https://www.flickr.com/photos/145056248@N07/43953569330
https://www.flickr.com/photos/145056248@N07/43953448610
https://www.flickr.com/photos/131540074@N08/44857602655
https://www.flickr.com/photos/145056248@N07/44857423045
