woneuy01 opened 4 years ago
24.3. Anatomy of URLs

A URL is used by a browser or other program to specify what server to connect to and what page to ask for. Like other things that will be interpreted by computer programs, URLs have a very specific formal structure: if you put a colon in the wrong place, the URL won't work correctly. The overall structure of a URL is a scheme (such as http or https), followed by ://, then the host name, an optional port number, the path, and an optional query string that starts with a ? character.
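For illustration (urllib.parse is from the Python standard library and is not part of the course reading), you can split a URL into those pieces:

from urllib.parse import urlparse

parts = urlparse("https://api.datamuse.com/words?rel_rhy=funny&max=3")
print(parts.scheme)   # https
print(parts.netloc)   # api.datamuse.com  (the host)
print(parts.path)     # /words
print(parts.query)    # rel_rhy=funny&max=3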
When you pass a dictionary as the params argument, requests.get() encodes it into the query string of the URL for you:

import requests

d = {'q': '"violins and guitars"', 'tbm': 'isch'}
results = requests.get("https://google.com/search", params=d)
print(results.url)
https://www.google.com/search?q=%22violins+and+guitars%22&tbm=isch
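The %22 and + in the printed URL come from URL encoding. As a quick check (again using the standard library's urllib.parse, which is not part of the reading), urlencode applies the same encoding to the dictionary:

from urllib.parse import urlencode

print(urlencode({'q': '"violins and guitars"', 'tbm': 'isch'}))
# q=%22violins+and+guitars%22&tbm=isch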
import requests
kval_pairs = {'rel_rhy': 'funny'}
page = requests.get("https://api.datamuse.com/words", params=kval_pairs)
print(page.text[:150])  # print the first 150 characters
print(page.url)         # print the url that was fetched
import requests
def get_rhymes(word):
    baseurl = "https://api.datamuse.com/words"
    params_diction = {}  # Set up an empty dictionary for query parameters
    params_diction["rel_rhy"] = word
    params_diction["max"] = "3"  # get at most 3 results
    resp = requests.get(baseurl, params=params_diction)
    word_ds = resp.json()  # a list of dictionaries, one per rhyming word
    return [d['word'] for d in word_ds]
print(get_rhymes("funny"))
['money', 'honey', 'sunny']
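get_rhymes() assumes the request succeeds. A hypothetical variant (get_rhymes_safe is my own name, not from the reading) could check the status code before decoding the JSON:

import requests

def get_rhymes_safe(word):
    baseurl = "https://api.datamuse.com/words"
    resp = requests.get(baseurl, params={"rel_rhy": word, "max": "3"})
    if resp.status_code != 200:  # anything other than 200 means the request failed
        return []
    return [d['word'] for d in resp.json()]

print(get_rhymes_safe("funny"))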
import requests

def requestURL(baseurl, params = {}):
    # This function accepts a base URL and a params dictionary as inputs.
    # It builds the same GET request that requests.get() would send, but
    # without sending it, and returns the full URL of the data you want to get.
    req = requests.Request(method = 'GET', url = baseurl, params = params)
    prepped = req.prepare()
    return prepped.url
print(requestURL(some_base_url, some_params_dictionary))
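For example, with the Datamuse base URL and parameters used earlier, the helper shows the URL that would be fetched without making any network request:

print(requestURL("https://api.datamuse.com/words", {"rel_rhy": "funny", "max": "3"}))
# https://api.datamuse.com/words?rel_rhy=funny&max=3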
Fortunately, the response object returned by requests.get() has the .url attribute, which will help you with debugging. It’s a good practice during program development to have your program print it out. This is easier than calling requestURL() but is only available to you if requests.get() succeeds in returning a Response object.
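For example, printing .url (and the status code) right after the call shows exactly what was fetched:

import requests

resp = requests.get("https://api.datamuse.com/words", params={"rel_rhy": "funny"})
print(resp.url)          # the full URL that was actually fetched
print(resp.status_code)  # 200 means the request succeeded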
import requests_with_caching
# requests.get() sends a new network request every time it is called, so this module
# stores responses in a cache file and reuses them later; that is especially handy
# while you are first writing and debugging your code.
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=happy", permanent_cache_file="datamuse_cache.txt")
print(res.text[:100])
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=happy", permanent_cache_file="datamuse_cache.txt")
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=funny", permanent_cache_file="datamuse_cache.txt")
new; adding to cache
[{"word":"nappy","score":703,"numSyllables":2},{"word":"snappy","score":698,"numSyllables":2},{"word
found in page-specific cache
found in permanent_cache
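The module also provides clear_cache() (see its source below), which empties the page-specific cache file so that the next call fetches fresh data:

requests_with_caching.clear_cache()  # clears the temporary, page-specific cache
res = requests_with_caching.get("https://api.datamuse.com/words?rel_rhy=funny", permanent_cache_file="datamuse_cache.txt")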
import requests
import json

PERMANENT_CACHE_FNAME = "permanent_cache.txt"
TEMP_CACHE_FNAME = "this_page_cache.txt"

def _write_to_file(cache, fname):
    with open(fname, 'w') as outfile:
        outfile.write(json.dumps(cache, indent=2))

def _read_from_file(fname):
    try:
        with open(fname, 'r') as infile:
            res = infile.read()
            return json.loads(res)
    except:
        return {}

def add_to_cache(cache_file, cache_key, cache_value):
    temp_cache = _read_from_file(cache_file)
    temp_cache[cache_key] = cache_value
    _write_to_file(temp_cache, cache_file)

def clear_cache(cache_file=TEMP_CACHE_FNAME):
    _write_to_file({}, cache_file)

def make_cache_key(baseurl, params_d, private_keys=["api_key"]):
    """Makes a long string representing the query.
    Alphabetize the keys from the params dictionary so we get the same order each time.
    Omit keys with private info."""
    alphabetized_keys = sorted(params_d.keys())
    res = []
    for k in alphabetized_keys:
        if k not in private_keys:
            res.append("{}-{}".format(k, params_d[k]))
    return baseurl + "".join(res)

def get(baseurl, params={}, private_keys_to_ignore=["api_key"], permanent_cache_file=PERMANENT_CACHE_FNAME, temp_cache_file=TEMP_CACHE_FNAME):
    full_url = requests.requestURL(baseurl, params)
    cache_key = make_cache_key(baseurl, params, private_keys_to_ignore)
    permanent_cache = _read_from_file(permanent_cache_file)
    temp_cache = _read_from_file(temp_cache_file)
    if cache_key in temp_cache:
        print("found in temp_cache")
        # make a Response object containing text from the cache, and the full_url that would have been fetched
        return requests.Response(temp_cache[cache_key], full_url)
    elif cache_key in permanent_cache:
        print("found in permanent_cache")
        # make a Response object containing text from the cache, and the full_url that would have been fetched
        return requests.Response(permanent_cache[cache_key], full_url)
    else:
        print("new; adding to cache")
        # actually request it
        resp = requests.get(baseurl, params)
        # save it
        add_to_cache(temp_cache_file, cache_key, resp.text)
        return resp
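Note that requests.requestURL() and the Response(text, url) constructor used above are features of the textbook's sandboxed requests module; the standard requests library provides neither. A rough sketch of the same caching idea on top of the real library (the function and file names here are my own, not from the reading) could cache just the response text:

import json
import requests

CACHE_FNAME = "my_cache.txt"  # hypothetical cache file name

def cached_get_text(baseurl, params={}):
    # Return the response body for baseurl + params, fetching over the network only once.
    try:
        with open(CACHE_FNAME, 'r') as f:
            cache = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}
    key = baseurl + json.dumps(params, sort_keys=True)
    if key in cache:
        print("found in cache")
        return cache[key]
    print("new; adding to cache")
    resp = requests.get(baseurl, params=params)
    cache[key] = resp.text
    with open(CACHE_FNAME, 'w') as f:
        json.dump(cache, f, indent=2)
    return resp.text

print(cached_get_text("https://api.datamuse.com/words", {"rel_rhy": "funny"})[:100])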
import requests_with_caching
import json

parameters = {"term": "Ann Arbor", "entity": "podcast"}
iTunes_response = requests_with_caching.get("https://itunes.apple.com/search", params=parameters, permanent_cache_file="itunes_cache.txt")

py_data = json.loads(iTunes_response.text)
for r in py_data['results']:
    print(r['trackName'])
found in permanent_cache
Ann Arbor Stories | Ann Arbor District Library
Vineyard Church of Ann Arbor Sermon Podcast
Harvest Mission Community Church (Ann Arbor) Sermons
Grace Bible Church Ann Arbor
Grace Ann Arbor Church
Sermons from First Pres
Antioch Ann Arbor
Blue Ocean Faith Ann Arbor Sunday Sermons
It’s Hot In Here
Radiant Church - Ann Arbor: Sermons
Calvary Sunday Messages
Fellow Youths | Ann Arbor District Library
Behind The Marquee | Ann Arbor District Library
Ann Arbor SPARK CEO Podcast
Bethel AME - Ann Arbor
Sermons – NewLifeA2.org
Ann Arbor West Side UMC Sermons
Martin Bandyke Under Covers | Ann Arbor District Library
Grace Ann Arbor Podcast
Mosaic Church of Ann Arbor
A2 City News
Presenting Alfred Hitchcock Presents | Ann Arbor District Library
Redeemer Ann Arbor
Zion Lutheran Ann Arbor
Living Writers
2|42 Community Church - Ann Arbor
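Each result dictionary holds more fields than the track name. For instance, the iTunes Search API also returns an artistName field for each result (noted here without re-running the query), so you could pair each podcast with its publisher:

for r in py_data['results'][:5]:
    print(r['trackName'], '-', r['artistName'])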
import requests_with_caching
import json

flickr_key = 'yourkeyhere'

def get_flickr_data(tags_string):
    baseurl = "https://api.flickr.com/services/rest/"
    params_diction = {}
    params_diction["api_key"] = flickr_key  # from the above global variable
    params_diction["tags"] = tags_string    # must be a comma separated string to work correctly
    params_diction["tag_mode"] = "all"
    params_diction["method"] = "flickr.photos.search"
    params_diction["per_page"] = 5
    params_diction["media"] = "photos"
    params_diction["format"] = "json"
    params_diction["nojsoncallback"] = 1
    flickr_resp = requests_with_caching.get(baseurl, params=params_diction, permanent_cache_file="flickr_cache.txt")
    print(flickr_resp.url)  # Paste the result into the browser to check it out...
    return flickr_resp.json()

result_river_mts = get_flickr_data("river,mountains")

photos = result_river_mts['photos']['photo']
for photo in photos:
    owner = photo['owner']
    photo_id = photo['id']
    url = 'https://www.flickr.com/photos/{}/{}'.format(owner, photo_id)
    print(url)
found in permanent_cache
https://api.flickr.com/services/rest/?api_key=yourkeyhere&tags=river%2Cmountains&tag_mode=all&method=flickr.photos.search&per_page=5&media=photos&format=json&nojsoncallback=1
https://www.flickr.com/photos/45934971@N07/44858440865
https://www.flickr.com/photos/145056248@N07/43953569330
https://www.flickr.com/photos/145056248@N07/43953448610
https://www.flickr.com/photos/131540074@N08/44857602655
https://www.flickr.com/photos/145056248@N07/44857423045
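Each photo dictionary returned by flickr.photos.search also includes a title field, so (as a small, unverified extension of the example) you could label each link:

for photo in result_river_mts['photos']['photo']:
    page_url = 'https://www.flickr.com/photos/{}/{}'.format(photo['owner'], photo['id'])
    print(photo['title'], '->', page_url)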
Each device has a globally distinct IP address, which is a 32-bit number. Usually an IP address is represented as a sequence of four decimal numbers, each in the range 0 to 255. For example, when I checked the IP address for my laptop just now, it was 141.211.203.248. Any IP address beginning with 141.211 is for a device at the University of Michigan. When I take my laptop home and connect to a network there, my laptop gets a different IP address that it uses there.

Data is chopped up into reasonably sized packets (up to 65,535 bytes, but usually much smaller). Each data packet has a header that includes the destination IP address. Each packet is routed independently, getting passed on from one computing device to another until it reaches its destination. The computing devices that do that packet forwarding are called routers. Each router keeps an address table that says, when it gets a packet for some destination address, which of its neighbors it should pass the packet on to. The routers are constantly talking to each other, passing information about how they should update their routing tables.

The system was designed to be resistant to local damage. If some of the routers stop working, the rest of the routers talk to each other and start routing packets a different way, so that packets still reach their intended destination as long as some path to it exists. It is this technical capability that has spawned metaphoric quotes like this one from John Gilmore: “The Net interprets censorship as damage and routes around it.” At the destination, the packets are reassembled into the original data message.
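As a small illustration (the socket module is from the Python standard library and is not used in the course reading), you can look up the IP address that a host name such as api.datamuse.com currently resolves to:

import socket

# Resolve a host name to the IPv4 address that packets would be sent to.
print(socket.gethostbyname("api.datamuse.com"))
# prints a dotted-quad address such as '203.0.113.7' (example only; the real value varies)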