hackatbrown / apis

Let's bring Brown's data into the 21st century.
http://api.students.brown.edu

Course Scraper Improvement: Move headers into class, out of methods #29

Closed hpincket closed 8 years ago

hpincket commented 8 years ago

Another quick task. I think there are three places in the scraper where I define a 'headers' dict that is used for making HTTP requests to selfservice. Notice that three of the four headers are the same in all three methods (only the 'Referer' header differs). It would be nice to keep those three shared headers in one place associated with the class. Then each method can simply add the 'Referer' header before making its request. This will shorten our code and make it easier to maintain.

Current:

    headers = {
        'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/43.0.2357.130 Chrome/43.0.2357.130 Safari/537.36",
        'Referer': url,
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Origin': 'https://selfservice.brown.edu'
    }
    r = self.s.post(url, data=payload, headers=headers)

Possible New:

    # Copy first so the shared class-level dict is not mutated.
    my_headers = dict(SelfserviceSession.headers)
    my_headers['Referer'] = url
    r = self.s.post(url, data=payload, headers=my_headers)

I know very little about object-oriented programming in Python. Should it be static? Private? idk!

jbrower95 commented 8 years ago

My tentative solution is in 4b6fd3bab87c7202a1a76a43a549c611c8a6c009. We should deepcopy the shared dictionary so that each caller gets its own copy to build a request from. Callers get that copy through a method which performs the deepcopy, so the shared headers are never modified.
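
For anyone following along, here is a minimal sketch of that idea. It is not the code from the commit. The names `_BASE_HEADERS` and `headers_for` and the example URL are made up for illustration, and the header values are copied from the snippet in the issue description.

```python
import copy


class SelfserviceSession:
    # Headers shared by every request. The leading underscore is only a
    # convention marking the attribute as internal; Python does not enforce it.
    # (_BASE_HEADERS is a hypothetical name, not taken from the repository.)
    _BASE_HEADERS = {
        'User-Agent': "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/43.0.2357.130 Chrome/43.0.2357.130 Safari/537.36",
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Origin': 'https://selfservice.brown.edu',
    }

    @classmethod
    def headers_for(cls, referer):
        """Return a fresh headers dict with 'Referer' set (hypothetical helper)."""
        # Copy so that callers never mutate the shared class-level dict.
        headers = copy.deepcopy(cls._BASE_HEADERS)
        headers['Referer'] = referer
        return headers


# Example URL is a placeholder, not a real selfservice endpoint.
h = SelfserviceSession.headers_for('https://selfservice.brown.edu/example')
```

Each scraper method could then call something like `self.s.post(url, data=payload, headers=self.headers_for(url))`. Since the header values here are plain strings, a shallow copy would behave the same; deepcopy only matters if nested values are added later.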