TechnionYP5779 / team4


Automatically download exams from webcourse #169

Closed by menhel 5 years ago

menhel commented 5 years ago

For a given course number, download all exams and formal solutions. The output will then be given to the sorting script to complete the job.
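As a rough sketch of the "given course number" step: the page URL could be assembled from the course number, assuming the `<base>/<course number>/<semester>/<page>` layout seen in the example URL later in this thread. The helper name `build_page_url` and its parameters are hypothetical, and the name of the page that actually lists the exams is not given here, so it is left as an argument.

```python
# Assumed base address, taken from the example URL in this thread.
BASE_URL = "https://webcourse.cs.technion.ac.il"


def build_page_url(course_number, semester, page):
    """Build a webcourse page URL (hypothetical helper).

    Assumes the layout <base>/<course number>/<semester>/<page>.
    """
    return "/".join([BASE_URL, str(course_number), semester, page])


# Reproduces the URL used by the script in this thread.
print(build_page_url(236521, "Winter2017-2018", "hw.html"))
```

The resulting URL can then be handed to whatever download step is chosen, and the downloaded files passed on to the sorting script.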

0xYuval commented 5 years ago

@menhel, do you remember the name of the similar Chrome plug-in?

ron4548 commented 5 years ago

[image] @yuvalron

0xYuval commented 5 years ago

@ron4548 This plug-in rates the courses; it is not the one I am talking about.

0xYuval commented 5 years ago

Here is an example script that takes a webcourse URL and downloads all PDF files linked from that page into a directory. Since it is written in Python, I do not know how to commit it into the existing Java project:

from bs4 import BeautifulSoup
from urllib.request import urlopen
import os
import re

# extract exam details to construct the file name
def createFileName(file_url):
    l = file_url.split("/")
    course_number = l[3]
    exam = l[-1]
    return course_number + "-" + exam

# download the file from url into specific directory
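def download(file_url, directory):
    # read the whole response; the connection is closed on leaving the block
    with urlopen(file_url) as response:
        data = response.read()
    filename = os.path.join(directory, createFileName(file_url))
    # write in binary mode; the file is closed even if the write fails
    with open(filename, 'wb') as file_:
        file_.write(data)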

# get all pdf links in a webpage
def getLinks(webcourse_url):
    html_page = urlopen(webcourse_url)
    soup = BeautifulSoup(html_page, "html.parser")
    webcourse = "https://webcourse.cs.technion.ac.il"
    links = []
    for link in soup.find_all('a'):
        href = link.get('href')
        if href:  # skip anchors without an href, which would break the filter below
            links.append(href)
    res = [(webcourse + link) for link in links if '.pdf' in link] # only pdf
    return res

# download all pdf files in a webpage
def downloadLinks(webcourse_url, directory):
    # create the target directory if it does not exist yet
    os.makedirs(directory, exist_ok=True)
    links = getLinks(webcourse_url)
    for link in links:
        download(link, directory)

if __name__ == "__main__":
    downloadLinks("https://webcourse.cs.technion.ac.il/236521/Winter2017-2018/hw.html", "C:/Users/yuval/Desktop/236521")
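
The link-filtering step could be made a bit more robust. The following is only a sketch, not part of the script above: it uses the standard-library `HTMLParser` instead of BeautifulSoup so that it runs without extra packages, resolves each href against the page URL with `urljoin` (so both relative and absolute links work), and checks that the URL path ends in `.pdf` rather than merely containing it. The names `PdfLinkParser` and `extract_pdf_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:  # anchors without an href attribute
            return
        # Resolve relative links against the page they were found on.
        absolute = urljoin(self.base_url, href)
        # Compare the path only, case-insensitively, ignoring any query string.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_links.append(absolute)


def extract_pdf_links(html_text, base_url):
    """Return all PDF links found in html_text, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html_text)
    return parser.pdf_links
```

Because it works on an HTML string, this can be tried on a saved copy of a page without touching the network. Note that the suffix check is stricter than the `'.pdf' in link` test above, so the two may disagree on unusual links.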

0xYuval commented 5 years ago

175