coiby / edx-downloader

A simple tool to download video lectures from edx.org.

KeyError: u'href' #1

Open kcw-022 opened 9 years ago

kcw-022 commented 9 years ago

Hi, I keep hitting the following KeyError when running the standalone Windows version. Any idea what's going wrong?

Traceback (most recent call last):
  File "", line 650, in
  File "", line 417, in main
  File "D:\PyInstaller-2.1\edx-dl\build\edx-dl\out00-PYZ.pyz\bs4.element", line 905, in __getitem__
KeyError: u'href'

ghost commented 9 years ago

same problem

coiby commented 9 years ago

@kevw22 @Hasset Sorry for the late reply. Could you tell me which course? I can't replicate this problem.

ghost commented 9 years ago

Hi @coiby, the error occurs during initialization.

coiby commented 9 years ago

@Hasset Thank you!

Based on the info you provided, something goes wrong when parsing article[class='course']. Normally, COURSE should look like this:

<article class="course honor">
<section class="details">
<div aria-hidden="true" class="wrapper-course-image">
<a class="cover" href="/courses/McGillX/Body101x/1T2015/info">
<img alt="Body101x The Body Matters Home Page" class="course-image" src="/c4x/McGillX/Body101x/asset/Body101x_thumbnail.jpeg"/>
</a>
...
</div>
</section>
...
</article>

Could you print the contents of COURSE?

for COURSE in COURSES:
    c_name = COURSE.h3.text.strip()
    print(COURSE)  # add this before line 417
    c_link = BASE_URL + COURSE.a['href']
    if c_link.endswith('info') or c_link.endswith('info/'):
        state = 'Started'
    else:
        state = 'Not yet'
    courses.append((c_name, c_link, state))
numOfCourses = len(courses)

ghost commented 9 years ago

Hi @coiby ,

coiby commented 9 years ago

@Hasset I'm sorry, but what I meant was for you to add the debugging code and run the program again so that it prints COURSE.

ghost commented 9 years ago

Hi @coiby , I don't know how to run the debugging code.

ghost commented 9 years ago

Hi @coiby , I got it.

coiby commented 9 years ago

@Hasset Thanks for your feedback. I've confirmed this bug. It happens because some courses that haven't started yet have no hyperlink, i.e., no href attribute. A temporary workaround is to unenroll from such courses. But I'll fix this bug before tomorrow night.
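
For anyone hitting the same error, here is a minimal sketch of one way to guard against the missing attribute. It is not necessarily the fix that ended up in edx-dl.py. The helper name course_entry, the BASE_URL value, and the choice to report 'Not yet' with a link of None are all assumptions made for illustration. It takes the anchor's attributes as a plain dict, which with BeautifulSoup would be COURSE.a.attrs.

```python
# Assumed value for illustration; the thread never shows the real BASE_URL.
BASE_URL = "https://courses.edx.org"


def course_entry(name, anchor_attrs, base_url=BASE_URL):
    """Build a (name, link, state) tuple without assuming 'href' exists.

    anchor_attrs is a dict of the <a> tag's attributes. With BeautifulSoup
    this would be COURSE.a.attrs (hypothetical call site, not tested here).
    """
    # dict.get returns None where anchor_attrs['href'] would raise KeyError.
    href = anchor_attrs.get("href")
    if href is None:
        # Assumption: a course without a link has not started yet.
        return (name, None, "Not yet")
    link = base_url + href
    # Same check as the original loop: a link ending in 'info' means started.
    state = "Started" if link.endswith(("info", "info/")) else "Not yet"
    return (name, link, state)


started = course_entry(
    "Body101x",
    {"class": ["cover"], "href": "/courses/McGillX/Body101x/1T2015/info"},
)
pending = course_entry("Hypothetical course", {"class": ["cover"]})
```

With this approach the loop can keep appending entries for every enrolled course, and the download step can skip any entry whose link is None.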

ghost commented 9 years ago

@coiby That's very good news! Hurry up! :D

coiby commented 9 years ago

@Hasset I've adopted a solution from iemejia/edx-downloader. So far only edx-dl.py has been updated. The standalone packages will be updated in a few days.

ghost commented 9 years ago

@coiby I've tested it. It runs smoothly. Thanks!