Closed lsoliveira459 closed 10 years ago
Thanks very much for this. Will look to go through it and merge by the weekend.
Thanks again. Had a quick try but fails for me with the "did you accept the honour code" error on one of my classes (malsoftware-001). Works fine if I switch back to my master branch. Also, why the extra TOKEN_URL and why hardcoded to a specific class (ml)?
I'm assuming you accepted the honour code. Is there anything special about this course that would trigger this? I'm a little clueless about this error, since I don't recall touching this exception.
Fetching TOKEN_URL gives us access to the "csrf_token". It's hardcoded simply to avoid having to find a URL dynamically (a solution I provided, commented out). As a side note, this ML class was the first class offered by Coursera.
There is nothing special about that course. Looking closely it turns out that coursera replies with "Please use a modern browser with JavaScript enabled to use Coursera." Sporadically I also get:
  line 259, in get_page
    page = response.content
AttributeError: 'NoneType' object has no attribute 'content'
Switching back to my master branch and it all works fine. So there is a difference with how mechanize makes requests and the requests lib which coursera.org is not liking. At least for me on OSX and linux with python 2.7.3.
Wrt the token_url, I get that but still don't see the need for it. Why not use the class name as passed by the user (as per the original code)? This prevents possible breakage if the ml class disappears or gets renamed.
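A minimal sketch of that suggestion: derive the URL from the class name the user passed instead of hardcoding `ml`. The URL template, the function name `build_token_url`, and the character check are assumptions made for illustration, not the script's actual values.

```python
import re

# Hypothetical template; the real script may use a different host or path.
TOKEN_URL_TEMPLATE = 'https://class.coursera.org/{class_name}/'


def build_token_url(class_name):
    """Return the class page URL for the class name given by the user."""
    class_name = class_name.strip()
    # Assumption: class names contain only letters, digits and hyphens
    # (for example 'malsoftware-001' or 'ml').
    if not re.match(r'^[A-Za-z0-9-]+$', class_name):
        raise ValueError('invalid class name: %r' % class_name)
    return TOKEN_URL_TEMPLATE.format(class_name=class_name)
```

A well-formed but mistyped name still produces a URL here; that kind of mistake would only surface when the request is actually made.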
"There is nothing special about that course. Looking closely it turns out that coursera replies with 'Please use a modern browser with JavaScript enabled to use Coursera.' Sporadically I also get:
  line 259, in get_page
    page = response.content
AttributeError: 'NoneType' object has no attribute 'content'"
I'm a bit busy right now too, so I'll look into that a little later. Maybe next weekend.
"Switching back to my master branch and it all works fine. So there is a difference with how mechanize makes requests and the requests lib which coursera.org is not liking. At least for me on OSX and linux with python 2.7.3."
I wouldn't say it's something about Coursera, but just in case I'll identify the requests as coming from Firefox or similar, and check whether the request leaves anything behind after the handler is closed. If it's an OS problem, I can't see how I could look into it.
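A minimal sketch of identifying requests as coming from Firefox by setting the User-Agent header. It uses the standard library's `urllib.request` as a stand-in so it runs without a network connection or third-party packages; the User-Agent string and the helper names are assumptions for illustration.

```python
from urllib.request import Request

# Example Firefox-style User-Agent string (an assumption; any current
# browser string could be used instead).
FIREFOX_UA = ('Mozilla/5.0 (X11; Linux x86_64; rv:17.0) '
              'Gecko/20100101 Firefox/17.0')


def browser_headers():
    """Headers that make a request look like it comes from Firefox."""
    return {'User-Agent': FIREFOX_UA}


def build_request(url):
    """Build (but do not send) a request carrying the browser headers."""
    return Request(url, headers=browser_headers())
```

With Requests, the same dictionary can be passed as `requests.get(url, headers=browser_headers())`. Whether this is enough to avoid the "modern browser" reply has not been verified here.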
"Wrt the token_url, I get that but still don't see the need for it. Why not use the class name as passed by the user (as per the original code)? This prevents possible breakage if the ml class disappears or gets renamed."
And what about the user mistyping the class name? The commented-out code I included, repeated below with a few notes, fetches a JSON document with information about all the classes (found by inspection) and loads it gradually (128 bytes at a time), searching for a link to a class page. I thought this was the most robust approach, though also a bit slower. I left it as a contingency plan for the case you mentioned.
    import requests

    # CLASSES_URL is defined elsewhere in the script.
    JSON_TOKEN = b'"preview_link"'

    # Open a streaming connection so the body is fetched lazily
    all_classes_json = requests.get(CLASSES_URL, stream=True)
    if all_classes_json.status_code == 200:
        buf = b''
        # Iterate over the response, retrieving 128 bytes at a time
        for chunk in all_classes_json.iter_content(128):
            buf += chunk
            try:
                # Get the indexes that delimit the URL we want
                i1 = buf.index(JSON_TOKEN)
                i2 = buf.index(b'"', i1 + len(JSON_TOKEN) + 2)
            except ValueError:
                # The information is not loaded yet: grab 128 more bytes and retry
                continue
            # All's fine
            TOKEN_URL = buf[i1 + len(JSON_TOKEN) + 2:i2].decode('ascii')
            break
        all_classes_json.close()
    else:
        print('Please make sure you are connected to the internet.')
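To try the search logic without a network connection, the same incremental scan can be sketched as a function over any iterable of byte chunks. The function name `find_preview_link` and the sample JSON are made up for illustration; only the `"preview_link"` key comes from the code above.

```python
JSON_TOKEN = b'"preview_link"'


def find_preview_link(chunks):
    """Return the first preview_link URL in an iterable of byte chunks, or None."""
    buf = b''
    for chunk in chunks:
        buf += chunk
        i1 = buf.find(JSON_TOKEN)
        if i1 == -1:
            continue  # key not seen yet, read more
        start = i1 + len(JSON_TOKEN) + 2  # skip the ':"' that follows the key
        i2 = buf.find(b'"', start)
        if i2 == -1:
            continue  # closing quote not loaded yet, read more
        return buf[start:i2].decode('ascii')
    return None  # stream ended without a complete link


# Made-up sample data, split into 8-byte chunks to mimic streaming
sample = b'{"name":"x","preview_link":"https://example.org/ml/","id":1}'
chunks = [sample[i:i + 8] for i in range(0, len(sample), 8)]
print(find_preview_link(chunks))
```

Returning `None` when the stream ends makes the "link never found" case explicit instead of looping forever or raising from the iterator.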
I replicated the error. I'll look into it now.
Have you seen this? (https://www.facebook.com/Coursera/posts/439625136155480)
closed after merging #100
To make it possible to tell whether the program has halted, I used Requests instead of Mechanize to manage the download connections. Using Requests, I added a small progress bar that shows whether the download is still progressing.
It works fully on Windows 8, but I couldn't test it on Windows 7 or Linux. Below is a small proof it's working.
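The progress bar code itself is not shown in this thread. As a rough sketch of the idea only, with hypothetical function names and layout, a text bar can be redrawn after each downloaded chunk:

```python
import sys


def render_progress(done, total, width=30):
    """Return a text bar such as '[#####-----]  50%'."""
    if total <= 0:
        # Total size unknown: show only the byte count
        return '[%s] %d bytes' % ('?' * width, done)
    fraction = min(float(done) / total, 1.0)
    filled = int(fraction * width)
    bar = '#' * filled + '-' * (width - filled)
    return '[%s] %3d%%' % (bar, int(fraction * 100))


def report(chunks, total, out=sys.stdout):
    """Consume an iterable of byte chunks, redrawing the bar after each one."""
    done = 0
    for chunk in chunks:
        done += len(chunk)
        # '\r' returns to the start of the line so the bar is overwritten
        out.write('\r' + render_progress(done, total))
        out.flush()
    out.write('\n')
    return done
```

With Requests, the chunks could come from `response.iter_content(...)` on a response opened with `stream=True`, and the total from the `Content-Length` header when the server sends one. A bar that stops advancing then indicates a stalled download.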