dgorissen / coursera-dl

A script for downloading course material (videos, PDFs, quizzes, etc.) from coursera.org
http://dirkgorissen.com/2012/09/07/coursera-dl-a-coursera-download-script/
GNU General Public License v3.0

File Names have become url encoded #98

Closed: eadlam closed this issue 10 years ago

eadlam commented 10 years ago

The video names have all of a sudden become semi-URL-encoded. For example, yesterday this video downloaded normally:

1 - 1 - Course Introduction About the Course (1223).mp4

Today, rather than skipping it because it was already downloaded, it started re-downloading as:

120-20120-20Course20Introduction3A20About20the20Course2028123A2329.mp4
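The garbled name looks like a percent-encoded name whose "%" characters were dropped. Here is a minimal sketch of that reading, assuming the server now sends the name percent-encoded and that filename sanitising strips "%"; the `raw` string is a reconstruction made by re-inserting "%" into the name above, not something taken from an actual response.

```python
from urllib.parse import unquote

# Hypothetical raw name: the reported name with '%' put back in front of
# each two-character escape (an assumption, not captured from Coursera).
raw = ("1%20-%201%20-%20Course%20Introduction%3A%20About%20the%20"
       "Course%20%2812%3A23%29.mp4")

# Dropping '%' without decoding (assumed sanitising behaviour) gives the
# garbled name from the report.
stripped = raw.replace("%", "")
print(stripped)

# Decoding first recovers the readable title.
decoded = unquote(raw)
print(decoded)
```

If this reading is right, decoding the name before sanitising it should restore the old filenames, so files that were already downloaded would be skipped again.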

dgorissen commented 10 years ago

Mmm, that's annoying. Will take a look at it over the weekend.

capatillo commented 10 years ago

Can't wait. A fast fix that works for me: in the file utils.py, after line 12 (g = m.group(1)), add this line:

g = urllib2.unquote(g)

danmbox commented 10 years ago

I'm seeing this too. It's due to Coursera, right? I've looked at some recent courses (downloaded with 1.5) and I see no URL-encoded files.

joedicastro commented 10 years ago

The solution of @capatillo works for me! Thanks! :+1:

dgorissen commented 10 years ago

Please issue a pull request and I will merge it.

capatillo commented 10 years ago

I only know how to pull, but I don't know how to make a pull request, sorry. It's only one line in utils.py, after line 12: if ("%" in g): g = urllib2.unquote(g)

joedicastro commented 10 years ago

@capatillo you can do a pull request by editing the file directly here in GitHub. Try it, it's easy.

danmbox commented 10 years ago

Some of us are reluctant to "fork me on github" just to be able to submit a patch (it's GitHub's fault). And you can't attach files to issues. But you can easily include diff snippets in a comment by using a fenced code block (see http://github.github.com/github-flavored-markdown/); be sure to use diff -urN orig new to generate the diff:

--- /tmp/util.py.orig   2013-10-04 10:18:40.741994978 +0300
+++ /tmp/util.py    2013-10-04 10:19:00.729459356 +0300
@@ -10,6 +10,7 @@
         pattern = 'attachment; filename="(.*?)"'
         m = re.search(pattern, cd)
         g = m.group(1)
+        if ("%" in g): g = urllib2.unquote(g)
         return sanitise_filename(g)
     except Exception:
         return ''

(with the risks that my recent edits to this comment demonstrate in case you submit a borked patch)
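For anyone on Python 3, where urllib2 no longer exists, the same idea can be sketched with urllib.parse.unquote. This is a standalone illustration rather than the project's code: the function name filename_from_content_disposition and the body of sanitise_filename are hypothetical stand-ins, and only the pattern and the unquote step are taken from the diff above.

```python
import re
from urllib.parse import unquote


def sanitise_filename(name):
    # Hypothetical stand-in for the project's helper: drop characters
    # that are commonly unsafe in filenames (including '%').
    return re.sub(r'[:/\\*?"<>|%]', "", name).strip()


def filename_from_content_disposition(cd):
    """Return a cleaned filename from a Content-Disposition value, or ''."""
    m = re.search(r'attachment; filename="(.*?)"', cd)
    if m is None:
        return ""
    g = m.group(1)
    # Decode percent-escapes before sanitising; otherwise only the '%'
    # is stripped and the hex digits stay in the name.
    if "%" in g:
        g = unquote(g)
    return sanitise_filename(g)


# Hypothetical header value, reconstructed from the filenames in this issue.
header = ('attachment; filename="1%20-%201%20-%20Course%20Introduction'
          '%3A%20About%20the%20Course%20%2812%3A23%29.mp4"')
print(filename_from_content_disposition(header))
```

With this stand-in sanitiser the example header comes out as the original name from the first comment. The explicit None check replaces the broad except Exception in the diff, which is a design preference and not part of the submitted fix.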

capatillo commented 10 years ago

@joedicastro Thanks, hope I did it right. I looked in the help, but only found material about forks, branches, etc. From the command line it's not easy either, but doing it directly here was fast and easy. https://github.com/dgorissen/coursera-dl/pull/99

joedicastro commented 10 years ago

@capatillo Yes, you did it right, thanks to you for the fast fix! :ok_hand:

eadlam commented 10 years ago

Wow, thank you everyone! That was very fast.