coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

In favour of numeric selection of the course to download #221

Closed phonx closed 9 years ago

phonx commented 9 years ago

Was there any good reason for removing this feature? Cutting and pasting the course URL is very annoying and prone to error.

Usage of sections needs more documentation.

Also, any chance of having a history of what has already been downloaded? A flat file, CSV, or SQLite should be sufficient, stored locally so the user can view it or export it to Excel for their records :)

iemejia commented 9 years ago

There are three possible ways to identify a course:

  1. By number from the list. This is probably the worst approach, since the default order of the list changes: edx presents the courses you visited most recently first, so the number wouldn't be reliable (in particular if you use edx-dl from another script). Even if we sorted the courses, adding a new course that falls between two existing ones would break the ids of the other courses.
  2. By its short name. The short name of the course, e.g. 'edX/DemoX.1/2014', could be used for this, but we lose the information about the institution, which is part of the URL. This would make the -x option mandatory, and it could also be 'error-prone', as you said, for example if you leave -x stanford but pass the id of an edX course.
  3. By URL (the current approach). This approach does not lose information, and the URL is easy to find on the web page: the user can right-click any course link in the courses dashboard, copy it, and paste it into the command line. If you think about it, it is less error-prone to select all, copy, and paste the URL than to cut out a part of it and paste that (case 2). I agree that it seems less 'user friendly', but it is in no way more error-prone.

We might eventually accept the short names (2), but I don't see a great improvement. With names like 'course-v1:TUMx+AUTONAVx+2T2015' it is really easy to make a mistake if the user types it, and if they are going to copy and paste it anyway, it makes more sense to me that they copy and paste the full URL rather than just a part of it.
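To make the comparison concrete, here is a minimal sketch of how a full course URL can be split into the site and the course id, which is the information a short name alone would lose. The helper name `split_course_url` and the URL layout (`<site>/courses/<course id>/...`) are assumptions for illustration, based only on the two id styles quoted above; this is not edx-dl's actual parsing code.

```python
from urllib.parse import urlparse


def split_course_url(url):
    """Return (site, course_id) for a URL shaped like <site>/courses/<id>/...

    Hypothetical helper: the URL layout is an assumption, not edx-dl's code.
    """
    parsed = urlparse(url)
    site = f"{parsed.scheme}://{parsed.netloc}"
    path = parsed.path.strip("/")
    prefix = "courses/"
    if not path.startswith(prefix):
        raise ValueError(f"not a course URL: {url}")
    rest = path[len(prefix):]
    if rest.startswith("course-v1:"):
        # New-style ids such as 'course-v1:TUMx+AUTONAVx+2T2015' are one segment.
        course_id = rest.split("/")[0]
    else:
        # Old-style ids such as 'edX/DemoX.1/2014' span three segments.
        parts = rest.split("/")
        if len(parts) < 3:
            raise ValueError(f"incomplete course id in URL: {url}")
        course_id = "/".join(parts[:3])
    return site, course_id
```

With the URL, both pieces come from a single pasted string, so there is no separate site option that can disagree with the course id.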

iemejia commented 9 years ago

Usage of sections needs more documentation.

Documentation is one of the areas where you can help us and contribute; your pull requests are welcome.

iemejia commented 9 years ago

Also, any chance of having a history of what has already been downloaded? A flat file, CSV, or SQLite should be sufficient, stored locally so the user can view it or export it to Excel for their records :)

Saving user state goes beyond the goal of this script. In addition, it would make the script harder to maintain, and since we are short on contributors I don't think it is a good idea for the moment. Sorry.

phonx commented 9 years ago

Hi iemejia,

Thank you for your always prompt and useful feedback. Documentation, oh boy: I'd need to read the whole program, which would take time I don't always have, but I'll try.

How about a flat file on the local machine for the download history, which just appends a record of each successful download of a whole course or of sections? The Coursera downloader seems able to avoid duplicate downloads; does the edx downloader have that feature?
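A flat-file history of this kind could be sketched roughly as follows. This is only an illustration of the idea, not a feature of edx-dl: the function names, the CSV columns, and the file location are all assumptions.

```python
import csv
import os
from datetime import datetime, timezone


def record_download(history_path, course_url, section):
    """Append one row for a successful download (hypothetical helper)."""
    is_new = not os.path.exists(history_path)
    with open(history_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            # Write the header only when the file is first created.
            writer.writerow(["timestamp", "course_url", "section"])
        timestamp = datetime.now(timezone.utc).isoformat()
        writer.writerow([timestamp, course_url, section])


def load_history(history_path):
    """Return all recorded rows as dictionaries, oldest first."""
    if not os.path.exists(history_path):
        return []
    with open(history_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Because rows are only ever appended, the file stays in chronological order and opens directly in a spreadsheet.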

iemejia commented 9 years ago

Hi,

Yes, avoiding downloading twice is a good improvement to have, but it is harder for us than for coursera-dl/coursera because we cannot know in advance the correct size of some resources to check whether they have been downloaded correctly (e.g. the ones downloaded by youtube-dl, or even the subtitles, which we have to transform internally).

Notice that the downloads done with youtube-dl automatically resume (or skip), so even if you make the costly reconnections, you never download the resources twice (with the exception of the ones in PR #146).
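When the expected size is unknown, one common workaround is to write to a temporary name and rename only on success, so that the presence of the final file implies a completed download. The sketch below shows that general technique; it is not how edx-dl currently works, and the helper names and the `.part` suffix are assumptions.

```python
import os


def needs_download(target_path):
    """True unless a non-empty file already exists at target_path."""
    return not (os.path.isfile(target_path) and os.path.getsize(target_path) > 0)


def save_atomically(target_path, data):
    """Write data to a '.part' file, then rename it into place.

    An interrupted run leaves only the '.part' file behind, so
    needs_download() still reports the resource as missing.
    """
    part_path = target_path + ".part"
    with open(part_path, "wb") as f:
        f.write(data)
    # os.replace overwrites the destination if it already exists.
    os.replace(part_path, target_path)
```

This does not detect a file that was corrupted after being written; it only avoids treating a partial download as finished.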