coursera-dl / edx-dl

A simple tool to download video lectures from edx.org (and other openedx sites)
GNU Lesser General Public License v3.0

Don't replace spaces with _ in folder and filenames #615

Open JohnVeness opened 4 years ago

JohnVeness commented 4 years ago

Subject of the issue

This is a personal preference, but I would prefer it if spaces were not replaced with _ in folder names and filenames.

Your environment

Steps to reproduce

  1. edx-dl -s -u <censored> https://courses.edx.org/courses/course-v1:MITx+6.002.1x+2T2019/course/ --filter-section 3
  2. Wait for it to download all the videos and subtitles
  3. Observe the downloaded folder and filenames

Expected behaviour

"Nice" looking folder names, such as "Circuits and Electronics 1- Basic Circuit Analysis" and "03-Math Review".

Actual behaviour

"Messy" looking folder names, such as "Circuits_and_Electronics_1-_Basic_Circuit_Analysis" and "03-Math_Review".

I realise this is a personal preference, so it should perhaps be controlled by a command-line option, leaving the current behaviour available for those who prefer it.

A quick fix would be to remove the line s = s.strip().replace(' ', '_') in clean_filename. Alternatively, I notice that the function accepts a second argument: passing True would not only stop spaces being replaced but would also leave brackets and other characters alone, which would personally suit me. As far as I can see, though, the calling code never passes that second argument.
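To illustrate the kind of option I mean, here is a minimal sketch. It is not the project's actual code: clean_filename_sketch, its keep_spaces parameter, and the --keep-spaces flag are hypothetical names I made up for this example, and the set of characters it keeps is an assumption.

```python
import argparse
import string


def clean_filename_sketch(s, keep_spaces=False):
    """Hypothetical stand-in for clean_filename (not the edx-dl code).

    Sanitises a name and only converts spaces to underscores when
    keep_spaces is False.
    """
    # Swap characters that are problematic on common filesystems.
    s = s.replace(':', '-').replace('/', '-').replace('\x00', '-')
    s = s.replace('\n', '').strip()

    # Assumed whitelist of characters allowed in the result.
    valid_chars = '-_.' + string.ascii_letters + string.digits
    if keep_spaces:
        valid_chars += ' '          # leave spaces untouched
    else:
        s = s.replace(' ', '_')     # behaviour described in this issue

    return ''.join(c for c in s if c in valid_chars)


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag, assumed not to exist in edx-dl today.
    parser.add_argument('--keep-spaces', action='store_true',
                        help='do not replace spaces with _ in names')
    return parser


# Parse an explicit argument list so the example runs on its own.
args = build_parser().parse_args(['--keep-spaces'])
print(clean_filename_sketch('03-Math Review'))
print(clean_filename_sketch('03-Math Review', keep_spaces=args.keep_spaces))
```

With the flag off by default, existing users would keep the underscore names, and anyone who wants the "nice" names could opt in.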