Coursera professors often reorganize course content while a course is running. This changes topic and unit names, which causes coursera-dl to download the same content again under different names.
This change aims to detect such renames and reuse files that have already been downloaded. Directory renames are not detected yet.
A separate deduplication project
I've also created a separate project for removing duplicates in already downloaded content. This one is capable of detecting directory renames. Feel free to include this in your project or just put a link in your project description.
https://github.com/ilfats/dedup.git
File rename detection
This code uses the normalized file name and the file size to detect renamed files. The Coursera website does not seem to report a content length for small files (e.g. subtitles), so renames are not detected for those.
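To illustrate the idea of matching on normalized file name plus size, here is a minimal sketch. It is not the code from this change: the function names (`normalize_name`, `index_existing_files`, `find_renamed`) and the normalization rules (drop a leading numeric prefix, lowercase, collapse punctuation) are assumptions made for illustration only. A missing content length is modeled as `None`, in which case no match is attempted.

```python
import os
import re


def normalize_name(filename):
    """Hypothetical normalization (an assumption, not the rule used by this change):
    drop a leading numeric prefix, lowercase, collapse non-alphanumeric runs."""
    stem, ext = os.path.splitext(filename)
    stem = re.sub(r"^\d+[\s_\-.]*", "", stem)  # strip an ordering prefix such as "03_"
    stem = re.sub(r"[^a-z0-9]+", "_", stem.lower()).strip("_")
    return stem + ext.lower()


def index_existing_files(root):
    """Map (normalized name, size in bytes) -> path for every file under root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            key = (normalize_name(name), os.path.getsize(path))
            index.setdefault(key, path)  # keep the first file seen for each key
    return index


def find_renamed(index, new_filename, content_length):
    """Return the path of an already downloaded copy, or None if there is no match.

    content_length is None when the server did not report a size; the file
    cannot be matched in that case and has to be downloaded again.
    """
    if content_length is None:
        return None
    return index.get((normalize_name(new_filename), content_length))
```

A caller could build the index once per course directory and, for each file about to be downloaded, move or link the path returned by `find_renamed` instead of fetching the file again. Matching on name and size alone can produce false positives, so a real implementation might want an additional check such as a checksum.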