TRoboto / datacamp-downloader

Download your completed courses on Datacamp easily!
MIT License

Introduce V3.0 #39

Closed TRoboto closed 2 years ago

TRoboto commented 2 years ago

Changes

This PR introduces major changes to the library. These changes are summarized as follows:

I don't think I missed any major changes. I would like to hear comments on this before merging; any feedback would be greatly appreciated.

Test

To test the changes, create a new virtual environment first, then install v3.0 with:

pip install git+https://github.com/TRoboto/datacamp-downloader.git@v3.0

Now run datacamp as per the README file.

TODO before merging

jorritvm commented 2 years ago

Hey, I did some functional testing of the V3 update:

The refactored Selenium login now works for me. I tried it using set-token:

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp set-token ...
INFO: Hi, Jorrit
INFO: Active subscription found

The courses command seems to work too, although many of my completed courses are no longer provided by Datacamp:

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp courses
+-----+--------+------------------------------------------+------------+------------+------------+
| #   | ID     | Title                                    | Datasets   | Exercises  | Videos     |
+-----+--------+------------------------------------------+------------+------------+------------+
| 1   | 735    | Introduction to Python                   | 2          | 46         | 11         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 2   | 58     | Introduction to R                        | 0          | 62         | 0          |
+-----+--------+------------------------------------------+------------+------------+------------+
| 3   | 799    | Intermediate Python                      | 3          | 69         | 18         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 4   | 1532   | Python Data Science Toolbox (Part 1)     | 1          | 34         | 12         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 5   | 672    | Intermediate R                           | 0          | 67         | 14         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 6   | 22639  | Joining Data with pandas                 | 23         | 37         | 15         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 7   | 4914   | Introduction to the Tidyverse            | 1          | 34         | 16         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 8   | 1531   | Python Data Science Toolbox (Part 2)     | 2          | 34         | 12         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 9   | 1607   | Introduction to Importing Data in Python | 9          | 39         | 15         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 10  | 13369  | Writing Efficient Python Code            | 1          | 38         | 15         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 11  | 24364  | Cleaning Data in Python                  | 5          | 31         | 13         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 12  | 1606   | Intermediate Importing Data in Python    | 3          | 22         | 7          |
+-----+--------+------------------------------------------+------------+------------+------------+
| 13  | 24558  | Object-Oriented Programming in Python    | 0          | 31         | 13         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 14  | 15876  | Writing Functions in Python              | 0          | 31         | 15         |
+-----+--------+------------------------------------------+------------+------------+------------+
ERROR: Cannot get course with id: 1008.
| 15  | 5355   | Introduction to Git                      | 0          | 46         | 0          |
+-----+--------+------------------------------------------+------------+------------+------------+
ERROR: Cannot get course with id: 774.
ERROR: Cannot get course with id: 723.
| 16  | 16719  | Streamlined Data Ingestion with pandas   | 3          | 37         | 16         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 17  | 7355   | Web Scraping in Python                   | 1          | 39         | 17         |
+-----+--------+------------------------------------------+------------+------------+------------+
ERROR: Cannot get course with id: 753.
| 18  | 616    | Data Analysis in R, the data.table Way   | 0          | 27         | 10         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 19  | 15974  | Unit Testing for Data Science in Python  | 0          | 38         | 17         |
+-----+--------+------------------------------------------+------------+------------+------------+
ERROR: Cannot get course with id: 944.
| 20  | 13203  | Software Engineering for Data Scientists | 0          | 36         | 15         |
|     |        | in Python                                |            |            |            |
+-----+--------+------------------------------------------+------------+------------+------------+
| 21  | 20680  | Building Web Applications with Shiny in  | 4          | 45         | 16         |
|     |        | R                                        |            |            |            |
+-----+--------+------------------------------------------+------------+------------+------------+
| 22  | 5323   | Data Manipulation with data.table in R   | 0          | 44         | 15         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 23  | 25074  | Reshaping Data with pandas               | 4          | 37         | 15         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 24  | 1143   | Time Series Analysis in R                | 0          | 42         | 16         |
+-----+--------+------------------------------------------+------------+------------+------------+
| 25  | 14630  | Writing Efficient Code with pandas       | 3          | 31         | 14         |
+-----+--------+------------------------------------------+------------+------------+------------+
ERROR: Cannot get course with id: 1057.
| 26  | 5882   | Joining Data with data.table in R        | 8          | 34         | 13         |
+-----+--------+------------------------------------------+------------+------------+------------+

I was able to download course 1:

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp download 1
INFO: [1/1] Start to download (1) Introduction to R

Downloading [chapter 0] [==================================================] 100%
Downloading [chapter 3] [==================================================] 100%
Downloading [chapter 4] [==================================================] 100%
Downloading [chapter 5] [================                                  ] 32%
Aborted!

However, other courses failed, some worse than others...

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp download 21
ERROR: Cannot get course with id: 21.

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp download 13
ERROR: Cannot get course with id: 13.

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp download 22
INFO: [1/1] Start to download (22) None
Traceback (most recent call last):
  File "C:\Program Files\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\dev\python\datacamp_downloader_3\venv\Scripts\datacamp.exe\__main__.py", line 7, in <module>
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\typer\main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\typer\main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\datacamp_downloader\downloader.py", line 154, in download
    datacamp.download(
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\datacamp_downloader\datacamp_utils.py", line 44, in wrapper
    return f(*args, **kwargs)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\datacamp_downloader\datacamp_utils.py", line 216, in download
    self.download_course(material, path, **kwargs)
  File "d:\dev\python\datacamp_downloader_3\venv\lib\site-packages\datacamp_downloader\datacamp_utils.py", line 248, in download_course
    index + correct_path(course.slug or course.title.lower().replace(" ", "-"))
AttributeError: 'NoneType' object has no attribute 'lower'

TRoboto commented 2 years ago

Thank you for the thorough functional testing. The download command expects the ID of the course(s), not the sequential number from the first column. Please try the download command again with the course ID and let me know the result. I will look into showing a warning for that.

If you had already completed the old courses and were able to download them before, I think this should be fine. Otherwise, I will have to see where the problem lies.
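
The traceback in the previous comment ends in the expression `course.slug or course.title.lower().replace(" ", "-")`, which raises `AttributeError` when the slug is empty and the title is `None` (the log shows `(22) None`). A minimal sketch of a guard, assuming a hypothetical helper `course_folder_name` and a hypothetical `course-<id>` fallback; neither is part of the library and this is not necessarily how the bug was fixed:

```python
from typing import Optional


def course_folder_name(slug: Optional[str], title: Optional[str], course_id: int) -> str:
    """Build a folder name without assuming that slug or title is present.

    Hypothetical helper for illustration only; the names and the
    "course-<id>" fallback are assumptions, not the library's API.
    """
    if slug:
        return slug
    if title:
        # Same transformation as in the traceback: lowercase, spaces to hyphens.
        return title.lower().replace(" ", "-")
    # Neither field is available, so fall back to the numeric ID.
    return f"course-{course_id}"
```

With this shape, a lookup that returns no title produces a usable name instead of a crash, although reporting the unknown ID to the user before any download starts would arguably be the better behavior.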

jorritvm commented 2 years ago

Hi,

OK, I should have been more careful and mindful of your documentation when doing my test. It is a bit weird, though, that it accepted 1 as an ID even though none of the courses on my completed list has this ID. I tried it with the proper course ID and it works for me.

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp download 20680
INFO: [1/1] Start to download (20680) Building Web Applications with Shiny in R
Downloading [datasets] [==================================================] 100%
Downloading [chapter1.pdf] [==================================================] 100%
Downloading [ch1_1.mp4] [==================================================] 100%
Downloading [ch1_2.mp4] [==================================================] 100%
Downloading [ch1_3.mp4] [==================================================] 100%
Downloading [chapter 1] [==================================================] 100%
Downloading [chapter2.pdf] [==================================================] 100%
Downloading [ch2_1.mp4] [==================================================] 100%
Downloading [ch2_2.mp4] [==================================================] 100%
Downloading [ch2_3.mp4] [==================================================] 100%
Downloading [ch2_4.mp4] [==================================================] 100%
Downloading [chapter 2] [==================================================] 100%
Downloading [chapter3.pdf] [==================================================] 100%
Downloading [ch3_1.mp4] [==================================================] 100%
Downloading [ch3_2.mp4] [==================================================] 100%
Downloading [ch3_3.mp4] [==================================================] 100%
Downloading [ch3_4.mp4] [==================================================] 100%
Downloading [chapter 3] [==================================================] 100%
Downloading [chapter4.pdf] [==================================================] 100%
Downloading [ch4_1.mp4] [==================================================] 100%
Downloading [ch4_2.mp4] [==================================================] 100%
Downloading [ch4_3.mp4] [==================================================] 100%
Downloading [ch4_4.mp4] [==================================================] 100%
Downloading [ch4_5.mp4] [==================================================] 100%
Downloading [chapter 4] [==================================================] 100%

I have found that course videos, exercises, markdown files, datasets are all downloaded successfully.

Additionally, I tested the tracks command, and it seems to work for me too.

D:\dev\python\datacamp_downloader_3\venv\Scripts
(venv) λ datacamp tracks
+-----+--------+------------------------------------------+------------+
| #   | ID     | Title                                    | Courses    |
+-----+--------+------------------------------------------+------------+
| 1   | t1     | R Programming                            | 2          |
+-----+--------+------------------------------------------+------------+
| 2   | t2     | Python Fundamentals                      | 4          |
+-----+--------+------------------------------------------+------------+
| 3   | t3     | Importing & Cleaning Data  with Python   | 5          |
+-----+--------+------------------------------------------+------------+
| 4   | t4     | Python Programming                       | 6          |
+-----+--------+------------------------------------------+------------+

TRoboto commented 2 years ago

No worries, that's a problem on my end; there is a bug that will be fixed soon. I will also make the download command use the sequential numbering by default, which is more convenient in my opinion, and will remove the ID column.

Once again thank you for the tests.
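
The planned switch to sequential numbering amounts to mapping a 1-based row number from the courses table to the underlying course ID. A rough sketch under stated assumptions: `resolve_course_id` and `listed_ids` are hypothetical names, and the library's actual implementation may differ.

```python
from typing import List, Optional


def resolve_course_id(row_number: int, listed_ids: List[int]) -> Optional[int]:
    """Map a 1-based row number from the courses table to a course ID.

    Returns None when the number is out of range, so the caller can
    show a warning instead of attempting a download.
    Hypothetical helper; not part of the library.
    """
    if 1 <= row_number <= len(listed_ids):
        return listed_ids[row_number - 1]
    return None


# IDs of the first three rows of the courses table shown earlier in the thread.
listed_ids = [735, 58, 799]
print(resolve_course_id(2, listed_ids))  # 58
print(resolve_course_id(0, listed_ids))  # None
```

Returning `None` for an out-of-range number keeps the lookup separate from the user-facing message, which would also cover the case raised earlier where an unknown value was silently accepted.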