Closed by mraleksander 3 years ago
TLDR:
Moodle-dl calls a bunch of interfaces to collect information about the current state of your Moodle courses, which it then compares with the local database to decide which data to download and which data has already been downloaded.
Moodle-dl creates a database called moodle_state.db, which contains all the files that Moodle-dl has downloaded or otherwise processed. The database also keeps a kind of history.
You can view and edit this file with any SQLite database viewer.
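If you would rather use a script than a viewer, Python's built-in sqlite3 module can open the file too. The sketch below only lists the tables and their columns, because the schema is not described in this thread; the helper name describe_database is made up for illustration.

```python
import sqlite3


def describe_database(path):
    """Return {table_name: [column_name, ...]} for an SQLite file."""
    conn = sqlite3.connect(path)
    try:
        # sqlite_master is SQLite's built-in catalogue of schema objects.
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Quote the table name as an identifier before using it in the PRAGMA.
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info returns one row per column; index 1 is the name.
            columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Calling describe_database("moodle_state.db") in the folder where the database lives should show which tables and columns exist. Note that sqlite3.connect creates an empty file if the path does not exist, so check the path first.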
The decision when to download a file is mainly based on whether it is already in the database.
Moodle itself identifies a file by an ID and a version which results in a unique link. This is true in most cases. It is even possible that several courses refer to the same file (same link). The Moodle downloader would still download the file in each course, because an additional criterion is the path to the file, which is determined by the section and various other hierarchies in the course.
Another general criterion is the type of file, i.e. whether it is a normal resource, a link, a description, a link in a description, a file in a database, or whatever else I have categorized. The last important criterion is the timestamp, which is set at each file resource.
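One way to picture these criteria is as a composite key plus a timestamp. The following is only an illustrative sketch, not the data structure moodle-dl actually uses; FileRecord, its field names and needs_download are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    # All field names are hypothetical.
    url: str       # link derived from Moodle's file ID and version
    path: str      # location in the course (section and further hierarchy)
    kind: str      # e.g. "resource", "link", "description"
    modified: int  # timestamp set on the file resource

    def identity(self):
        """Key for 'the same entry'; the timestamp is deliberately left out."""
        return (self.url, self.path, self.kind)


def needs_download(candidate, known):
    """known maps identity() to the timestamp stored for that entry."""
    stored = known.get(candidate.identity())
    # Download if the entry is unknown or its timestamp has changed.
    return stored is None or stored != candidate.modified
```

With this key, the same link appearing under two different course paths counts as two separate entries, which matches the behaviour described above.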
The mechanism to detect added, modified, moved and deleted files is simple in principle, but it has grown fairly complex over time.
You can find the mechanism here: https://github.com/C0D3D3V/Moodle-Downloader-2/blob/c913368ca1efae4a4d8f8fdd217c42170059e774/moodle_dl/state_recorder/state_recorder.py#L416
Before the method is called, data/information about all selected courses is downloaded via the various interfaces (see here) and stored in the list current_courses. The mechanism then compares this just downloaded data collection with the information stored in the database.
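In a heavily simplified form, such a comparison could look like the sketch below. It is not the code behind the link above. It assumes both sides have already been reduced to dictionaries that map a (path, url) key to a timestamp, and it treats an entry as moved when the same url with the same timestamp disappears from one path and appears at another.

```python
def diff_states(stored, current):
    """Compare two {(path, url): timestamp} dicts (hypothetical layout)."""
    added = {key for key in current if key not in stored}
    deleted = {key for key in stored if key not in current}
    modified = {
        key for key in current if key in stored and current[key] != stored[key]
    }

    # Pair up a deleted and an added entry that share url and timestamp.
    moved = []
    for old in sorted(deleted):
        for new in sorted(added):
            if old[1] == new[1] and stored[old] == current[new]:
                moved.append((old, new))
                added.discard(new)
                break
    deleted -= {old for old, _ in moved}

    return {
        "added": sorted(added),
        "modified": sorted(modified),
        "moved": moved,
        "deleted": sorted(deleted),
    }
```

Only entries listed under "added" and "modified" would need to be downloaded; "moved" and "deleted" only require bookkeeping.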
The database is updated instantly once a file is downloaded. This means that even after an interruption right after a download, the downloader would know that the file has already been downloaded.
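The pattern described here, recording each file as soon as it is finished instead of at the end of the run, can be sketched as follows. The table layout and the function names are invented for this example, and the actual download is passed in as a callable.

```python
import sqlite3


def open_state(path):
    """Open (or create) a small state database; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS downloaded ("
        "url TEXT NOT NULL, path TEXT NOT NULL, modified INTEGER NOT NULL, "
        "PRIMARY KEY (url, path))"
    )
    conn.commit()
    return conn


def download_all(conn, files, fetch):
    """files: iterable of (url, path, modified); fetch(url, path) downloads one file."""
    fetched = []
    for url, path, modified in files:
        row = conn.execute(
            "SELECT modified FROM downloaded WHERE url = ? AND path = ?",
            (url, path),
        ).fetchone()
        if row is not None and row[0] == modified:
            continue  # already recorded with the same timestamp, skip it
        fetch(url, path)
        conn.execute(
            "INSERT OR REPLACE INTO downloaded (url, path, modified) VALUES (?, ?, ?)",
            (url, path, modified),
        )
        # Commit per file so an interruption never loses a finished download.
        conn.commit()
        fetched.append(path)
    return fetched
```

If the run is interrupted during the second file, the first one is already committed, so the next run skips it and continues with the second.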
To be honest, the architecture is by now too simple for what the downloader has grown into; it has become so big that it would have to be reworked.
Hope this answers your question.
How does the program store which files it has already downloaded so that it doesn't download them again? And does it check for server-side changes to already downloaded files?