C0D3D3V / Moodle-Downloader

A Moodle crawler that downloads course content from Moodle (e.g. lecture PDFs)
GNU General Public License v3.0

Recursive download bug #18

Open ghost opened 5 years ago

ghost commented 5 years ago

Some files (.pdf) are each stored in a new folder of their own. Sometimes such a folder, with the file in it, is saved inside the folder that was downloaded just before it.

As an example here is one directory of one course:

On Moodle: [screenshot: github2]

On disk after download:

.
└── HMI-Ü-6
    ├── HMI-Ü-7-►
    │   ├── HMI-Ü-8-►
    │   │   ├── HMI-Ü-9-►
    │   │   │   ├── HMI-Ü-10-►
    │   │   │   │   ├── HMI-Ü-11-►
    │   │   │   │   │   ├── HMI-Ü-12-►
    │   │   │   │   │   │   └── HMI_Ü_12.pdf
    │   │   │   │   │   └── HMI_Ü_11.pdf
    │   │   │   │   └── HMI_Ü_10.pdf
    │   │   │   └── HMI_Ü_9.pdf
    │   │   └── HMI_Ü_8.pdf
    │   └── HMI_Ü_7.pdf
    └── HMI_Ü_6.pdf

C0D3D3V commented 5 years ago

Can you please upload the source code of the whole web page (the page where the recursion happens) to Pastebin?
And maybe also upload the log file to Pastebin, with loglevel = 5.

And did you try setting antirecrusion = true in the config?

ghost commented 5 years ago

With antirecrusion turned on it looks even more interesting:

.
├── HMI-Ü-12
│   ├── HMI-Ü-13-►
│   │   └── HMI_Ü_13.pdf
│   └── HMI_Ü_12.pdf
├── HMI-Ü-2
│   ├── HMI-Ü-3-►
│   │   └── HMI_Ü_3.pdf
│   └── HMI_Ü_2.pdf
├── HMI-Ü-4
│   ├── HMI-Ü-5-►
│   │   └── HMI_Ü_5.pdf
│   └── HMI_Ü_4.pdf
├── HMI-Ü-6
│   ├── HMI-Ü-7-►
│   │   └── HMI_Ü_7.pdf
│   └── HMI_Ü_6.pdf
├── HMI-Ü-8
│   ├── HMI-Ü-9-►
│   │   └── HMI_Ü_9.pdf
│   └── HMI_Ü_8.pdf
└── HMI-Ü-10
    ├── HMI-Ü-11-►
    │   └── HMI_Ü_11.pdf
    └── HMI_Ü_10.pdf

Files 1 and 12 are gone. The source code will follow tomorrow.

C0D3D3V commented 5 years ago

Interesting. Okay, I'm waiting for the source code.

I know that the anti-recursion algorithm is not implemented perfectly at the moment, but so far it has done a good job. I will revise it this weekend so that it hopefully works in the general case.

Stopping the recursion itself is not the hard part; guessing the correct folder structure is, so it may take some time to fix this problem.
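To illustrate the difference between the two problems, here is a rough sketch. It is not the downloader's actual code. It assumes the pages form a link graph (the `links` dict below is made up, as are the helper names `plan_folders` and `folder_path`), and it uses one possible heuristic: walk the graph breadth-first and file each page under the page through which it was first reached. The visited check stops the recursion; the breadth-first order is only a guess at the folder structure and would not help if the course page did not link to every activity directly.

```python
from collections import deque


def plan_folders(links, root):
    """Map each reachable page to a parent page, visiting breadth-first.

    `links` is a hypothetical {page: [linked pages]} graph. The `parent`
    dict doubles as the visited set, so cycles cannot cause re-descent.
    """
    parent = {root: None}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in parent:  # already claimed by an earlier page
                parent[target] = page
                queue.append(target)
    return parent


def folder_path(parent, page):
    """Build the folder path of `page` by walking up to the root."""
    parts = []
    while page is not None:
        parts.append(page)
        page = parent[page]
    return "/".join(reversed(parts))


# Made-up example: the course page links to every activity, and each
# activity page also carries "previous"/"next" navigation links.
links = {
    "course": ["HMI-Ü-6", "HMI-Ü-7", "HMI-Ü-8"],
    "HMI-Ü-6": ["HMI-Ü-7"],
    "HMI-Ü-7": ["HMI-Ü-6", "HMI-Ü-8"],
    "HMI-Ü-8": ["HMI-Ü-7"],
}

parent = plan_folders(links, "course")
for page in ("HMI-Ü-6", "HMI-Ü-7", "HMI-Ü-8"):
    print(folder_path(parent, page))
```

A depth-first walk over the same graph would follow the "next" links first and nest the folders, which resembles the layout reported above.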

C0D3D3V commented 4 years ago

I created a new downloader that should fix this issue. Check it out here: https://github.com/C0D3D3V/Moodle_Downloader_2