C0D3D3V closed 6 years ago
Even identical HTML files are downloaded repeatedly.
First step done: https://github.com/C0D3D3V/Moodle-Crawler/tree/issue-5
Only partially solved.
New idea: make a text diff of the HTML files and recrawl a file only if the hash of its text has changed. Additionally, save the hash in the log (together with the path) so that a file is not recrawled just because it was moved. See https://github.com/C0D3D3V/Moodle-Crawler/tree/issue-5b for progress.
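A rough sketch of how this could look, not the code on the branch. The names `text_hash`, `needs_recrawl`, `record` and the `known_hashes` dict (standing in for the log) are hypothetical, and using SHA-256 over whitespace-normalized visible text is an assumption:

```python
import hashlib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def text_hash(html):
    """Hash only the text of a page, so markup-only changes are ignored."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse all whitespace runs so formatting differences do not matter.
    text = " ".join(" ".join(parser.parts).split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_recrawl(html, known_hashes):
    """True if no page with the same text hash has been logged yet."""
    return text_hash(html) not in known_hashes


def record(html, path, known_hashes):
    """Store hash -> path (hypothetical log layout) after saving a page."""
    known_hashes[text_hash(html)] = path
```

Because the lookup is by hash and not by path, a page that was moved but has the same text is not downloaded again. Pages whose visible text contains something that changes on every request (a timestamp, for example) would still be recrawled each time.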
Suggestion: re-download an HTML file only when it has important changes, or ignore HTML files entirely.