Romern / syncMyMoodle

Synchronization client for RWTH Moodle
GNU General Public License v3.0

Speedup Sciebo link finding #121

Open D-VR opened 4 months ago

D-VR commented 4 months ago

Currently, the following code (around line 1130) may check the same Sciebo link many times (6x or more in one of my courses), which causes significant slowdowns from unnecessary GET requests. Adding some simple caching would improve this.

        # https://rwth-aachen.sciebo.de/s/XXX
        if self.config.get("used_modules", {}).get("url", {}).get("sciebo", {}):
            sciebo_links = re.findall(
                "https://rwth-aachen.sciebo.de/s/[a-zA-Z0-9-]+", text
            )

            for vid in sciebo_links:
                response = self.session.get(vid)
                soup = bs(response.text, features="html.parser")
                url = soup.find("input", {"name": "downloadURL"})
                filename = soup.find("input", {"name": "filename"})
                if url and filename:
                    parent_node.add_child(
                        filename["value"], url["value"], "Sciebo file", url=url["value"]
                    )

I will put something together when I have time.
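
A minimal sketch of what such caching could look like, not the project's actual code. `ScieboLinkCache`, `lookup`, `find_files`, and the `resolve` callable are hypothetical names introduced for illustration. In syncMyMoodle, `resolve` would presumably wrap the `self.session.get` call and the BeautifulSoup parsing from the snippet above. The sketch also escapes the dots in the pattern, which the quoted code does not.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Same pattern as the quoted code, but with the dots escaped.
SCIEBO_LINK_RE = re.compile(r"https://rwth-aachen\.sciebo\.de/s/[a-zA-Z0-9-]+")

# A resolved share is (filename, download_url), or None if nothing was found.
Resolved = Optional[Tuple[str, str]]


class ScieboLinkCache:
    """Hypothetical helper: remembers the result of resolving each share link."""

    def __init__(self, resolve: Callable[[str], Resolved]) -> None:
        # `resolve` is an assumption: it performs the actual GET and parsing.
        self._resolve = resolve
        self._cache: Dict[str, Resolved] = {}

    def lookup(self, link: str) -> Resolved:
        # Only call `resolve` the first time a link is seen.
        # Failed lookups (None) are cached too, so they are not retried.
        if link not in self._cache:
            self._cache[link] = self._resolve(link)
        return self._cache[link]

    def find_files(self, text: str) -> List[Tuple[str, str]]:
        # dict.fromkeys drops duplicates within one text and keeps their order.
        links = dict.fromkeys(SCIEBO_LINK_RE.findall(text))
        results = [self.lookup(link) for link in links]
        return [r for r in results if r is not None]


# Demo with a fake resolver that records how often it is called.
calls: List[str] = []


def fake_resolve(link: str) -> Resolved:
    calls.append(link)
    return ("slides.pdf", link + "/download")


cache = ScieboLinkCache(fake_resolve)
text = (
    "See https://rwth-aachen.sciebo.de/s/abc123 and again "
    "https://rwth-aachen.sciebo.de/s/abc123"
)
files = cache.find_files(text)
cache.find_files(text)  # second pass is served entirely from the cache
print(len(calls))  # 1
```

Keeping one cache instance for the whole sync run means a link repeated across several course pages is fetched once. The caller would still add every returned `(filename, download_url)` pair to its own `parent_node`, so the resulting tree stays the same as before.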