nileshsah / harwest-tool

A one-shot tool to harvest submissions from different OJs onto one single VCS managed repository http://bit.ly/harwest
MIT License

[Feature Request] Crawl submissions in gym & virtual contest #7

Open ngthanhtrung23 opened 3 years ago

ngthanhtrung23 commented 3 years ago

These submissions require login, which should be possible with requests.Session. I've hacked around and this login method works:

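    import random

    def __login(self):
        # Assumes self.session is a requests.Session created elsewhere in the class.
        username = 'I_love_Hoang_Yen'
        password = '<redacted>'
        # bfaa and ftaa are extra form fields sent with the login request:
        # bfaa is a fixed value here, ftaa is 18 random lowercase alphanumerics.
        bfaa = 'f1b3f18c715565b589b7823cda7448ce'
        ftaa = ''.join(random.choices('abcdefghijklmnopqrstuvwxyz0123456789', k=18))
        LOGIN_URL = 'https://codeforces.com/enter'

        # Fetch the login page and pull the CSRF token out of the HTML.
        r = self.session.get(LOGIN_URL)
        csrf = r.text.split("csrf_token' value='")[1].split("'")[0]

        data = {
            "csrf_token": csrf,
            "action": "enter",
            "ftaa": ftaa,
            "bfaa": bfaa,
            "handleOrEmail": username,
            "password": password,
            "_tta": "176",
            "remember": "on",
        }
        # Post the form; the session keeps the resulting cookies for later requests.
        r = self.session.post(LOGIN_URL, data=data, headers={'X-Csrf-Token': csrf})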

After that, it's also necessary to modify the submission URL: for contest IDs > 100k, it should be /gym/{contest_id}/submission/{submission_id}.
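
A rough sketch of that URL rule, in case it helps. The helper name `submission_url`, the `GYM_ID_THRESHOLD` constant, and the `/contest/...` path used for regular contests are assumptions for illustration, not taken from the harwest code:

```python
BASE_URL = "https://codeforces.com"
GYM_ID_THRESHOLD = 100_000  # assumption: IDs above this are gym contests, per the rule above


def submission_url(contest_id: int, submission_id: int) -> str:
    """Build the submission page URL, switching to /gym/ for large contest IDs."""
    # Hypothetical helper: only the /gym/ path comes from this issue;
    # the /contest/ path is assumed for regular contests.
    section = "gym" if contest_id > GYM_ID_THRESHOLD else "contest"
    return f"{BASE_URL}/{section}/{contest_id}/submission/{submission_id}"
```

With this, `submission_url(100001, 42)` yields a `/gym/100001/submission/42` URL, while smaller contest IDs keep the regular path.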

nileshsah commented 3 years ago

This is simply awesome @ngthanhtrung23! Thanks for providing the starting points, will try to integrate this in the crawling flow.