Open CicadaCinema opened 1 month ago
My first 3 commits bring this project up to the state in which I performed my own archive on 10 July (that archive was checked thoroughly for errors/omissions), with the following exceptions:
- `webdriver.Firefox()` rather than `webdriver.Chrome()`
- `driver.get()` could not be reliably trusted to return when the page finishes loading, so I felt safer increasing the argument of each `sleep()` call to somewhere between 50 and 60 (seconds), because I can justify stepping away from the computer while the archive runs.
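For comparison, an alternative to long fixed sleeps is to poll for a readiness condition with an upper time limit. The following is only a sketch and is not what this PR does; `wait_until` is a hypothetical helper name, and the Selenium usage in the trailing comment assumes that checking `document.readyState` is a good enough signal for the page in question, which may not hold for pages that load content dynamically.

```python
import time


def wait_until(predicate, timeout=60.0, poll=0.5):
    """Call predicate() repeatedly until it is truthy or `timeout` seconds pass.

    Returns True if the predicate succeeded, False if the deadline was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


# Hypothetical use with a Selenium driver (assumption, not run here):
#   driver.get(url)
#   wait_until(lambda: driver.execute_script(
#       "return document.readyState") == "complete")
```

With this shape, the 50 to 60 second figure becomes a worst-case bound rather than a cost paid on every page.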
Now the bad news is that:
- ~I have no desire to install Chrome so I will continue testing/developing locally using Firefox;~ (seems like ungoogled chromium works for whatever reason, so ignore this)
- I felt that it was more convenient to have a post-processing step in bash, since I know `sed` better than Python's regular expression facilities;
- I can see that Crowdmark has already been updated, so some changes are required as of now to make this script function properly.
I've addressed these points by porting the bash code to Python and by doing a bit more testing with ungoogled chromium and the current version of Crowdmark.
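As a rough illustration of what porting a `sed` substitution to Python can look like, here is a minimal sketch. The helper name `sed_substitute`, the sample HTML and the `?token=` pattern are all made up for this example and are not the substitutions used in this PR.

```python
import re


def sed_substitute(text, pattern, replacement):
    """Rough equivalent of `sed -E 's/pattern/replacement/g'` over a whole string.

    re.MULTILINE makes ^ and $ match at every line, similar to sed's
    line-by-line processing.
    """
    return re.sub(pattern, replacement, text, flags=re.MULTILINE)


# Made-up input: strip a query-string token from image URLs.
html = (
    '<img src="https://example.com/a.png?token=abc">\n'
    '<img src="https://example.com/b.png?token=def">'
)
cleaned = sed_substitute(html, r'\?token=[^"]*', '')
print(cleaned)
```

Backreferences such as `\1` work in the replacement string of `re.sub` much as they do in `sed`, although the two regex dialects differ in places, so each pattern should be checked after porting.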
Broadly, the changes in this PR are as follows:
sorry been a bit busy. looking rn
I added some more commits, but I still haven't been able to get this working completely. I don't have much time to work on this. However, I'll link this PR in the original repo.
Fixes https://github.com/curtischong/crowdmark-downloader/issues/4