The project structure could be improved by aligning it with newer scrapers such as TED and youtube. A few changes would be needed to achieve this.
Currently, the scraper uses openedx2zim as its entry point and does essentially everything in mooc.py. It also parses arguments with docopt (last maintained in 2018, with its last PyPI release in 2014). We also have no single place where the version number is stored. Given the current structure, I think it would be better to change it to something similar to youtube/ted: move openedx2zim to openedx2zim/entrypoint.py, move mooc.py to openedx2zim/scraper.py, and add a constants.py, a VERSION file and a CHANGELOG. We would also need an entrypoint for Docker, since the scraper sometimes uses youtube-dl, which needs to be updated very frequently.
This should not be very difficult given that the current structure is quite modular, but I think it should be done so that we share a common structure across scrapers (at least the new ones).
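To make the proposal a bit more concrete, here is a rough sketch of what openedx2zim/entrypoint.py could look like with argparse from the standard library in place of docopt, reading the version from a single VERSION file. The `--course-url` and `--debug` flags, the `read_version` helper and the `root` parameter are assumptions for illustration only, not the scraper's actual options.

```python
import argparse
import pathlib

NAME = "openedx2zim"  # would live in constants.py in the proposed layout


def read_version(root: pathlib.Path) -> str:
    """Return the version string stored in the VERSION file under `root`."""
    return (root / "VERSION").read_text(encoding="utf-8").strip()


def build_parser(version: str) -> argparse.ArgumentParser:
    """Build the command-line parser (flags here are hypothetical)."""
    parser = argparse.ArgumentParser(
        prog=NAME, description="Scraper to create a ZIM file from an Open edX course"
    )
    parser.add_argument("--course-url", required=True, help="URL of the course")
    parser.add_argument("--debug", action="store_true", help="Enable verbose output")
    parser.add_argument(
        "--version", action="version", version=f"{NAME} {version}"
    )
    return parser


def main(argv, root: pathlib.Path) -> dict:
    """Parse `argv` and return the options as a dict.

    In the proposed layout, this is where the scraper class from
    openedx2zim/scraper.py would be instantiated and run.
    """
    args = build_parser(read_version(root)).parse_args(argv)
    return vars(args)
```

A console script would then call something like `main(sys.argv[1:], pathlib.Path(__file__).parent)`, so the version shown by `--version` and the one used for packaging both come from the same VERSION file.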