openzim / openedx

Open edX (to zim) scraper
GNU General Public License v3.0

Moving forward with custom CSS/JS in instances #90

Closed satyamtg closed 4 years ago

satyamtg commented 4 years ago

As we know, PHZH has custom CSS applied to its courses, and we need to find a way to deal with it. We also have several bugs in the PHZH ZIM because of this. After going through the courses on a few instances, including PHZH, I think there are some paths we can take from here. But before actually discussing them, I think it may be better if we go through the following points (they make the possible approaches easier to understand) -

So, coming to the ways we can solve the problem, I think we can proceed in the following ways -

Option 1: Scraping from the LMS web URLs

The scraper currently scrapes the data from the individual xblock URLs, which at times do not contain the custom CSS and JS from the instance. So the first method that comes to mind is to scrape the LMS URLs instead of the xblock URLs. The pros and cons of this method would be as follows -

Pros -

Cons -

Option 2: Keep the current system but make it add extra CSS and JS

In this method, we keep extracting the xblock HTML from the xblock URL itself, but detect the extra CSS and JS and add them to the templates automatically. This would basically involve extra calls to the LMS web URL at the vertical xblock level, scraping the head of the page for CSS and the end of the body for JS. However, it seems that the div with class vert in the LMS HTML sometimes also carries extra CSS classes that apply to its inner HTML, so we need to copy those as well. Another thing that's required here is a blacklist of CSS files not to download because we already have them, for instance MathJax. The pros and cons of going with this approach would be as follows -

Pros -

Cons -
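
To make the extraction step in Option 2 more concrete, here is a minimal sketch of what could be pulled out of the LMS page for one vertical. It is not the scraper's actual code: the names `ASSET_BLACKLIST`, `InstanceAssetParser` and `extract_instance_assets` are hypothetical, it uses only the standard library `html.parser`, and it assumes the HTML of the LMS page has already been fetched. Downloading the assets and injecting them into the templates is left out.

```python
from html.parser import HTMLParser

# Hypothetical blacklist: lowercase substrings of asset URLs that the scraper
# is assumed to bundle already (MathJax is the example from the discussion).
ASSET_BLACKLIST = ("mathjax",)


def is_blacklisted(url):
    """Return True if the URL matches an entry of the (assumed) blacklist."""
    lowered = url.lower()
    return any(pattern in lowered for pattern in ASSET_BLACKLIST)


class InstanceAssetParser(HTMLParser):
    """Collect stylesheets from <head>, scripts from <body>, and the class
    list of every div that has the `vert` class."""

    def __init__(self):
        super().__init__()
        self.section = None  # "head", "body" or None
        self.css_urls = []
        self.js_urls = []
        self.vert_classes = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("head", "body"):
            self.section = tag
        elif tag == "link" and self.section == "head":
            rel = (attrs.get("rel") or "").lower().split()
            href = attrs.get("href")
            if "stylesheet" in rel and href and not is_blacklisted(href):
                self.css_urls.append(href)
        elif tag == "script" and self.section == "body":
            src = attrs.get("src")
            if src and not is_blacklisted(src):
                self.js_urls.append(src)
        elif tag == "div":
            classes = (attrs.get("class") or "").split()
            if "vert" in classes:
                # Keep the full class list so it can be copied onto the
                # wrapper of the corresponding xblock in the template.
                self.vert_classes.append(classes)

    def handle_endtag(self, tag):
        if tag in ("head", "body"):
            self.section = None


def extract_instance_assets(lms_html):
    """Parse the LMS page of one vertical and return what would be copied."""
    parser = InstanceAssetParser()
    parser.feed(lms_html)
    return {
        "css": parser.css_urls,
        "js": parser.js_urls,
        "vert_classes": parser.vert_classes,
    }
```

The sketch collects every script found in the body rather than only those at its end, which is a simplification; a real implementation would probably also need to resolve relative URLs against the instance URL and handle inline `<style>` and `<script>` blocks.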

Option 3: Have templates for specific instances

The last option that I can think of is having multiple templates for different instances, say a template for PHZH, another for edX, etc. This would also require a new parameter that selects which template set to use. However, for the classes on the div with the vert class, we would still need those calls to the LMS URL at the vertical level. This also comes with its own pros and cons -

Pros -

Cons -
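
For Option 3, the new parameter could look roughly like the sketch below. Everything in it is an assumption made for illustration: the flag name `--template-set`, the set names and the directory layout are hypothetical and do not exist in the scraper.

```python
import argparse

# Hypothetical mapping of template set names to template directories.
TEMPLATE_SETS = {
    "default": "templates/default",
    "edx": "templates/edx",
    "phzh": "templates/phzh",
}


def build_parser():
    """Build a parser with only the (assumed) template set option."""
    parser = argparse.ArgumentParser(description="Template set selection sketch")
    parser.add_argument(
        "--template-set",
        choices=sorted(TEMPLATE_SETS),
        default="default",
        help="which instance-specific template set to render with",
    )
    return parser


def resolve_template_dir(argv):
    """Return the template directory for the given command line arguments."""
    args = build_parser().parse_args(argv)
    return TEMPLATE_SETS[args.template_set]
```

Every new instance with custom styling would then need its own entry and its own templates, which is the maintenance cost this option carries.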

I suggest we go with the second option as it basically allows us to have the proper layout in the course content, in spite of not matching the source exactly (in terms of other components like the nav and sidenav). Also, we would be able to continue with the current codebase with no drastic changes. To me it seems to be a balance between the aesthetics and the usability of the scraper on different instances (though there might be an instance that has things implemented differently than I have so far noticed). The third option is also doable in my opinion.

@rgaudin @kelson42 @dattaz @Popolechien what are your views?

rgaudin commented 4 years ago

Thank you very much for this. The second option is the most realistic at this point, where we have something working and want to polish it instead of restarting it. I don't like the third option, as it makes the scraper non-generic.