Closed satyamtg closed 4 years ago
Thank you very much for this. The second option is the most realistic at this point, since we have something working and want to polish it rather than restart it. I don't like the third option, as it makes the scraper non-generic.
As we know, PHZH applies custom CSS to its courses, and we need to find a way to deal with it. Several bugs in the PHZH ZIM are due to this. After going through the courses of a few instances, including PHZH, I think there are a few paths we can take from here. Before discussing them, it may help to go through the following points, which make the options easier to understand -
An xblock can be viewed through either the xblock URL (student_view_url in the JSON) or the LMS URL (lms_web_url in the JSON). The learner sees the LMS version. The scraper currently uses the xblock version.

As far as I have observed, the xblock URLs are pretty much free of extra CSS and JS. On the other hand, some instances like PHZH have put custom CSS and JS into the LMS version, which is the cause of most of the bugs pointed out by @Popolechien in other issues at the moment. As far as I have observed, this CSS and JS is present in the following places -
- Extra classes on the vert div (in the LMS version only) - The CSS for these classes is defined in the stylesheets in the headers. It is not applied in the xblock version because individual low-level xblocks are not wrapped in the vert div in the xblock view, and the vert div in the xblock view of higher-level xblocks like sequential or vertical does not have those extra classes. #74, #80 and #84 are due to these missing classes.
- Extra classes on the xblock div, for low-level xblocks only. This CSS does not seem to have a visible effect, though; it seems that the CSS on the vert div does the trick (at least for the problem xblock).

Here's an example of an extra class for CSS on a vert div -

The xblock version has the same div without the extra frigg* classes for the higher-level xblocks. For lower-level xblocks, this div is not present at all. Also, the xblock div is contained in this div.

So, coming to the ways we can solve the problem, I think we can proceed in the following ways -

Option 1: Scraping from the LMS web URLs
The scraper currently scrapes the data from the individual xblock URLs, which at times do not contain the custom CSS and JS from the instance. So, the first method that comes to mind is to scrape the LMS URLs instead of the xblock URLs. The pros and cons of this method would be as follows -
Pros -
Cons -
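To make the difference between the two sources concrete, here is a minimal Python sketch of how a scraper could switch between the xblock URL and the LMS URL for a given block. The helper name url_to_scrape and the sample record are hypothetical (only the keys student_view_url and lms_web_url come from the JSON discussed above), and this is not how the scraper is currently written.

```python
from typing import Dict, Literal

# Illustrative record only: the key names come from the discussion above,
# the values are made up.
SAMPLE_BLOCK: Dict[str, str] = {
    "id": "block-v1:Org+Course+Run+type@vertical+block@abc123",
    "type": "vertical",
    "student_view_url": "https://lms.example.org/xblock/block-v1:Org+Course+Run+type@vertical+block@abc123",
    "lms_web_url": "https://lms.example.org/courses/course-v1:Org+Course+Run/jump_to/block-v1:Org+Course+Run+type@vertical+block@abc123",
}

Source = Literal["xblock", "lms"]


def url_to_scrape(block: Dict[str, str], source: Source = "xblock") -> str:
    """Return the URL to fetch for a block (hypothetical helper).

    "xblock" keeps the current behaviour, "lms" corresponds to Option 1.
    """
    key = "student_view_url" if source == "xblock" else "lms_web_url"
    try:
        return block[key]
    except KeyError:
        # Fail loudly rather than silently scraping the wrong page.
        raise ValueError(f"block {block.get('id')!r} has no {key!r}") from None
```

Calling url_to_scrape(SAMPLE_BLOCK, "lms") returns the LMS page the learner would see, while the default keeps the xblock view.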
Option 2: Keep the current system but make it add extra CSS and JS
In this method, we keep extracting the xblock HTML from the xblock URL itself, but detect certain pieces of extra CSS and JS and add them to the templates automatically. This would basically involve extra calls to the LMS web URL at the vertical xblock level, scraping the headers for CSS and the end of the body for JS. However, it seems that the div with class vert in the LMS HTML also carries extra CSS classes that apply to its inner HTML, so we need to copy those as well. Another requirement is a blacklist of CSS files not to download because we already have them, for instance MathJax. The pros and cons of going with this approach would be as follows -
Pros -
Cons -
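A rough sketch of the detection step in Option 2, assuming the LMS page for a vertical has already been fetched as a string. It uses only the Python standard library; the class name LmsExtrasParser, the function extract_lms_extras and the ASSET_BLACKLIST contents are hypothetical. As a simplification it collects every stylesheet link and every external script on the page rather than only those in the headers or at the end of the body, and it records all classes on vert divs other than vert itself, leaving further filtering to the caller.

```python
from html.parser import HTMLParser
from typing import List, Tuple

# Hypothetical blacklist: substrings of asset URLs the scraper already ships.
ASSET_BLACKLIST = ("mathjax",)


class LmsExtrasParser(HTMLParser):
    """Collect stylesheet URLs, script URLs and the classes on vert divs."""

    def __init__(self) -> None:
        super().__init__()
        self.stylesheets: List[str] = []
        self.scripts: List[str] = []
        self.vert_classes: List[List[str]] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link":
            rel = (attr.get("rel") or "").lower().split()
            if "stylesheet" in rel and attr.get("href"):
                self.stylesheets.append(attr["href"])
        elif tag == "script" and attr.get("src"):
            self.scripts.append(attr["src"])
        elif tag == "div":
            classes = (attr.get("class") or "").split()
            if "vert" in classes:
                # Keep everything except the plain "vert" marker itself.
                self.vert_classes.append([c for c in classes if c != "vert"])


def is_blacklisted(url: str) -> bool:
    """True if the asset matches an entry of the (hypothetical) blacklist."""
    return any(token in url.lower() for token in ASSET_BLACKLIST)


def extract_lms_extras(lms_html: str) -> Tuple[List[str], List[str], List[List[str]]]:
    """Return (css_urls, js_urls, vert_classes) found in an LMS page."""
    parser = LmsExtrasParser()
    parser.feed(lms_html)
    parser.close()
    css = [u for u in parser.stylesheets if not is_blacklisted(u)]
    js = [u for u in parser.scripts if not is_blacklisted(u)]
    return css, js, parser.vert_classes
```

The returned CSS and JS URLs would then be downloaded and referenced from the templates, and the recorded classes copied onto the wrapper div around the xblock HTML.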
Option 3: Have templates for specific instances
The last option that I can think of is having multiple templates for different instances, say one template for PHZH, another for edX, etc. This would also require a new parameter specifying which template set to use. However, for the classes on the div with the vert class, we would still need those calls to the LMS URL at the vertical level. This also comes with its own pros and cons -
Pros -
Cons -
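For illustration, the template-set parameter in Option 3 could resolve templates roughly as below. This is only a sketch under assumptions: the directory layout templates/<set name>/..., the fallback set named "default" and the function resolve_template are all hypothetical and do not exist in the scraper today.

```python
from pathlib import Path

# Hypothetical layout: <root>/<template set>/<template name>,
# with "default" acting as the generic fallback set.
DEFAULT_TEMPLATE_SET = "default"


def resolve_template(root: Path, template_set: str, name: str) -> Path:
    """Return the instance-specific template if it exists, else the default one."""
    candidate = root / template_set / name
    if candidate.is_file():
        return candidate
    return root / DEFAULT_TEMPLATE_SET / name
```

With a fallback like this, an instance such as PHZH would only need to override the templates that actually differ, which limits how much instance-specific code has to be maintained.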
I suggest we go with the second option, as it gives us the proper layout in the course content, even though the result does not match the source exactly (in terms of other components like the nav and sidenav). We would also be able to continue with the current codebase with no drastic changes. To me it seems to be a balance between the aesthetics and the usability of the scraper on different instances (though there might be an instance that implements things differently from what I have noticed so far). The third option is also doable in my opinion.
@rgaudin @kelson42 @dattaz @Popolechien what are your views?