Closed by asciid 4 years ago
Stepik is built with Django, and it doesn't respond clearly to some 404s or to any 403s: it returns a wrapper page, but the HTTP status code is still 200.
So I needed another way to get the page's real status.
At first I searched the response text for:
<section class="course-promo__head">
But I hadn't expected that course pages can use another class. There are actually two types: syllabus and promo.
Depending on such a subtle detail of the markup seemed like a bad idea, so now I take the page's title instead.
For the incorrect 404s it is:
Stepik > 404
For 403s:
Stepik
And normally:
Course Name -- Stepik
And I don't need a special function to grep the status. Brilliant!
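The title check can be sketched roughly as follows. This is only an illustration, not the tool's actual code: `TitleParser` and `classify_page` are hypothetical names, it uses the standard library's `html.parser`, and it assumes the titles match the three examples exactly, so anything else is treated as a normal page.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def classify_page(html_text):
    """Guess the real status of a page that was served with HTTP 200."""
    parser = TitleParser()
    parser.feed(html_text)
    title = parser.title.strip()
    if title == "Stepik > 404":
        return 404
    if title == "Stepik":
        return 403
    # Assumption: any other title ("Course Name -- Stepik") is a real course page.
    return 200
```

Called on the response text, `classify_page` returns 404, 403, or 200 depending on the title alone, so it does not care whether the course page is of the promo or the syllabus type.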
Issue is closed.
Very rarely, the tool skips some links during the scan. Maybe the problem is in the way the HTML is being parsed, so I have to try beautifulsoup.
I will post a detailed exploration of the issue here.